Commit Graph

1435 Commits

Author SHA1 Message Date
Kazu Hirata 60db8d9b4e Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2022-07-30 10:35:48 -07:00
Rafael Auler fc0ced73dc Add BAT testing framework
This patch refactors BAT to be testable as a library, so we
can have open-source tests on it. This further fixes an issue with
basic blocks that lack a valid input offset, making BAT omit those
when writing translation tables.

Test Plan: new testcases added, new testing tool added (llvm-bat-dump)

Differential Revision: https://reviews.llvm.org/D129382
2022-07-29 14:55:04 -07:00
Fangrui Song 7430894a65 Replace Optional::hasValue with has_value or operator bool. NFC 2022-07-29 10:57:25 -07:00
Fangrui Song 999514bb9a [bolt] Replace Optional::getValue with value or operator*. NFC 2022-07-29 01:15:24 -07:00
Huan Nguyen 52cd00cabf [BOLT] Ignore functions accessing false positive jump tables
Disassembly and branch target analysis are not decoupled, so any
analysis that depends on disassembly may not operate properly.

In specific, analyzeJumpTable uses instruction bounds check property.
A jump table was analyzed twice: (a) during disassembly, and (b) after
disassembly, so there are potentially some mismatched results.

In this update, functions that access JTs which fail the second check
will be marked as ignored.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130431
2022-07-28 23:22:17 -07:00
Huan Nguyen ccabbfff86 [BOLT] Remove --allow-stripped option
AllowStripped has not been used in BOLT.
This option is replaced by actively detecting stripped binary.

Test Plan:

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130036
2022-07-28 23:15:53 -07:00
Huan Nguyen 986362d4a3 [BOLT] Add BinaryContext::IsStripped
Determine stripped status of a binary based on .symtab

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130034
2022-07-28 23:11:03 -07:00
Simon Tatham 0db13e10c5 [bolt,AArch64] Fix one more test failure from D130358.
This one actually makes the test simpler, because lit doesn't have to
reconstitute a 32-bit little-endian value from individual bytes any
more: llvm-objdump is printing the desired 32-bit value in the first
place, so we can move straight on to doing the arithmetic on it.
2022-07-26 16:41:09 +01:00
Amir Ayupov 77c1977384 [BOLT] Support files with no symbols
`LastSymbol` handling in `discoverFileObjects` assumes a non-zero number of
symbols in an object file. It's not the case for broken_dynsym.test added in
D130073, and potentially other stripped binaries.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D130544
2022-07-26 00:07:59 -07:00
Amir Ayupov 79c2fe066d [BOLT][TEST] Update fptr.test
The test exercises an implicit ptr-to-int conversion which is made an error in
D129881. We acknowledge the error but still want to test this case.
Add `-Wno-int-conversion` to silence the error.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D130546
2022-07-25 22:00:46 -07:00
Fabian Parzefall 83882606db [BOLT] Process each block only once in fixCFGForPIC
Rather than iterating over the whole function from the start until no
internal calls are found, process each block only once and continue
processing after splitting. This version of the function also does not
seemingly invalidate iterators from within the loop.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D130436
2022-07-25 15:06:24 -07:00
Huan Nguyen 8eb68d92d4 [BOLT] Handle broken .dynsym in stripped binaries
Strip tools cause a few symbols in .dynsym to have bad section index.
This update safely keeps such broken symbols intact.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130073
2022-07-22 11:24:09 -07:00
zr33 a2035c566f [BOLT][DWARF] Fix bolt/test/X86/shared-abbrev.s
There should not be a end of child mark before DW_AT_ranges, removed it and fixed unit offset.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D130335
2022-07-22 10:45:28 -07:00
Maksim Panchenko 661577b5f4 [BOLT] Add support for the latest perf tool
The latest perf tool can return non-empty buffer when executing
buildid-list command, even when perf.data was recorded with -B flag.
Some binaries will be listed without the ID, while others may have a
recorded ID. Allow invalid entires on the input, while checking the
valid ones for the match.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130223
2022-07-22 07:56:15 -07:00
John Ericson 07b749800c [cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.

Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.

@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.

Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.

To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.

For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117977
2022-07-21 19:04:00 +00:00
Sriraman Tallam 116ee23f4c [bolt] std::atomic_uint64_t to std::atomic<uint64_t>
Differential Revision: https://reviews.llvm.org/D129903
2022-07-19 16:09:11 -07:00
zr33 1a1324a303 [BOLT][DWARF] Fix incorrect DW_AT_type offset for unittest
Some unit tests has incorrect DW_AT_type offset since they are manual crafted, fix them to the correct offset.

Reviewed By: Amir, ayermolo

Differential Revision: https://reviews.llvm.org/D129828
2022-07-18 14:20:22 -07:00
zr33 66a41e0807 [BOLT][DWARF] Add Unit test for DW_AT_high_pc [DW_FORM_addr]
Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D127613
2022-07-18 14:03:53 -07:00
Fabian Parzefall 8477bc6761 [BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's
layout. It also lays the groundwork for splitting functions into
multiple fragments (as opposed to a strict hot/cold split).

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129518
2022-07-16 17:23:24 -07:00
Amir Ayupov 77b72fbc71 [BOLT][TEST] Add icp-inline.s test
Add a test for `-icp-inline` knob, which ensures that ICP is only performed for
functions that can be subsequently inlined.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D129803
2022-07-15 20:49:26 -07:00
Fangrui Song d11ac9641b [bolt] Include <atomic> 2022-07-15 14:27:01 -07:00
Huan Nguyen ae563c9146 [BOLT] Support split landing pad
We previously support split jump table, where some jump table entries
target different fragments of same function. In this fix, we provide
support for another type of intra-indirect transfer: landing pad.

When C++ exception handling is used, compiler emits .gcc_except_table
that describes the location of catch block (landing pad) for specific
range that potentially invokes a throw(). Normally landing pads reside
in the function, but with -fsplit-machine-functions, landing pads can
be moved to another fragment. The intuition is, landing pads are rarely
executed, so compiler can move them to .cold section.

This update will mark all fragments that have landing pad to another
fragment as non-simple, and later propagate non-simple to all related
fragments.

This update also includes one manual test case: split-landing-pad.s

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128561
2022-07-14 18:10:22 -07:00
Fabian Parzefall d55dfeaf32 [BOLT] Replace uses of layout with basic block list
As we are moving towards support for multiple fragments, loops that
iterate over all basic blocks of a function, but do not depend on the
order of basic blocks in the final layout, should iterate over binary
functions directly, rather than the layout.

Eventually, all loops using the layout list should either iterate over
the function, or be aware of multiple layouts. This patch replaces
references to binary function's block layout with the binary function
itself where only little code changes are necessary.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129585
2022-07-14 13:07:05 -07:00
Huan Nguyen 05523dc32d [BOLT] Support multiple parents for split jump table
There are two assumptions regarding jump table:
(a) It is accessed by only one fragment, say, Parent
(b) All entries target instructions in Parent

For (a), BOLT stores jump table entries as relative offset to Parent.
For (b), BOLT treats jump table entries target somewhere out of Parent
as INVALID_OFFSET, including fragment of same split function.

In this update, we extend (a) and (b) to include fragment of same split
functinon. For (a), we store jump table entries in absolute offset
instead. In addition, jump table will store all fragments that access
it. A fragment uses this information to only create label for jump table
entries that target to that fragment.

For (b), using absolute offset allows jump table entries to target
fragments of same split function, i.e., extend support for split jump
table. This can be done using relocation (fragment start/size) and
fragment detection heuristics (e.g., using symbol name pattern for
non-stripped binaries).

For jump table targets that can only be reached by one fragment, we
mark them as local label; otherwise, they would be the secondary
function entry to the target fragment.

Test Plan
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128474
2022-07-13 23:37:31 -07:00
Vladislav Khmelevsky 35efe1d806 [BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols,
so we to handle it specially in BOLT.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D129260
2022-07-13 14:47:22 +03:00
Denis Revunov 7564167885 [BOLT][AArch64] Use all supported CPU features on AArch64
Since we now have +all feature for AArch64 disassembler, we can use it
in BOLT and allow it to disassemble all ARM instructions supported by LLVM.

Reviewed by: rafauler

Differential Revision: https://reviews.llvm.org/D129139
2022-07-12 03:56:04 -04:00
Rafael Auler 42a66fb727 [BOLT] Restrict execution of tests that fail on Windows
Turn off execution of tests that use UNIX-specific features.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126933
2022-07-11 17:59:58 -07:00
Rafael Auler a3cfdd746e [BOLT] Increase coverage of shrink wrapping [5/5]
Add -experimental-shrink-wrapping flag to control when we
want to move callee-saved registers even when addresses of the stack
frame are captured and used in pointer arithmetic, making it more
challenging to do alias analysis to prove that we do not access
optimized stack positions. This alias analysis is not yet implemented,
hence, it is experimental. In practice, though, no compiler would emit
code to do pointer arithmetic to access a saved callee-saved register
unless there is a memory bug or we are failing to identify a
callee-saved reg, so I'm not sure how useful it would be to formally
prove that.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126115
2022-07-11 17:30:13 -07:00
Rafael Auler 3e5f67f356 [BOLT] Increase coverage of shrink wrapping [4/5]
Change shrink-wrapping to try a priority list of save
positions, instead of trying the best one and giving up if it doesn't
work. This also increases coverage.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126114
2022-07-11 17:30:05 -07:00
Rafael Auler 3332904ad6 [BOLT] Increase coverage of shrink wrapping [3/5]
Add the option to run -equalize-bb-counts before shrink
wrapping to avoid unnecessarily optimizing some CFGs where profile is
inaccurate but we can prove two blocks have the same frequency.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126113
2022-07-11 17:30:00 -07:00
Rafael Auler 3508ced6ea [BOLT] Increase coverage of shrink wrapping [2/5]
Refactor isStackAccess() to reflect updates by D126116. Now we only
handle simple stack accesses and delegate the rest of the cases to
getMemDataSize.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126112
2022-07-11 17:29:54 -07:00
Rafael Auler 42465efd17 [BOLT] Increase coverage of shrink wrapping [1/5]
Change how function score is calculated and provide more
detailed statistics when reporting back frame optimizer and shrink
wrapping results. In this new statistics, we provide dynamic coverage
numbers. The main metric for shrink wrapping is the number of executed
stores that were saved because of shrink wrapping (push instructions
that were either entirely moved away from the hot block or converted
to a stack adjustment instruction). There is still a number of reduced
load instructions (pop) that we are not counting at the moment. Also
update alloc combiner to report dynamic numbers, as well as frame
optimizer.

For debugging purposes, we also include a list of top 10 functions
optimized by shrink wrapping. These changes are aimed at better
understanding the impact of shrink wrapping in a given binary.

We also remove an assertion in dataflow analysis to do not choke on
empty functions (which makes no sense).

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126111
2022-07-11 17:29:22 -07:00
spupyrev 228970f612 Revert "Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations""
This reverts commit 76029cc53e.
2022-07-11 09:50:47 -07:00
spupyrev eecd41aa09 Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type"
This reverts commit 6d0528636a.
2022-07-11 09:50:47 -07:00
spupyrev 7228371054 [BOLT] Do not merge cold and hot chains of basic blocks
There is a post-processing in ext-tsp block reordering that merges some blocks
into chains. This allows to maintain the original block order in the absense of
profile data and can be beneficial for code size (when fallthroughs are merged).
In the earlier version we could merge hot and cold (with zero execution count)
chains, that later were split by SplitFunction.cpp (when split-all-cold=1). The
diff eliminates the redundant merging.

It is unlikely the change will affect the performance of a binary in a
measurable way, as it is mostly operates with cold basic blocks. However, after
the diff the impact of split-all-cold is almost negligible and we can avoid the
extra function splitting.

Measuring on the clang binary (negative is good, positive is a regression):
**clang12**
benchmark1:  `0.0253`
benchmark2:  `-0.1843`
benchmark3:  `0.3234`
benchmark4:  `0.0333`

**clang10**
benchmark1  `-0.2517`
benchmark2  `-0.3703`
benchmark3  `-0.1186`
benchmark4  `-0.3822`

**clang7**
benchmark1  `0.2526`
benchmark2  `0.0500`
benchmark3  `0.3024`
benchmark4  `-0.0489`

**Overall**: `-0.0671 ± 0.1172` (insignificant)

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129397
2022-07-11 09:31:52 -07:00
Maksim Panchenko 76029cc53e Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations"
Summary:
This reverts commit 729d29e167.

Needed as a workaround for T112872562.

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D35230076
https://phabricator.intern.facebook.com/D35681740

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: spupyrev

Differential Revision: https://phabricator.intern.facebook.com/D37098481
2022-07-11 09:31:52 -07:00
Rafael Auler 6d0528636a Rebase: [Facebook] [MC] Introduce NeverAlign fragment type
Summary:
Introduce NeverAlign fragment type.

The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.

In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).

This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.

The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to reliance on relaxation as
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent
the first instruction in a pair ending at a given alignment boundary, by
inserting at most one minimum size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles to not
cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.

LLVM: https://reviews.llvm.org/D97982

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: phabricatorlinter

Differential Revision: https://phabricator.intern.facebook.com/D31361547
2022-07-11 09:31:52 -07:00
Vladislav Khmelevsky e10e120cea [BOLT][Runtime] Fix memset definition
Differential Revision: https://reviews.llvm.org/D129321
2022-07-09 01:17:08 +03:00
Michał Chojnowski bd301a418b [BOLT] Fix concurrent hash table modification in the instrumentation runtime
`__bolt_instr_data_dump()` does not lock the hash tables when iterating
over them, so the iteration can happen concurrently with a modification
done in another thread, when the table is in an inconsistent state. This
also has been observed in practice, when it caused a segmentation fault.

We fix this by locking hash tables during iteration. This is done by taking
the lock in `forEachElement()`.
The only other site of iteration, `resetCounters()`, has been correctly
locking the table even before this patch. This patch removes its `Lock`
because the lock is now taken in the inner `forEachElement()`.

Reviewed By: maksfb, yota9

Differential Revision: https://reviews.llvm.org/D129089
2022-07-07 14:27:29 +03:00
Maksim Panchenko ea2182fedd [BOLT] Add runtime functions required by freestanding environment
Compiler can generate calls to some functions implicitly, even under
constraints of freestanding environment. Make sure these functions are
available in our runtime objects.

Fixes test failures on some systems after https://reviews.llvm.org/D128960.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D129168
2022-07-06 11:22:22 -07:00
Elvina Yakubova 35155a0716 [BOLT] Change mutex implementation
Changed acquire implemetaion to __atomic_test_and_set() and release
to __atomic_clear() so it eliminates inline asm usage and is arch
independent.

Elvina Yakubova,
Advanced Software Technology Lab, Huawei

Reviewers: yota9, maksfb, rafauler

Differential Revision: https://reviews.llvm.org/D129162
2022-07-06 08:19:54 +03:00
Maksim Panchenko 3a47037fcc [BOLT] Fix instrumentation problem with floating point
If BOLT instrumentation runtime uses XMM registers, it can interfere
with the user program causing crashes and unexpected behavior. This
happens as the instrumentation code preserves general purpose registers
only.

Build BOLT instrumentation runtime with "-mno-sse".

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128960
2022-07-01 15:29:36 -07:00
Alexander Yermolovich e159abdb04 [BOLT][DWARF] Support mix mode DWARF
Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy and DWARF5 split dwarf.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D128232
2022-06-30 16:53:15 -07:00
Amir Ayupov 66b01a8934 [BOLT] Fix getDynoStats to handle BCs with no functions
Address fuzzer crash

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120696
2022-06-30 01:18:45 -07:00
Amir Ayupov cb75faf40c [X86][BOLT] Use getOperandType to determine memory access size
Generate INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc.

This diff adds support for instructions that were previously reported as having
memory access size 0. It replaces the heuristic of looking at instruction
register width to determine memory access width by instead checking the memory
operand type using tablegen-provided tables.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D126116
2022-06-30 00:25:32 -07:00
Amir Ayupov 798e92c6c4 [BOLT] Respect shouldPrint in dump-dot-all
Don't dump dot CFG graph for functions that should not be printed.

Reviewed By: rafauler, maksfb

Differential Revision: https://reviews.llvm.org/D128699
2022-06-29 17:01:17 -07:00
Maksim Panchenko ed74304506 [BOLT] Fix EH trampoline backout code
When SplitFunctions pass adds a trampoline code for exception landing
pads (limited to shared objects), it may increase the size of the hot
fragment making it larger than the whole function pre-split. When this
happens, the pass reverts the splitting action by restoring the original
block order and marking all blocks hot.

However, if createEHTrampolines() added new blocks to the CFG and
modified invoke instructions, simply restoring the original block layout
will not suffice as the new CFG has more blocks.

For proper backout of the split, modify the original layout by merging
in trampoline blocks immediately before their matching targets. As a
result, the number of blocks increases, but the number of instructions
and the function size remains the same as pre-split.

Add an assertion for the number of blocks when updating a function
layout.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128696
2022-06-29 14:35:57 -07:00
Fabian Parzefall e341e9f094 [BOLT] Add option to randomize function split point
For test purposes, we want to split functions at a random split point
to be able to test different layouts without relying on the profile.
This patch introduces an option, that randomly chooses a split point
to partition blocks of a function into hot and cold regions.

Reviewed By: Amir, yota9

Differential Revision: https://reviews.llvm.org/D128773
2022-06-29 13:02:05 -07:00
Rafael Auler fc2d96c334 Revert "[BOLT][AArch64] Handle gold linker veneers"
This reverts commit 425dda76e9.

This commit is currently causing BOLT to crash in one of our
binaries and needs a bit more checking to make sure it is safe
to land.
2022-06-28 19:23:28 -07:00
Vladislav Khmelevsky 425dda76e9 [BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols,
so we to handle it specially in BOLT.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D128082
2022-06-28 16:14:05 +03:00
Amir Ayupov d58b5a0614 [BOLT] Restrict icp-inline to callsites
ICP peel for inline mode only makes sense for calls, not jump tables.
Plus, add a check that the Target BinaryFunction is found.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128404
2022-06-27 11:08:55 -07:00
Amir Ayupov 0d477f63b0 [BOLT][NFC] Add aliases for ICP flags
- `indirect-call-promotion` -> `icp`
- `indirect-call-promotion-mispredict-threshold` -> `icp-mp-threshold`
- `indirect-call-promotion-use-mispredicts` -> `icp-use-mp`
- `indirect-call-promotion-topn` -> `icp-topn`
- `indirect-call-promotion-calls-topn` -> `icp-calls-topn`
- `indirect-call-promotion-jump-tables-topn` -> `icp-jt-topn`
- `icp-jump-table-targets` -> `icp-jt-targets`

This also fixes an inconsistency in ICP flag names that some start with
`indirect-call-promotion` while others start with `icp`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128375
2022-06-27 10:29:26 -07:00
Amir Ayupov c4302e4fc2 [BOLT][NFC] Use llvm::less_first
Follow the case of https://reviews.llvm.org/D126068 and simplify call sites
with `llvm::less_first`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128242
2022-06-27 10:27:17 -07:00
Fabian Parzefall 96f6ec5090 [BOLT] Mark option values of --split-functions deprecated
The SplitFunctions pass does not distinguish between various splitting
modes anymore. This change updates the command line interface to
reflect this behavior by deprecating values passed to the
--split-function option.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128558
2022-06-24 17:01:13 -07:00
Alexander Yermolovich 11a8dd65ec [BOLT][DWARF] Add support for DW_AT_call_pc/DW_AT_call_return_pc
DWARF 5 added two new attributes DW_AT_call_pc and DW_AT_call_return_pc.
Adding support for them.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D128526
2022-06-24 12:37:58 -07:00
Amir Ayupov d2c8769936 [BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts
accepting ranges.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128154
2022-06-23 22:16:27 -07:00
Maksim Panchenko 30a6d3ada6 [BOLT][TEST] Fix stack alignment in section-reloc-with-addend.s
Misaligned stack can cause a runtime crash.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128227
2022-06-20 14:47:37 -07:00
Maksim Panchenko f263a66ba0 [BOLT] Split functions with exceptions in shared objects and PIEs
Add functionality to allow splitting code with C++ exceptions in shared
libraries and PIEs. To overcome a limitation in exception ranges format,
for functions with fragments spanning multiple sections, add trampoline
landing pads in the same section as the corresponding throwing range.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127936
2022-06-19 16:48:48 -07:00
Amir Ayupov 445bc88501 [BOLT] Use 32-bit MOV to zero 64-bit register in instrumentation code
Instead of `movabsq $0x0, %rax` emit shorter equivalent `movl $0x0, %eax`.
Intel SDM, 3.4.1.1 General-Purpose Registers in 64-Bit Mode:
>32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in
> the destination general-purpose register.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127045
2022-06-19 11:34:32 -07:00
Huan Nguyen 543f13c99b [BOLT] Allow function entry to be a cold fragment
Allow cold fragment to get new address.

Our previous assumption is that a fragment (.cold) is only reached
through the main fragment of same function. In addition, .cold fragment
must be reached through either (a) direct transfer, or (b) split jump
table. For (a), we perform a simple fix-up. For (b), we currently mark
all relevant fragments as non-simple. Therefore, there is no need to
get new address for .cold fragment.

This is not always the case, as function entry can be rarely executed,
and is placed in .text.cold segment. Essentially we cannot tell which
the source-level function entry is based on hot and cold segments,
so we must treat each fragment a function on its own. Therfore, we
remove the assertion that a function entry cannot be cold fragment.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128111
2022-06-18 11:39:51 -07:00
Huan Nguyen 28b1dcb122 [BOLT] Allow function fragments to point to one jump table
Resolve a crash related to split functions

Due to split function optimization, a function can be divided to two

fragments, and both fragments can access same jump table. This
violates 
the assumption that a jump table can only have one parent
function, 
which causes a crash during instrumentation.

We want to support the case: different functions cannot access same
jump tables, but different fragments of same function can!

As all fragments are from same function, we point JT::Parent to one
specific fragment. Right now it is the first disassembled fragment, but
we can point it to the function's main fragment later.

Functions are disassembled sequentially. Previously, at the end of
processing a function, JT::OffsetEntries is cleared, so other fragment
can no longer reuse JT::OffsetEntries. To extend the support for split
function, we only clear JT::OffsetEntries after all functions are
disassembled.

Let say A.hot and A.cold access JT of three targets {X, Y, Z}, where
X and Y are in A.hot, and Z is in A.cold. Suppose that A.hot is
disassembled first, JT::OffsetEntries = {X',Y',INVALID_OFFSET}. When
A.cold is disassembled, it cannot reuse JT::OffsetEntries above due to
different fragment start. A simple solution:
A.hot  = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, INVALID_OFFSET}

We update the assertion to allow different fragments of same function
to get the same JumpTable object.

Potential improvements:
A.hot  = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, Z'}
The main issue is A.hot and A.cold have separate CFGs, thus jump table
targets are still constrained within fragment bounds.

Future improvements:
A.hot  = {X, Y, Z}
A.cold = {X, Y, Z}

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127924
2022-06-17 16:22:30 -07:00
Rafael Auler 9d5e6ccd9b [BOLT] Fix for missing entry offset
Temporary fix for missing entry offset when creating address
translation tables (BAT) after D127935 landed. Will later work on
assigning a more reasonable offset different than zero.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128092
2022-06-17 13:14:42 -07:00
Maksim Panchenko 8228c70358 [BOLT][NFCI] Refactor interface for adding basic blocks
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127935
2022-06-16 11:51:57 -07:00
Maksim Panchenko 401a425d20 [BOLT][NFCI] Remove redundant code
createBasicBlock() will create a label for addBasicBlock().

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127928
2022-06-15 21:26:30 -07:00
Amir Ayupov 02d510b416 [BOLT][NFC] Pass Function to BC.printInstructions in BinaryBasicBlock::dump
BC::printInstruction(s) has many uses of Function ptr if it's available:
# printing CFI instructions (unconditional)
# printing debug line information (-print-debug-info)
# printing instruction relocations (-print-relocations)

Enable these uses by passing Function ptr from the primary printing entry point:
BinaryBasicBlock::dump.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126916
2022-06-13 14:26:51 -07:00
Amir Ayupov a2c4d6d332 [BOLT][NFC] Forward declare ReorderBlocks for MSVC19
Fix bolt-x86_64-wine-msvc builder:
https://lab.llvm.org/buildbot/#/builders/222/builds/1154

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D127612
2022-06-13 10:26:58 -07:00
Vladislav Khmelevsky 6e26ffa064 [BOLT][AARCH64] Skip R_AARCH64_LD_PREL_LO19 relocation
Supress failed to analyze relocations warning for R_AARCH64_LD_PREL_LO19
relocation. This relocation is mostly used to get value stored in CI and
we don't process it since we are caluclating target address using the
instruction value in evaluateMemOperandTarget().

Differential Revision: https://reviews.llvm.org/D127413
2022-06-13 15:40:06 +03:00
Amir Ayupov 7dee646b28 [BOLT][NFC] Move printDebugInfo out of BC::printInstruction
Simplify `BinaryContext::printInstruction`.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D127561
2022-06-11 11:58:36 -07:00
Fangrui Song adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Maksim Panchenko d648aa1b8e [BOLT][TEST] Use double dash flags in tests
Replace a single dash with a double dash for options that have more
than a single letter.

llvm-bolt-wrapper.py has special treatment for output options such as
"-o" and "-w" causing issues when a single dash is used, e.g. for
"-write-dwp". The wrapper can be fixed as well, but using a double dash
has other advantages as well.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127538
2022-06-10 16:27:33 -07:00
Huan Nguyen 82095bd5ed [BOLT] Mark fragments related to split jump table as non-simple
Mark fragments related to split jump table as non-simple.

A function could be splitted into hot and cold fragments. A split jump table is
challenging for correctly reconstructing control flow graphs, so it was marked
as ignored. This update marks those fragments as non-simple, allowing them
to be printed and partial control flow graph construction.

Test Plan:
```
llvm-lit -a tools/bolt/test/X86/split-func-icf.s
```
This test has two functions (main, main2), each has a jump table target to the
same cold portion main2.cold.1(*2). We try to print out only this cold portion.
If it is ignored, it cannot be printed. If it is non-simple, it can be printed. We
verify that it can be printed.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127464
2022-06-10 15:49:32 -07:00
John Ericson 0bb317b7bf Revert "[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore"
This reverts commit d5daa5c5b0.
2022-06-10 19:26:12 +00:00
John Ericson d5daa5c5b0 [cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.

Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.

@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.

Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.

To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.

For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117977
2022-06-10 14:35:18 +00:00
Denis Revunov 0b7e8baf83 [BOLT][AArch64] Handle data at the beginning of a function when disassembling and building CFG.
This patch adds getFirstInstructionOffset method for BinaryFunction
which is used to properly handle cases where data is at zero offset in
a function. The main change is that we add basic block at first
instruction offset when disassembling, which prevents assertion
failures in buildCFG.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D127111
2022-06-09 15:26:32 -07:00
Maksim Panchenko 1817642684 [BOLT] Add support for GOTPCRELX relocations
The linker can convert instructions with GOTPCRELX relocations into a
form that uses an absolute addressing with an immediate. BOLT needs to
recognize such conversions and symbolize the immediates.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126747
2022-06-09 13:37:04 -07:00
Alexander Yermolovich 901867b1ef [BOLT][DWARF] Fix dwarf5-loclist-offset-form test
I put it into wrong directory. As the result it is failing for aarch64. Moving
the test under X86. Orignial diff D126999.

Differential Revision: https://reviews.llvm.org/D127417
2022-06-09 10:54:09 -07:00
Alexander Yermolovich 1c6dc43de9 [BOLT]DWARF] Eagerly write out loclists
Taking advantage of us being able to re-write .debug_info to reduce memory
footprint loclists. Writing out loc-list as they are added, similar to how
we handle ranges.

Collected on clang-14
trunk
4:41.20 real,   389.50 user,    59.50 sys,      0 amem, 38412532 mmem
4:30.08 real,   376.10 user,    63.75 sys,      0 amem, 38477844 mmem
4:25.58 real,   373.76 user,    54.71 sys,      0 amem, 38439660 mmem
diff
4:34.66 real,   392.83 user,    57.73 sys,      0 amem, 38382560 mmem
4:35.96 real,   377.70 user,    58.62 sys,      0 amem, 38255840 mmem
4:27.61 real,    390.18 user,    57.02 sys,      0 amem, 38223224 mmem

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126999
2022-06-08 16:52:59 -07:00
Huan Nguyen db313a00b6 [BOLT][NFC] Replace stdio with raw_ostream in CallGraph
Replacing stdio functions, e.g., fopen, fprintf, with raw_ostream.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126826
2022-06-08 15:38:04 -07:00
Vladislav Khmelevsky fd9604952d [BOLT] Set valid index for functions with profiles
Some of the passes that calculates tentative layout like LongJmp and
Golang are expecting that only functions with valid index will be
located in hot text section. But currently functions with valid profiles
and not set index are breaking this logic, to fix this we can move the
hasValidProfile() condition from AssignSections pass to ReorderFunctions.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D127223
2022-06-08 14:13:12 +03:00
Fangrui Song 15d82c62dc [MC] De-capitalize MCStreamer functions
Follow-up to c031378ce0 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
Fangrui Song b92436efcb [bolt] Remove unneeded cl::ZeroOrMore for cl::opt options 2022-06-05 13:29:49 -07:00
Fangrui Song 36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Amir Ayupov b346af6d44 [BOLT][UTILS] Usability improvements for nfc-check-setup
# Stash local changes before checkout.
# Print a message that the source repository revision has been changed, with
  instructions to switch back.
# Make the script executable.
# Print sample instructions how to run bolt tests.
# Assume that llvm-bolt-wrapper script is in the same source directory.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126941
2022-06-03 22:54:56 -07:00
Fangrui Song 72f9c69421 [Hexagon][bolt] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Similar to 557efc9a8b
2022-06-03 22:04:57 -07:00
Huan Nguyen 5ac26156fe [BOLT][NFC] Warning for deprecated option '-reorder-blocks=cache+'
Emit warning when using deprecated option '-reorder-blocks=cache+'.
Auto switch to option '-reorder-blocks=ext-tsp'.

Test Plan:
```
ninja check-bolt
```
Added a new test cache+-deprecated.test.
Run and verify that the upstream tests are passed.

Reviewed By: rafauler, Amir, maksfb

Differential Revision: https://reviews.llvm.org/D126722
2022-06-03 14:16:55 -07:00
spupyrev 5904836b8a [BOLT] Cache-Aware Tail Duplication
A new "cache-aware" strategy for tail duplication.

Differential Revision: https://reviews.llvm.org/D123050
2022-06-03 09:08:45 -07:00
Amir Ayupov e2142ff47c [BOLT][NFC] Make ICP::verifyProfile static
Follow LLVM style guide suggestion to avoid function definitions in anonymous
namespaces: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124896
2022-06-02 19:09:29 -07:00
Amir Ayupov 65a84195ca [BOLT][DOCS] Add PACKAGE_VERSION to doxygen config
Clang's doxygen documentation specifies LLVM revision. Do the same for BOLT.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126912
2022-06-02 18:05:44 -07:00
Maksim Panchenko 986e5dedf2 [BOLT][NFC] Fix braces in BinaryEmitter
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126844
2022-06-02 12:45:25 -07:00
Amir Ayupov 6333e5dde9 [BOLT][NFC] Use colors in CFG dumps
Use color coding to distinguish nodes:
- Entry nodes have bold border
- Scalar (non-loopy) code is milk white
- Outer loops are light yellow
- Innermost loops are light blue

`-print-loops` needs to be enabled to provide BinaryLoopInfo.
Examples:
{F23170673}
{F23170680}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126248
2022-06-02 00:27:12 -07:00
Amir Ayupov cc23c64ff1 [BOLT][NFC] Print block instructions in dumpGraph as part of node label
Reuse the option `-dot-tooltip-code` to put block instructions into the label.
This way, the instructions are displayed by default when used with dot viewer.

When the .dot file is used with dot2html, instructions are hidden by default,
and are shown by clicking on a node.

{F23169510}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126237
2022-06-01 23:41:41 -07:00
Amir Ayupov 51c20e5804 [BOLT][UTILS] Add dot2html helper tool to embed dot into html
To be rendered in browser using d3-graphviz.
Example: {F23169510}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126218
2022-06-01 23:37:43 -07:00
Alexander Yermolovich ab9a175990 [BOLT][DWARF] Fix TU Index handling for DWARF4/5
When we generate split dwarf with -fdebug-types-section we will have
.debug_types.dwo sections. These go into TU Index when we run llvm-dwp. BOLT was
not handling DWP input correctly with this section.

Added support for handling DWP with TU Index as an input and output for DWARF4.
Added support for handling DWP with TU Index as an input for DWARF5

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126087
2022-06-01 18:16:12 -07:00
Huan Nguyen 38fb7d56e5 [BOLT][TEST] Replace cache+ option with ext-tsp
Replace "cache+" with "ext-tsp" in all BOLT tests

Test Plan:
```
ninja check-bolt
grep -rnw . -e "cache+"
```
no more tests containing "cache+"
"cache+" and "ext-tsp" are aliases

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126714
2022-06-01 14:00:16 -07:00
Maksim Panchenko 0426100ff4 [BOLT][NFC] Remove unused variable
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126808
2022-06-01 13:43:10 -07:00
Yi Kong 716d428ab5 [BOLT] Add `-o` option to merge-fdata
Differential Revision: https://reviews.llvm.org/D126788
2022-06-02 01:29:04 +08:00
Alexander Yermolovich ec2711b354 [BOLT][DWARF] Fix dwarf5-debug-line test
After D126484, order in .debug-line-str and .debug-line is different. Changed
test accordingly.

Differential Revision: https://reviews.llvm.org/D126733
2022-05-31 18:17:16 -07:00
Maksim Panchenko e290133c76 [BOLT] Add new class for symbolizing X86 instructions
Summary:
While disassembling instructions, we need to replace certain immediate
operands with symbols. This symbolizing process relies on reading
relocations against instructions. However, some X86 instructions can
have multiple immediate operands and up to two relocations against
them. Thus, correctly matching a relocation to an operand is not
always possible without knowing the operand offset within the
instruction.

Luckily, LLVM provides an interface for passing the required info from
the disassembler via a virtual MCSymbolizer class. Creating a
target-specific version allows a precise matching of relocations to
operands.

This diff adds X86MCSymbolizer class that performs X86-specific
symbolizing (currently limited to non-branch instructions).

Reviewers: yota9, Amir, ayermolo, rafauler, zr33

Differential Revision: https://reviews.llvm.org/D120928
2022-05-31 17:48:19 -07:00
Denis Revunov 8579db96e8 [BOLT] [AArch64] Handle constant islands spanning multiple functions
Fix BOLT's constant island mapping when a constant island marked by $d
spans multiple functions. Currently, because BOLT only marks the
constant island in the first function where $d is located, if the next
function contains data at its start, BOLT will miss the data and try
to disassemble it. This patch adds code to explicitly go through all
symbols between $d and $x markers and mark their respective offsets as
data, which stops BOLT from trying to disassemble data. It also adds
MarkerType enum and refactors related functions.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D126177
2022-05-31 13:51:35 -07:00
Yi Kong 2a42f7f72a [BOLT] Allow merge-fdata to take a directory as input
and recursively merge all files under said directory. This is similar
to `llvm-profdata merge`.

Differential Revision: https://reviews.llvm.org/D126695
2022-06-01 03:01:14 +08:00
Rafael Auler b8a6345554 [BOLT] Fix LIT tests on Windows VS2019
Fix newline issue in link_fdata.py, as well as how to call the tool.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126437
2022-05-31 11:45:39 -07:00
Yi Kong 97715104c5 [BOLT][NFC] Don't over-specify the size of SmallVector
This is the recommended way, should make merging profiles ever so
slightly faster.
2022-05-31 16:16:38 +08:00
Balazs Benics a73b50ad06 Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable"
This reverts commit 3988bd1398.

Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372

/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
  177 |  { return bool(_M_comp(*__it, __val)); }
2022-05-27 11:19:18 +02:00
Balazs Benics 3988bd1398 [llvm][clang][bolt][NFC] Use llvm::less_first() when applicable
One could reuse this functor instead of rolling out your own version.
There were a couple other cases where the code was similar, but not
quite the same, such as it might have an assertion in the lambda or other
constructs. Thus, I've not touched any of those, as it might change the
behavior in some way.

As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.

Differential Revision: https://reviews.llvm.org/D126068
2022-05-27 11:15:23 +02:00
Rafael Auler c09cd64e5c [BOLT] Fix AND evaluation bug in shrink wrapping
Fix a bug where shrink-wrapping would use wrong stack offsets
because the stack was being aligned with an AND instruction, hence,
making its true offsets only available during runtime (we can't
statically determine where are the stack elements and we must give up
on this case).

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126110
2022-05-26 14:59:28 -07:00
zr33 e51a6b7374 [BOLT][DWARF] Convert dwarf5-df-* tests to assembly tests
Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D126086
2022-05-25 13:41:18 -07:00
Amir Ayupov f7581a3969 [BOLT][NFC] Use ListSeparator in BinaryFunction print methods
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126243
2022-05-24 18:29:24 -07:00
Amir Ayupov 69f87b6c29 [BOLT][NFC] Customize endline character for printInstruction(s)
This would be used in `BF::dumpGraph` to dump left-justified text.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126232
2022-05-24 18:26:12 -07:00
Amir Ayupov 5d8247d4c7 [BOLT][NFC] Use for_each to simplify printLoopInfo
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126242
2022-05-24 18:05:43 -07:00
Amir Ayupov b976fac6ee [BOLT][NFC] Remove unused BF::computeLocalUDChain method definition
The function is only used inside AArch64MCPlusBuilder class, there are no uses
of it as a BinaryFunction method.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126220
2022-05-24 18:02:44 -07:00
Rafael Auler 6cc741bcbf [BOLT] Testcase to repro R_X86_64_REX_GOTPCRELX bug
Add a new testcase that reproduces a bug when BOLTing current
trunk LLD bootstrapped with trunk clang. This makes it official
that we do not support this transformation but are working on
it. When the support is ready, XFAIL should be removed.

Reviewed By: maksfb, Amir, yota9

Differential Revision: https://reviews.llvm.org/D125843
2022-05-18 16:07:14 -07:00
Amir Ayupov c907d6e0e9 [BOLT][NFC] Suppress unused variable warnings
Addresses the warnings emitted by Apple Clang 13.1.6 (Xcode 13.3.1).
Tip @tschuett issue #55404.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125733
2022-05-17 14:30:23 -07:00
Amir Ayupov a7b69dbdd1 [BOLT][NFC] Move BinaryDominatorTree out of BinaryLoop header
Split up the BinaryLoop header and move BinaryDominatorTree into its own header,
preparing it for a standalone use.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125664
2022-05-17 14:20:11 -07:00
Rafael Auler 2fdc5d336e [BOLT] Fix merge-fdata handling of BAT profiles
When a profile is collected in a BOLTed binary, the generated
profile is tagged with a header string "boltedcollection" in the first
line of the fdata file. Fix merge-fdata to recognize this header
string and preserve it into the output.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D125591
2022-05-13 19:41:55 -07:00
Amir Ayupov bdba3d091c [BOLT][CMAKE] Fix DYLIB build
Move BOLT libraries out of `LLVM_LINK_COMPONENTS` to `target_link_libraries`.
Addresses issue #55432.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125568
2022-05-13 13:27:21 -07:00
Amir Ayupov da766cea56 [BOLT][TEST] Fix testing on macos
- Fix common (arch-independent) tests to explicitly target -linux triple.
- Override the triple inside arch-specific tests.
- Add cflags to common tests.
- Update individual tests.
- Expand pipe stderr `|&` shorthand.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125548
2022-05-13 13:03:47 -07:00
Amir Ayupov 253b8f0abd [BOLT][NFC] Use refs for loop variables to avoid copies
Addresses warnings when built with Apple Clang.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D125483
2022-05-13 20:18:29 +01:00
Amir Ayupov 139744ac53 [BOLT][NFC] Suppress unused variable warnings
Address warnings in Release build without assertions.
Tip @tschuett for reporting the issue #55404.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125475
2022-05-13 20:10:19 +01:00
Amir Ayupov c1532ac4aa [BOLT][CMAKE] Add missing clauses to bolt/runtime/CMakeLists.txt
Fix build with Apple Clang.
Tip @tschuett for reporting the issue #55404.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125480
2022-05-13 19:51:55 +01:00
Amir Ayupov d63c5a38fe [BOLT][NFC] Use BitVector::set_bits
Refactor and use `set_bits` BitVector interface.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125374
2022-05-11 16:23:44 -07:00
Amir Ayupov 8cb7a873ab [BOLT][NFC] Add MCPlus::primeOperands iterator_range
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D125397
2022-05-11 09:34:51 -07:00
Amir Ayupov 4a58eb9e4e [BOLT][TEST] Remove -gdwarf-4 override from %cflags
As BOLT support for monolithic and split DWARF5 is added, remove DWARF version
override for BOLT tests.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D125366
2022-05-11 03:38:26 -07:00
Amir Ayupov c2d40f1dfb [BOLT] Add icp-inline option
Add an option to only peel ICP targets that can be subsequently inlined.
Yet there's no guarantee that they will be inlined.

The mode is independent from the heuristic used to choose ICP targets: by exec
count, mispredictions, or memory profile.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124900
2022-05-11 03:21:24 -07:00
Alexander Yermolovich 3abb68a626 [BOLT][DWARF] Fix assert for split dwarf.
Fixing a small bug where it would assert if CU does not modify .debug_addr section.

Differential Revision: https://reviews.llvm.org/D125181
2022-05-08 19:18:17 -07:00
Alexander Yermolovich ba1ac98c62 [BOLT][DWARF] Add version 5 split dwarf support
Added support for DWARF5 Split Dwarf.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D122988
2022-05-05 14:59:05 -07:00
Rahman Lavaee 733dc3e50b [BOLT] Report per-section hotness in bolt-heatmap.
This patch adds a new feature to bolt heatmap to print the hotness of each section in terms of the percentage of samples within that section.

Sample output generated for the clang binary:

Section Name, Begin Address, End Address, Percentage Hotness
.text, 0x1a7b9b0, 0x20a2cc0, 1.4709
.init, 0x20a2cc0, 0x20a2ce1, 0.0001
.fini, 0x20a2ce4, 0x20a2cf2, 0.0000
.text.unlikely, 0x20a2d00, 0x431990c, 0.3061
.text.hot, 0x4319910, 0x4bc6927, 97.2197
.text.startup, 0x4bc6930, 0x4c10c89, 0.0058
.plt, 0x4c10c90, 0x4c12010, 0.9974

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124412
2022-05-05 11:37:46 -07:00
Amir Ayupov aff52d1f08 [BOLT][CMAKE] Check build target architecture for runtime libs
Account for cross-compilation build scenarios (X86 to ARM, Linux
to Windows, etc).

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124712
2022-05-05 10:40:14 -07:00
Amir Ayupov f8d2d8b587 [BOLT][NFC] Move getInliningInfo out of Inliner class
`getInliningInfo` is useful in other passes that need to check inlining
eligibility for some function. Move the declaration and InliningInfo definition
out of Inliner class. Prepare for subsequent use in ICP.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124899
2022-05-04 14:08:06 -07:00
Amir Ayupov 2ad1c7540e [BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite
Minor refactoring. NFC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124898
2022-05-04 14:06:53 -07:00
Amir Ayupov 68c7299f16 [BOLT][NFC] Fix MCPlusBuilder::getAliases caching behavior
Caching behavior of `getAliases` causes a failure in unit tests where two
MCPlusBuilder objects are created corresponding to AArch64 and X86:
the alias cache is created for AArch64 but then used for X86.

https://lab.llvm.org/staging/#/builders/211/builds/126

The issue only affects unit tests as we only construct one MCPlusBuilder
for ELF binary.

Resolve the issue by moving alias bitvectors to MCPlusBuilder object.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D124942
2022-05-04 12:53:26 -07:00
Amir Ayupov 60957a5a08 [BOLT] Fix ICPJumpTablesTopN option use
Fix non-sensical `opts::ICPJumpTablesTopN != 0 ? opts::ICPTopN : opts::ICPTopN`.
Refactor/simplify another similar assignment.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124880
2022-05-03 19:34:10 -07:00
Amir Ayupov c3d5372093 [BOLT][NFC] Make ICP options naming uniform
Rename `opts::IndirectCallPromotion*` to `opts::ICP*`, making option naming
uniform and easier to follow.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124879
2022-05-03 19:32:45 -07:00
Amir Ayupov d0b1c98c96 [BOLT][NFC] ICP: simplify findTargetsIndex
Unnest lambda and use `llvm::is_contained`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124877
2022-05-03 19:31:20 -07:00
Amir Ayupov ec02227bf7 [BOLT][NFC] Refactor ICP::findCallTargetSymbols
Reduce nesting making it easier to read.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124876
2022-05-03 19:29:22 -07:00
Amir Ayupov f9db6d2d5b [BOLT][CMAKE] Fix llvm-bolt-fuzzer build
Add X86/AArch64 targets to resolve missing dependencies, e.g.:
`undefined reference to `LLVMInitializeX86AsmParser'`

Follow-up to D124206

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124886
2022-05-03 19:25:48 -07:00
Amir Ayupov 1d5263c554 [BOLT][TEST] Fix test failures on AArch64 builder
Address X86 tests failures on AArch64 builder:
https://lab.llvm.org/staging/#/builders/211/builds/82

Inputs fail to cross-compile due to a missing header:
```
/usr/include/stdio.h:27:10: fatal error: 'bits/libc-header-start.h' file not found
#include <bits/libc-header-start.h>
```

As inputs are linked with `-nostdlib` anyway, don't include stdio.h.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D124863
2022-05-03 15:39:43 -07:00
Amir Ayupov 39492ba5d6 Revert "[BOLT][TEST] Fix test failures on AArch64 builder"
This reverts commit 88b6d3211c.
2022-05-03 12:45:15 -07:00
Amir Ayupov 88b6d3211c [BOLT][TEST] Fix test failures on AArch64 builder
Address X86 tests failures on AArch64 builder:
https://lab.llvm.org/staging/#/builders/211/builds/82

Inputs fail to cross-compile due to a missing header:
```
/usr/include/stdio.h:27:10: fatal error: 'bits/libc-header-start.h' file not found
#include <bits/libc-header-start.h>
```

As inputs are linked with `-nostdlib` anyway, don't include stdio.h.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D124863
2022-05-03 12:42:30 -07:00
Paul Kirth 625e0e611b [BOLT] [NFC] Remove unused variable
This patch fixes a warning from -Wunused-but-set-variable
MismatchedBranches are counted, but are never reported.
Since evaluateProfileData() should already identify and report
these cases, we can safely remove the unused variable.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124588
2022-05-03 15:15:56 +00:00
Amir Ayupov 64421e191b [BOLT][NFC] Reduce Target/{AArch64,X86} dependencies
We don't actually depend on entire X86/AArch64 components that pull in CodeGen,
SelectionDAG etc., just the Desc part with opcode and other definitions.

Note that it doesn't decouple BOLT from these components - we still pull in X86
and AArch64 from top-level llvm-bolt dependencies as we use assembler and
disassembler. It's difficult to reduce these as this requires non-trivial
changes to X86/AArch64 components themselves (e.g. moving out AsmPrinter).

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124206
2022-04-29 20:37:53 -07:00
Alexey Moksyakov 61d54259ed [BOLT] Fix r_aarch64_prelxx test
The relocation value is calculated using the formula S + A - P,
the verification of the value is performed by inversely calculating the location address

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D124270
2022-04-28 23:52:24 +03:00
Paul Kirth a0b8ab1ba3 [BOLT][NFC] Fix warning for unqualified call to std::move
Fixes warning from RetpolineInsertion.cpp:171:44:
warning: unqualified call to std::move [-Wunqualified-std-cast-call]

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D124482
2022-04-26 23:18:20 +00:00
Rahman Lavaee e59e580116 [BOLT] Refactor DataAggregator::printLBRHeatMap.
This also fixes some logs that were impacted by D123067.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D124281
2022-04-25 11:39:44 -07:00
Amir Ayupov 8634aa2503 [BOLT][CMAKE] Simplify Clang/LLD identification
Refactor nested conditions. NFC

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D123861
2022-04-23 12:17:24 -04:00
Alexander Yermolovich 014cd37f51 [BOLT][DWARF] Implement monolithic DWARF5
Added implementation to support DWARF5 in monolithic mode.
Next step DWARF5 split dwarf support.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D121876
2022-04-21 16:02:23 -07:00
Alexey Moksyakov 48e894a536 [BOLT] Add R_AARCH64_PREL16/32/64 relocations support
Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D122294
2022-04-21 13:52:47 +03:00
Vladislav Khmelevsky 63686af1e1 [BOLT] Fix build with GCC 7.3.0
The gcc 7.3.0 version raises "could not covert" error without std::move
used explicitly.

Differential Revision: https://reviews.llvm.org/D124009
2022-04-21 13:47:58 +03:00
Maksim Panchenko 76981fbcf6 [BOLT] Add fuzzy function name matching for LLVM LTO
LLVM with LTO can generate function names in the form
func.llvm.<number>, where <number> could vary based on the compilation
environment. As a result, if a profiled binary originated from a
different build than a corresponding binary used for BOLT optimization,
then profiles for such LTO functions will be ignored.

To fix the problem, use "fuzzy" matching with "func.llvm.*" form.

Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D124117
2022-04-20 17:00:21 -07:00
Alexander Yermolovich 7d6716786f [BOLT][DWARF] Handle Error returned by visitLocationList
Looks like implementation in llvm changed, and now we need to process error
being returned.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D124133
2022-04-20 16:40:46 -07:00
Aaron Ballman e0ee080574 Speculatively fix build bots
This should address the issue found in:
https://lab.llvm.org/buildbot/#/builders/215/builds/4610
2022-04-20 12:05:33 -04:00