Since we now re-write .debug_info the DWARF CU Offsets can change.
Just like for .debug_aranges the GDB Index will need to be updated.
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D118273
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb, Amir
Differential Revision: https://reviews.llvm.org/D118271
Summary:
Move the annotation to avoid dynamic memory allocations.
Improves the CPU time of instrumenting a large binary by 1% (+-0.8%, p-value 0.01)
Test Plan: NFC
Reviewers: maksfb
FBD30091656
In case the case the DW_AT_ranges tag already exists for the object the
low pc values won't be updated and will be incorrect in
after-bolt binaries.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D117216
Summary:
Address @smeenai feedback https://reviews.llvm.org/D117061#inline-1122106:
>CMake has if(IN_LIST) now, which you can use instead of the string(FIND)
IN_LIST is available since CMake 3.3 released in 2015.
Reviewed By: smeenai
FBD33590959
The DW_FORM_addr form of highPC address is written in absolute addres,
the data form is written in offset-from-low pc format.
Due to the large test binary the test is prepared separately in
https://github.com/rafaelauler/bolt-tests/pull/8
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D117217
Summary:
Reformat code and put options in lexicographical order.
Comparing to clang-format output, manual formatting looks cleaner to me.
(cherry picked from FBD33481692)
Summary:
Adding support for DW_FORM_data_2, DW_FORM_data_1, DW_FORM_udata.
With new .debug_info code only need to modify the check.
(cherry picked from FBD33302731)
Summary:
Now that we are re-writing .debug_info we are not longer restricted to have same size patches.
Simplifying logic to use direct forms.
(cherry picked from FBD32971159)
Summary:
Changed the behavior of how we handle .debug_info section.
Instead of patching it will now rewrite it.
With this approach we are no longer constrained to having new values
of the same size.
It handles re-writing by treating .debug_info as raw data.
It copies chunks of data between patches, with new data written in
between.
(cherry picked from FBD32519952)
Summary:
Since nops are now removed in a separate pass, the profile is consumed
on a CFG with nops. If previously a profile was generated without nops,
the offsets in the profile could be different if branches included nops
either as a source or a destination.
This diff adjust offsets to make the profile reading backwards
compatible.
(cherry picked from FBD33231254)
Summary:
The patch moves the shortenInstructions and nop remove to separate binary
passes. As a result when llvm-bolt optimizations stage will begin the
instructions of the binary functions will be absolutely the same as it
was in the binary. This is needed for the golang support by llvm-bolt.
Some of the tests must be changed, since bb alignment nops might create
unreachable BBs in original functions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD32896517)
Summary:
This patch adds AArch64 relocations handling in case updating of
debug sections is enabled
Elvina Yakubova,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD33077609)
Summary:
Gracefully handle binaries with split functions where two fragments are folded
into one, resulting in a fragment with two parent functions.
This behavior is expected in GCC8+ with -O2 optimization level, where both
function splitting and ICF are enabled by default.
On the BOLT side, the changes are:
- BinaryFunction: allow multiple parent fragments:
- `ParentFragment` --> `ParentFragments`,
- `setParentFragment` --> `addParentFragment`.
- BinaryContext:
- `populateJumpTables`: mark fragments to be skipped later,
- `registerFragment`: add a name heuristic check, return false if it failed,
- `processInterproceduralReferences`: check if `registerFragment`
succeeded, otherwise issue a warning,
- `skipMarkedFragments`: move out fragment traversal and skipping from
`populateJumpTables` into a separate function.
This change fixes an issue where unrelated functions might be registered
as fragments:
```
BOLT-WARNING: interprocedural reference between unrelated fragments:
bad_gs/1(*2) and amd_decode_mce.cold.27/1(*2)
```
(Linux kernel binary)
(cherry picked from FBD32786688)
Summary:
Some optimizations may remove all instructions in a basic block.
The pass will cleanup the CFG afterwards by removing empty basic
blocks and merging duplicate CFG edges.
The normalized CFG is printed under '-print-normalized' option.
(cherry picked from FBD32774360)
Summary:
As pointed out by Vladislav in issue 217, if our RTDyld-based
linker fails to locate a symbol, it will crash with segfault. Fix that.
(cherry picked from FBD32481543)
Summary:
Currently there are two issues rendering the use of bughunter/BOLT on a binary
with a large number of functions (100k) impossible:
1) `selectFunctionsToProcess` has O(binary_fn * force_fn) run-time, which is up
to quadratic with the number of functions in the binary.
2) It unnecessarily treats supplied function names as regexes.
This diff proposes the following changes to address the issue:
1. Add two options that treat function names as is, not as regexes, matching
bughunter usage model: `-funcs-no-regex`/`-funcs-file-no-regex`.
These options are complementary to `-funcs`/`-funcs-file` and `-skip-funcs`/
`-skip-funcs-file`. `funcs` takes precedence over `funcs-no-regex`.
2. Use string set to speed up function eligibility checking with
`-funcs-file-no-regex` to O(binary_fn * log force_fn).
(cherry picked from FBD28917225)
Summary:
Make BOLT build in VisualStudio compiler and run without
crashing on a simple test. Other tests are not running.
(cherry picked from FBD32378736)
Summary:
With LTO, it's possible for multiple DWARF compile units to share the
same abbreviation section set, i.e. to have the same abbrev_offset.
When units sharing the same abbrev set are located next to each other
and neither of them is being processed (i.e. contain processed
functions), it can trigger a bug in BOLT. When this happened,
the abbrev set is considered empty. Additionally, different units
may patch abbrev section differently.
The fix is to not rely on the next unit offset when detecting
abbreviation set boundaries and to delay writing abbrev section
until all units are processed.
(cherry picked from FBD31985046)
Summary:
Added new functionality of dumping simple functions into assembly.
This includes:
- function control flow (basic blocks, instructions),
- profile information as `FDATA` directives, to be consumed by link_fdata,
- data labels,
- CFI directives,
- symbols for callee functions,
- jump table symbols.
Envisioned usage:
1. Find a function that triggers BOLT crash (e.g. with `bughunter.sh`).
2. Generate reproducer asm source for that function (using `-funcs`).
3. Attach it to an issue.
4. Reduce and include as a test case.
Current limitations:
1. Emitted assembly won't match input file relocations.
2. No DWARF support.
3. Data is not emitted.
(cherry picked from FBD32746857)
Summary:
Moves source files into separate components, and make explicit
component dependency on each other, so LLVM build system knows how to
build BOLT in BUILD_SHARED_LIBS=ON.
Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.
To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.
To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transfered to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).
Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.
(cherry picked from FBD32746834)