We can have a scenario where multiple CUs share an abbrev table.
We modify or don't modify one CU, which leads to other CUs having invalid abbrev section.
Example that caused it.
All of CUs shared the same abbrev table. First CU just had compile_unit and sub_program.
It was not modified. Next CU had DW_TAG_lexical_block with
DW_AT_low_pc/DW_AT_high_pc converted to DW_AT_low_pc/DW_AT_ranges.
We used unmodified abbrev section for first and subsequent CUs.
So when parsing subsequent CUs debug info was corrupted.
In this patch we will now duplicate all sections that are modified and are different.
This also means that if .debug_types is present and it shares Abbrev table, and
they usually are, we now can have two Abbrev tables. One for CU that was modified,
and unmodified one for TU.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D118517
The aarch64 platform has special registers like X0_X1_X2_X3_X4_X5_X6_X7.
Using the downwards propagation this register will become a super
register for all X0..X7 and its super registers which is not right. This
patch replaces the downwards propagation with caching all the aliases using MCRegAliasIterator.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D117394
Since we now re-write .debug_info the DWARF CU Offsets can change.
Just like for .debug_aranges the GDB Index will need to be updated.
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D118273
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb, Amir
Differential Revision: https://reviews.llvm.org/D118271
This patch fixes the removal of unreachable uncondtional branch located
after return instruction.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D117677
Summary:
Move the annotation to avoid dynamic memory allocations.
Improves the CPU time of instrumenting a large binary by 1% (+-0.8%, p-value 0.01)
Test Plan: NFC
Reviewers: maksfb
FBD30091656
In case the case the DW_AT_ranges tag already exists for the object the
low pc values won't be updated and will be incorrect in
after-bolt binaries.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D117216
This is a follow up to Fix size mismatch error with jemalloc.
4243b6582c
Although that fix works it increased memory footprint.
With this patch we go back to original memory footprint.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D117341
Summary:
Address @smeenai feedback https://reviews.llvm.org/D117061#inline-1122106:
>CMake has if(IN_LIST) now, which you can use instead of the string(FIND)
IN_LIST is available since CMake 3.3 released in 2015.
Reviewed By: smeenai
FBD33590959
The DW_FORM_addr form of highPC address is written in absolute addres,
the data form is written in offset-from-low pc format.
Due to the large test binary the test is prepared separately in
https://github.com/rafaelauler/bolt-tests/pull/8
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D117217
Summary:
Follow the clang-tidy suggestion to replace reset-release with move assignment.
Move assignment's effect for unique_ptr:
> Effects: Transfers ownership from `u` to `*this` as if by calling `reset(u.release())`
followed by an assignment from `std::forward<D>(u.get_deleter())`.
Summary:
Remove X86MCPlusBuilder code that duplicates checks in X86BaseInfo.
Remove isINC and isDEC as redundant.
The new code of `X86MCPlusBuilder::isMacroOpFusionPair` is functionally
equivalent to `X86AsmBackend::isMacroFused`. However, as the method is
declared/defined in X86AsmBackend.cpp and not exported in a header file,
there's no way to use it in BOLT without changes in LLVM code.
(cherry picked from FBD33440373)
Summary:
Remove patterns ineligible for macro-fusion:
- First instruction has a memory destination
This is a temporary commit to align BOLT with LLVM MC interfaces.
(cherry picked from FBD33479340)
Summary:
Reformat code and put options in lexicographical order.
Comparing to clang-format output, manual formatting looks cleaner to me.
(cherry picked from FBD33481692)
Summary:
Adding support for DW_FORM_data_2, DW_FORM_data_1, DW_FORM_udata.
With new .debug_info code only need to modify the check.
(cherry picked from FBD33302731)
Summary:
Now that we are re-writing .debug_info we are not longer restricted to have same size patches.
Simplifying logic to use direct forms.
(cherry picked from FBD32971159)
Summary:
If `addUnknownControlFlow` in `BinaryFunction::postProcessIndirectBranches`
is invoked with a basic block that has multiple edges to the same successor,
it leads to an assertion in `BinaryBasicBlock::removePredecessor`.
For basic blocks with multiple edges to the same successor, the default
behavior of removePredecessor is to remove all occurrences of the
predecessor block in its predecessor list (Multiple=true).
Example:
```A -> B (two edges)
A->removeAllSuccessors()
for each successor of block A: // B twice
// this removes both occurrences of A in B's predecessors list
B->removePredecessor(A);
// this invocation triggers an assert as A is no longer in B's
// predecessor list
B->removePredecessor(A);
```
This issue is not fixed by NormalizeCFG as `removeAllSuccessor` is called
earlier (from `buildCFG` -> `postProcessIndirectBranches`).
Solve this issue by collecting the successors into a set (`SmallPtrSet`) first,
before invoking `SuccessorBB->removePredecessor(this)`.
GitHub issue: https://github.com/facebookincubator/BOLT/issues/187
(cherry picked from FBD30796979)
Summary:
Changed the behavior of how we handle .debug_info section.
Instead of patching it will now rewrite it.
With this approach we are no longer constrained to having new values
of the same size.
It handles re-writing by treating .debug_info as raw data.
It copies chunks of data between patches, with new data written in
between.
(cherry picked from FBD32519952)
Summary:
Refactor remaining bolt sources to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33345885)
Summary:
Refactor bolt/*/Profile to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33345741)
Summary:
Refactor bolt/lib/Target to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33345353)
Summary:
Refactor bolt/*/Passes to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).
(cherry picked from FBD33344642)
Summary:
The lower_bound might return the end iterator, the ignoring of which will
cause memory corruption.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD33307803)
Summary:
Fix missing string header file inclusion and link_fdata find
problem in lit tests. Change root-level tests to require
linux. Re-enable Windows in our root CMakeLists.txt.
(cherry picked from FBD33296290)
Summary:
Fix according to Coding Standards doc, section Don't Use
Braces on Simple Single-Statement Bodies of if/else/loop Statements.
This set of changes applies to lib Core only.
(cherry picked from FBD33240028)
Summary:
Since nops are now removed in a separate pass, the profile is consumed
on a CFG with nops. If previously a profile was generated without nops,
the offsets in the profile could be different if branches included nops
either as a source or a destination.
This diff adjust offsets to make the profile reading backwards
compatible.
(cherry picked from FBD33231254)
Summary:
The patch moves the shortenInstructions and nop remove to separate binary
passes. As a result when llvm-bolt optimizations stage will begin the
instructions of the binary functions will be absolutely the same as it
was in the binary. This is needed for the golang support by llvm-bolt.
Some of the tests must be changed, since bb alignment nops might create
unreachable BBs in original functions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD32896517)
Summary:
This patch adds AArch64 relocations handling in case updating of
debug sections is enabled
Elvina Yakubova,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD33077609)
Summary:
Gracefully handle binaries with split functions where two fragments are folded
into one, resulting in a fragment with two parent functions.
This behavior is expected in GCC8+ with -O2 optimization level, where both
function splitting and ICF are enabled by default.
On the BOLT side, the changes are:
- BinaryFunction: allow multiple parent fragments:
- `ParentFragment` --> `ParentFragments`,
- `setParentFragment` --> `addParentFragment`.
- BinaryContext:
- `populateJumpTables`: mark fragments to be skipped later,
- `registerFragment`: add a name heuristic check, return false if it failed,
- `processInterproceduralReferences`: check if `registerFragment`
succeeded, otherwise issue a warning,
- `skipMarkedFragments`: move out fragment traversal and skipping from
`populateJumpTables` into a separate function.
This change fixes an issue where unrelated functions might be registered
as fragments:
```
BOLT-WARNING: interprocedural reference between unrelated fragments:
bad_gs/1(*2) and amd_decode_mce.cold.27/1(*2)
```
(Linux kernel binary)
(cherry picked from FBD32786688)
Summary:
Refactor members of BinaryBasicBlock. Replace some std containers with
ADT equivalents. The size of BinaryBasicBlock on x86-64 Linux is reduced
from 232 bytes to 192 bytes.
(cherry picked from FBD33081850)
Summary:
Switched members of BinaryFunction to ADT where it was possible and
made sense. As a result, the size of BinaryFunction on x86-64 Linux
reduced from 1624 bytes to 1448.
(cherry picked from FBD32981555)