Summary:
If we have two consecutive JMP instructions and no branches to the
second one, the second one is dead code, but llvm-flo does not handle these
cases properly and put two JMPs in the same BB. This patch fixes this, putting
the extraneous JMP in a separate block, making it easy for us to detect it is
dead code and remove it later in a separate step.
(cherry picked from FBD2562465)
Summary:
Nop instructions are primarily used for alignment purposes on the input.
We remove all nops when we build CFG and derive alignment of basic blocks
based on existing alignment and a presence of nops before it. This
will not always work as some basic blocks will be naturally aligned
without necessity for nops. However, it's better than random alignment.
We would also add heuristics for BB alignment based on execution profile.
(cherry picked from FBD2561740)
Summary:
Adds logic in BinaryFunction to be able to fix branches (invert
its condition, delete or add a branch), making the new function work with the
new layout proposed by the layout pass. All the architecture-specific content
was designed to live in the LLVM Target library, in the MCInstrAnalysis pass.
For now, we only introduce such logic to the X86 backend.
(cherry picked from FBD2551479)
Summary:
Tests with SPEC CPU2006 400.perlbench exposed a bug in the block reordering
heuristic that happened when two blocks are both successor and predecessor of
each other. This patch fixes this.
(cherry picked from FBD2555835)
Summary:
SPEC CPU2006 perlbench exposed a bug in BinaryFunction::optimizeLayout()
where it would try to optimize the layout even though the function had zero
basic blocks. This patch simply checks if the function has zero basic blocks and
bails out.
(cherry picked from FBD2556831)
Summary:
In a recent commit, we changed local symbols to be specially tagged
with the number 2 (local sym) instead of 1 (sym). This patch modifies the reader
to don't choke when seeing a 2 in the symbol id field.
(cherry picked from FBD2552776)
Summary:
This patch implements a dynamic programming approach to solve reorder
basic blocks with profiling information in an optimal way. Since this is
analogous to TSP, it is NP-hard and the algorithm is exponential in time and
memory consumption. Therefore, we only use the optimal algorithm to decide the
layout of small functions (with less than 11 basic blocks).
(cherry picked from FBD2544124)
Summary:
This patch introduces a first approach to reorder basic blocks based on
profiling data that gives us the execution frequency for each edge. Our strategy
is to layout basic blocks in a order that maximizes the weight (hotness) of
branches that will be deleted. We can delete branches when src comes right
before dst in the new layout order. This can be reduced to the TSP problem. This
patch uses a greedy heuristic to solve the problem: we start with a graph with
no edges and progressively add edges by choosing the hottest edges first,
building a layout order that attempts to put BBs with hot edges together.
(cherry picked from FBD2544076)
Summary:
The LBR only has information about taken branches and does not record
information when a branch is not taken. In our CFG, we call these edges
"fall-through" edges. This patch teaches llvm-flo how to infer fall-through
edge frequencies.
(cherry picked from FBD2536633)
Summary:
Changes DataReader to organize branch perf data per function name and
sets up logistics to bring this data to BinaryFunction::buildCFG(). To do this,
we expand BinaryContext with a const reference to DataReader. This patch also
adds the "-dump-functions" flag to force llvm-flo to dump the current state of
BinaryFunctions once they are disassembled and their CFG built, allowing us to
test whether the builder is sane with LLVM LIT tests.
(cherry picked from FBD2534675)
Summary:
This patch introduces DataReader, a module responsible for
parsing llvm flo data files into in-memory data structures.
(cherry picked from FBD2515754)