Summary: LLVM's non-standard notion of phi nodes means we can't both try to substitute for undef in phi nodes *and* use phi nodes as leaders all the time. This changes NewGVN to use the same semantics as SimplifyPHINode to decide which phi nodes are equivalent.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28312
llvm-svn: 291308
Apparently my suggestion of using ternary doesn't really work
as clang complains about incompatible types on LHS and RHS. Some
GCC versions happen to accept the code but clang behaviour is
correct here.
llvm-svn: 290822
Summary:
This avoids the very fragile code for null expressions. We could also use a denseset that tracks which things have null expressions instead, but that seems pretty fragile and premature optimization.
This resolves a number of infinite loop cases, test reductions coming.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28193
llvm-svn: 290816
Summary: Previously, we tried to fix up the equivalences during symbolic evaluation. This does not work. Now, we change the equivalences during congruence finding, where it belongs. We also initialize the equivalence table to give a maximal answer.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28192
llvm-svn: 290815
Summary:
The optimal iteration order for this problem is RPO order. We want to
process as many preds of a backedge as we can before we process the
backedge.
At the same time, as we add predicate handling, we want to be able to
touch instructions that are dominated by a given block by
ranges (because a change in value numbering a predicate possibly
affects all users we dominate that are using that predicate).
If we don't do it this way, we can't do value inference over
backedges (the paper covers this in depth).
The newgvn branch currently overshoots the last part, and guarantees
that it will touch *at least* the right set of instructions, but it
does touch more. This is because the bitvector instruction ranges are
currently generated in RPO order (so we take the max and the min of
the ranges of dominated blocks, which means there are some in the
middle we didn't have to touch that we did).
We can do better by sorting the dominator tree, and then just using
dominator tree order.
As a preliminary, the dominator tree has some RPO guarantees, but not
enough. It guarantees that for a given node, your idom must come
before you in the RPO ordering. It guarantees no relative RPO ordering
for siblings. We add siblings in whatever order they appear in the module.
So that is what we fix.
We sort the children array of the domtree into RPO order, and then use
the dominator tree for ordering, instead of RPO, since the dominator
tree is now a valid RPO ordering.
Note: This would help any other pass that iterates a forward problem
in dominator tree order. Most of them are single pass. It will still
maximize whatever result they compute. We could also build the
dominator tree in this order, but our incremental updates would still
put it out of sort order, and recomputing the sort order is almost as
hard as general incremental updates of the domtree.
Also note that the sorting does not affect any tests, etc. Nothing
depends on domtree order, including the verifier, the equals
functions for domtree nodes, etc.
How much could this matter, you ask?
Here are the current numbers.
This is generated by running NewGVN over all files in LLVM.
Note that once we propagate equalities, the differences go up by an
order of magnitude or two (IE instead of 29, the max ends up in the
thousands, since the worst case we add a factor of N, where N is the
number of branch predicates). So while it doesn't look that stark for
the default ordering, it gets *much much* worse. There are also
programs in the wild where the difference is already pretty stark
(2 iterations vs hundreds).
RPO ordering:
759040 Number of iterations is 1
112908 Number of iterations is 2
Default dominator tree ordering:
755081 Number of iterations is 1
116234 Number of iterations is 2
603 Number of iterations is 3
27 Number of iterations is 4
2 Number of iterations is 5
1 Number of iterations is 7
Dominator tree sorted:
759040 Number of iterations is 1
112908 Number of iterations is 2
<yay!>
Really bad ordering (sort domtree siblings in postorder. not quite the
worst possible, but yeah):
754008 Number of iterations is 1
21 Number of iterations is 10
8 Number of iterations is 11
6 Number of iterations is 12
5 Number of iterations is 13
2 Number of iterations is 14
2 Number of iterations is 15
3 Number of iterations is 16
1 Number of iterations is 17
2 Number of iterations is 18
96642 Number of iterations is 2
1 Number of iterations is 20
2 Number of iterations is 21
1 Number of iterations is 22
1 Number of iterations is 29
17266 Number of iterations is 3
2598 Number of iterations is 4
798 Number of iterations is 5
273 Number of iterations is 6
186 Number of iterations is 7
80 Number of iterations is 8
42 Number of iterations is 9
Reviewers: chandlerc, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28129
llvm-svn: 290699
emplace_back is not faster if it is equivalent to push_back. In this cases emplaced value had the
same type that the one stored in container. It is ugly and it might be even slower (see
Scott Meyers presentation about emplacement).
llvm-svn: 290685
Mostly use a bit more idiomatic C++ where we can,
so we can combine some things later.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28111
llvm-svn: 290550
The code have been developed by Daniel Berlin over the years, and
the new implementation goal is that of addressing shortcomings of
the current GVN infrastructure, i.e. long compile time for large
testcases, lack of phi predication, no load/store value numbering
etc...
The current code just implements the "core" GVN algorithm, although
other pieces (load coercion, phi handling, predicate system) are
already implemented in a branch out of tree. Once the core is stable,
we'll start adding pieces on top of the base framework.
The test currently living in test/Transform/NewGVN are a copy
of the ones in GVN, with proper `XFAIL` (missing features in NewGVN).
A flag will be added in a future commit to enable NewGVN, so that
interested parties can exercise this code easily.
Differential Revision: https://reviews.llvm.org/D26224
llvm-svn: 290346