and (or (lshr X, C), ...), 1 --> (X & C') != 0
I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6
...but we need to do better to catch the more general sequence from the motivating test
(more than 2 bits in the compare). And a test-suite run with statistics showed that this
pattern only happened 2 times currently. It would potentially happen more often if
reassociation worked better (D45842), but it's probably still not too frequent?
This is small enough that I didn't see a need to create a whole new class/file within
AggressiveInstCombine. There are likely other relatively small matchers like what was
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome).
We could potentially also consolidate matchers for ctpop, bswap, etc under here.
Differential Revision: https://reviews.llvm.org/D45986
llvm-svn: 331311
This is a follow-up to r331272.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
https://reviews.llvm.org/D46290
llvm-svn: 331275
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
Summary:
This is a fix for PR23997.
The loop vectorizer is not preserving the inbounds property of GEPs that it creates.
This is inhibiting some optimizations. This patch preserves the inbounds property in
the case where a load/store is being fed by an inbounds GEP.
Reviewers: mkuper, javed.absar, hsaito
Reviewed By: hsaito
Subscribers: dcaballe, hsaito, llvm-commits
Differential Revision: https://reviews.llvm.org/D46191
llvm-svn: 331269
phi is on lhs of a comparison op.
For the following testcase,
L1:
%t0 = add i32 %m, 7
%t3 = icmp eq i32* %t2, null
br i1 %t3, label %L3, label %L2
L2:
%t4 = load i32, i32* %t2, align 4
br label %L3
L3:
%t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ]
%t6 = icmp eq i32 %t0, %t5
br i1 %t6, label %L4, label %L5
We know if we go through the path L1 --> L3, %t6 should always be true. However
currently, if the rhs of the eq comparison is phi, JumpThreading fails to
evaluate %t6 to true. And we know that Instcombine cannot guarantee always
canonicalizing phi to the left hand side of the comparison operation according
to the operand priority comparison mechanism in instcombine. The patch handles
the case when rhs of the comparison op is a phi.
Differential Revision: https://reviews.llvm.org/D46275
llvm-svn: 331266
unswitch and replace it with the amazingly simple update API code.
This addresses piles of FIXMEs around the update logic here and makes
everything substantially simpler.
llvm-svn: 331247
code review.
It turns out this *is* necessary, and I read the comment on the API
correctly the first time. ;]
The `applyUpdates` routine requires that updates are "balanced". This is
in order to cleanly handle cycles like inserting, removing, nad then
re-inserting the same edge. This precludes inserting the same edge
multiple times in a row as handling that would cause the insertion logic
to become *ordered* instead of *unordered* (which is what the API
provides).
It happens that in this specific case nothing (other than an assert and
contract violation) goes wrong because we're never inserting and
removing the same edge. The implementation *happens* to do the right
thing to eliminate redundant insertions in that case.
But the requirement is there and there is an assert to catch it.
Somehow, after the code review I never did another asserts-clang build
testing loop-unswich for a long time. As a consequence, I didn't notice
this despite a bunch of testing going on, but it shows up immediately
with an asserts build of clang itself.
llvm-svn: 331246
This patch updates some code responsible the skip debug info to use
BasicBlock::instructionsWithoutDebug. I think this makes things slightly
simpler and more direct.
Reviewers: aprantl, vsk, hans, danielcdh
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D46252
llvm-svn: 331221
This patch updates some code responsible the skip debug info to use
BasicBlock::instructionsWithoutDebug. I think this makes things slightly
simpler and more direct.
Reviewers: aprantl, vsk, chandlerc
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D46253
llvm-svn: 331217
Summary:
As discussed in D45733, we want to do this in InstCombine.
https://rise4fun.com/Alive/LGk
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: chandlerc, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D45867
llvm-svn: 331205
See r331124 for how I made a list of files missing the include.
I then ran this Python script:
for f in open('filelist.txt'):
f = f.strip()
fl = open(f).readlines()
found = False
for i in xrange(len(fl)):
p = '#include "llvm/'
if not fl[i].startswith(p):
continue
if fl[i][len(p):] > 'Config':
fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
found = True
break
if not found:
print 'not found', f
else:
open(f, 'w').write(''.join(fl))
and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.
No intended behavior change.
llvm-svn: 331184
This patch updates some code responsible the skip debug info to use
BasicBlock::instructionsWithoutDebug. I think this makes things
slightly simpler and more direct.
Reviewers: mkuper, rengolin, dcaballe, aprantl, vsk
Reviewed By: rengolin
Differential Revision: https://reviews.llvm.org/D46254
llvm-svn: 331174
Summary:
This is a follow up to D45420 (included here since it is still under review and this change is dependent on that) and D45072 (committed).
Actual change for this patch is LoopVectorize* and cmakefile. All others are all from D45420.
LoopVectorizationLegality is an analysis and thus really belongs to Analysis tree. It is modular enough and it is reusable enough ---- we can further improve those aspects once uses outside of LV picks up.
Hopefully, this will make it easier for people familiar with vectorization theory, but not necessarily LV itself to contribute, by lowering the volume of code they should deal with. We probably should start adding some code in LV to check its own capability (i.e., vectorization is legal but LV is not ready to handle it) and then bail out.
Reviewers: rengolin, fhahn, hfinkel, mkuper, aemerson, mssimpso, dcaballe, sguggill
Reviewed By: rengolin, dcaballe
Subscribers: egarcia, rogfer01, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D45552
llvm-svn: 331139
Summary:
Masked merge has a pattern of: `((x ^ y) & M) ^ y`.
But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`,
We should canonicalize the pattern to non-inverted mask.
https://rise4fun.com/Alive/Yol
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45664
llvm-svn: 331112
The effect of doing so is not disrupting the LoopPassManager when mixing this pass with other loop passes. This should help locality of access substaintially and avoids the cost of computing PostDom.
The assumption here is that the full GuardWidening (which does use PostDom) is run as a canonicalization before loop opts and that this version is just catching cases exposed by other loop passes. (i.e. LoopPredication, IndVarSimplify, LoopUnswitch, etc..)
llvm-svn: 331094
This patch adds support for fragment expressions
TryToShrinkGlobalToBoolean() which were previously just dropped.
Thanks to Reid Kleckner for providing me a reproducer!
llvm-svn: 331086
Summary:
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the `LHS` and `RHS` matchers:
1. match `RHS` matcher to the `first` operand of binary operator,
2. and then match `LHS` matcher to the `second` operand of binary operator.
This works ok.
But it complicates writing of commutative matchers, where one would like to match
(`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side.
This is additionally complicated by the fact that `m_Specific()` stores the `Value *`,
not `Value **`, so it won't work at all out of the box.
The last problem is trivially solved by adding a new `m_c_Specific()` that stores the
`Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing
one because i guess all the current users are ok with existing behavior,
and this additional pointer indirection may have performance drawbacks.
Also, i'm storing pointer, not reference, because for some mysterious-to-me reason
it did not work with the reference.
The first one appears trivial, too.
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ **operands**:
1. match ~~`RHS`~~ **`LHS`** matcher to the ~~`first`~~ **`second`** operand of binary operator,
2. and then match ~~`LHS`~~ **`RHS`** matcher to the ~~`second`~ **`first`** operand of binary operator.
Surprisingly, `$ ninja check-llvm` still passes with this.
But i expect the bots will disagree..
The motivational unittest is included.
I'd like to use this in D45664.
Reviewers: spatel, craig.topper, arsenm, RKSimon
Reviewed By: craig.topper
Subscribers: xbolva00, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D45828
llvm-svn: 331085
The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. the motivation is purely to reduce compile time since we can now preserve locality during the loop walk.
This patch only includes a legacy pass. A follow up will add a new style pass as well.
llvm-svn: 331060
We currently support LCSSA PHI nodes in the outer loop exit, if their
incoming values do not come from the outer loop latch or if the
outer loop latch has a single predecessor. In that case, the outer loop latch
will be executed only if the inner loop gets executed. If we have multiple
predecessors for the outer loop latch, it may be executed even if the inner
loop does not get executed.
This is a first step to support the case described in
https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43237
llvm-svn: 331037
It doesn't unwind, and the wrong marking leads to the creation of an
.eh_frame section when it isn't necessary.
Differential Revision: https://reviews.llvm.org/D46082
llvm-svn: 331008
Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed,
Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer
Subscribers: lebedev.ri, llvm-commits
Differential Revision: https://reviews.llvm.org/D45736
llvm-svn: 331002
Summary:
Simplify integer add expression X % C0 + (( X / C0 ) % C1) * C0 to
X % (C0 * C1). This is a common pattern seen in code generated by the XLA
GPU backend.
Add test cases for this new optimization.
Patch by Bixia Zheng!
Reviewers: sanjoy
Reviewed By: sanjoy
Subscribers: efriedma, craig.topper, lebedev.ri, llvm-commits, jlebar
Differential Revision: https://reviews.llvm.org/D45976
llvm-svn: 330992
Summary:
Follow-up to D43690, the EliminateAvailableExternally pass currently
runs under -O0 and -O2 and up. Under -O1 we would still want to drop
available_externally symbols to reduce space without inlining having
run.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: mehdi_amini, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D46093
llvm-svn: 330961
Summary:
When performing indirect call promotion, current implementation inspects "all" parameters of the callsite and attemps to match with the formal argument type of the callee function. However, it is not possible to find the type for variable length arguments, and the compiler crashes when it attemps to match the type for variable lenght argument.
It seems that the bug is introduced with D40658. Prior to that, the type matching is performed only for the parameters whose ID is less than callee->getFunctionNumParams(). The attached test case will crash without the patch.
Reviewers: mssimpso, davidxl, davide
Reviewed By: mssimpso
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46026
llvm-svn: 330844
As discussed in D45862, we want to delete parts of
this code because it can create more instructions
than it removes. But we also want to preserve some
folds that are winners, so tidy up what's here to
make splitting the good from bad a bit easier.
llvm-svn: 330841
This also means we have to check if the latch is the exiting block now,
as `transform` expects the latches to be the exiting blocks too.
https://bugs.llvm.org/show_bug.cgi?id=36586
Reviewers: efriedma, davide, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45279
llvm-svn: 330806
Summary:
When Reassociate is rewriting an expression tree it may
reuse old binary expression nodes, for new expressions.
Whenever an expression node is reused, but with a non-trivial
change in the result, we need to invalidate any debug info
that is associated with the node.
If for example rewriting
x = mul a, b
y = mul c, x
into
x = mul c, b
y = mul a, x
we still get the same result for 'y', but 'x' is a new expression.
All debug info referring to 'x' must be invalidated (marked as
optimized out) since we no longer calculate the expected value.
As a side-effect this patch avoid (at least some) problems where
reassociate could end up creating IR with debug-use before def.
Earlier the dbg.value nodes where left untouched in the IR, while
the reused binary nodes where sinked to just before the root node
of the rewritten expression tree. See PR27273 for more info about
such problems.
Reviewers: dblaikie, aprantl, dexonsmith
Reviewed By: aprantl
Subscribers: JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D45975
llvm-svn: 330804
Summary:
Use a MapVector instead of a DenseMap for RemMap since it is iteratated
over and the order of iteration can effect the order that new
instructions are created. This can in turn effect the use list order of
div/rem input values if multiple new instructions are created that share
any input values.
Reviewers: spatel
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D45858
llvm-svn: 330792
update API for dominators rather than doing manual, hacky updates.
This is just the first step, but in some ways the most important as it
moves the non-trivial unswitching to update the domtree rather than
fully recalculating it each time.
Subsequent patches should remove the custom update logic used by the
trivial unswitch and replace it with uses of the update API.
This also fixes a number of bugs I was seeing when testing non-trivial
unswitch due to it querying the quasi-correct dominator tree. Now the
tree is 100% correct and safe to query. That said, there are still more
bugs I can see with non-trivial unswitch just running over the test
suite, so more bugfix patches are needed as well.
Thanks to both Sanjoy and Fedor for reviews and testing!
Differential Revision: https://reviews.llvm.org/D45943
llvm-svn: 330787
Patch #2 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
This patch introduces the basic infrastructure to detect, legality check
and process outer loops annotated with hints for explicit vectorization.
All these changes are protected under the feature flag
-enable-vplan-native-path. This should make this patch NFC for the existing
inner loop vectorizer.
Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso.
Differential Revision: https://reviews.llvm.org/D42447
llvm-svn: 330739
After D43236, we started interchanging loops with empty dependence
matrices. In isProfitableForVectorization, we try to determine if
interchanging makes the loop dependences more friendly to the
vectorizer. If there are no dependences, we should not interchange,
based on that heuristic.
Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource
Reviewed By: mcrosier
Differential Revision: https://reviews.llvm.org/D45208
llvm-svn: 330738
The memory location an invariant load is using can never be clobbered by
any store, so it's safe to move the load ahead of the store.
Differential Revision: https://reviews.llvm.org/D46011
llvm-svn: 330725
loop unswitch.
This code incorrectly added the header to the loop block set early. As
a consequence we would incorrectly conclude that a nested loop body had
already been visited when the header of the outer loop was the preheader
of the nested loop. In retrospect, adding the header eagerly doesn't
really make sense. It seems nicer to let the cycle be formed naturally.
This will catch crazy bugs in the CFG reconstruction where we can't
correctly form the cycle earlier rather than later, and makes the rest
of the logic just fall out.
I've also added various asserts that make these issues *much* easier to
debug.
llvm-svn: 330707
This code path can very clearly be called in a context where we have
baselined all the cloned blocks to a particular loop and are trying to
handle nested subloops. There is no harm in this, so just relax the
assert. I've added a test case that will make sure we actually exercise
this code path.
llvm-svn: 330680
(notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be
included by InstCombine which doesn't/shouldn't need to depend on
ScalarOpts)
llvm-svn: 330669