There are two conditions ORed here with similar checks and each contain two matches that must be true for the if to succeed. With the commutable match on the first half of the OR then both ifs basically have the same first part and only the second part distinguishs. With this change we move the commutable match to second half and make the first half unique.
This caused some tests to change because we now produce a commuted result, but this shouldn't matter in practice.
llvm-svn: 306800
Summary:
As discussed on the mailing list it is legal to propagate TBAA to loads/stores
from/to smaller regions of a larger load tagged with TBAA. Do so for
(load->extractvalue)=>(gep->load) and similar foldings.
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D31954
llvm-svn: 306615
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.
This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.
Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D33186
llvm-svn: 306525
The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64-bits are less.
This patch changes it to use m_APInt to remove both these issues
Differential Revision: https://reviews.llvm.org/D34699
llvm-svn: 306457
This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform.
We have this transform for icmp+br, so unless there's some reason that icmp+select should be
treated differently, we should do the same thing here.
The benefit comes from increasing the chances of creating identical instructions. This is shown in
the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE
can simplify the identical cmps, and then InstCombine can fold the selects together.
The possible regression for the tests in select.ll raises questions about poison/undef:
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html
...but that transform is just as likely to be triggered by this canonicalization as it is to be
missed, so we're just pointing out a commutation deficiency in the pattern matching:
https://reviews.llvm.org/rL228409
Differential Revision: https://reviews.llvm.org/D34242
llvm-svn: 306435
metadata out of InstCombine and into helpers.
NFC, this just exposes the logic used by InstCombine when propagating
metadata from one load instruction to another. The plan is to use this
in SROA to address PR32902.
If anyone has better ideas about how to factor this or name variables,
I'm all ears, but this seemed like a pretty good start and lets us make
progress on the PR.
This is based on a patch by Ariel Ben-Yehuda (D34285).
llvm-svn: 306267
http://rise4fun.com/Alive/i8Q
A narrow bitwise logic op is obviously better than math for value tracking,
and zext is better than sext. Typically, the 'not' will be folded into an
icmp predicate.
The IR difference would even survive through codegen for x86, so we would see
worse code:
https://godbolt.org/g/C14HMF
one_or_zero(int, int): # @one_or_zero(int, int)
xorl %eax, %eax
cmpl %esi, %edi
setle %al
retq
one_or_zero_alt(int, int): # @one_or_zero_alt(int, int)
xorl %ecx, %ecx
cmpl %esi, %edi
setg %cl
movl $1, %eax
subl %ecx, %eax
retq
llvm-svn: 306243
Summary:
InstCombine replaces large allocas with small globals consts causing buffer overflows
on valid code, see PR33372.
This fix permits this optimization only if the global is dereference for alloca size.
Fixes PR33372
Reviewers: eugenis, majnemer, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34311
llvm-svn: 306194
Summary:
Many languages have a three way comparison idiom where comparing two values
produces not a boolean, but a tri-state value. Typical values (e.g. as used in
the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1
for greater than.
We actually do a great job already of converting three way comparisons into
binary comparisons when the result produced has one a single use. Unfortunately,
such values can have more than one use, and in that case, our existing
optimizations break down.
The patch adds a peephole which converts a three-way compare + test idiom into a
binary comparison on the original inputs. It focused on replacing the test on
the result of the three way compare and does nothing about removing the three
way compare itself. That's left to other optimizations (which do actually kick
in commonly.)
We currently recognize one idiom on signed integer compare. In the future, we
plan to recognize and simplify other comparison idioms on
other signed/unsigned datatypes such as floats, vectors etc.
This is a resurrection of Philip Reames' original patch:
https://reviews.llvm.org/D19452
Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev
Reviewed by: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34278
llvm-svn: 306100
Summary:
InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y).
This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used.
Reviewers: spatel, majnemer, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34184
llvm-svn: 306029
If the components of the and/or had multiple uses, this transform created an additional instruction.
This patch makes sure we remove one of the components.
Differential Revision: https://reviews.llvm.org/D34498
llvm-svn: 306027
There are 2 parts to this patch made simultaneously to avoid a regression.
We're reversing the canonicalization that moves bitwise vector ops before bitcasts.
We're moving bitwise vector ops *after* bitcasts instead. That's the 1st and 3rd hunks
of the patch. The motivation is that there's only one fold that currently depends on
the existing canonicalization (see next), but there are many folds that would
automatically benefit from the new canonicalization.
PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these
patterns in IR.
There's an or(and,andn) pattern that requires an adjustment in order to continue matching
to 'select' because the bitcast changes position. This match is unfortunately complicated
because it requires 4 logic ops with optional bitcast and sext ops.
Test diffs:
1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference -
bitcast comes before logic.
2. There are also tests with no diffs in bitcast.ll that verify that we're still doing
folds that were enabled by the previous canonicalization.
3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns
to look through bitcasts.
4. logical-select.ll contains several tests for the or(and,andn) --> select fold to
verify that we are still handling those cases. The lone diff shows the movement of
the bitcast from the new canonicalization rule.
Differential Revision: https://reviews.llvm.org/D33517
llvm-svn: 306011
Summary:
I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know.
This patch adds range metadata to these intrinsics based on the known bits.
Currently the patch punts if we already have range metadata present.
Reviewers: spatel, RKSimon, davide, majnemer
Reviewed By: RKSimon
Subscribers: sanjoy, hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D32582
llvm-svn: 305927
Summary:
Previously this folding had no checks to see if it was going to result in less instructions. This was pointed out during the review of D34184
This patch adds code to count how many instructions its going to create vs how many its going to remove so we can make a proper decision.
Reviewers: spatel, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34437
llvm-svn: 305926
We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine,
but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds,
we can use the truth table definition of xor:
X ^ Y --> (X | Y) & !(X & Y)
...to see if we can convert the xor to and/or and then use the existing folds.
http://rise4fun.com/Alive/J9v
Differential Revision: https://reviews.llvm.org/D33342
llvm-svn: 305792
Summary:
Some optimizations in AddReachableCodeToWorklist did not update
the MadeIRChange state. This could happen both when removing
trivially dead instructions (DCE) and at constant folds.
It is essential that changes to the IR is reported correctly,
since for example InstCombinePass::run() will indicate that all
analyses are preserved otherwise.
And the CGPassManager determines if the CallGraph is up-to-date
based on status from InstructionCombiningPass::runOnFunction().
The new test case early_dce_clobbers_callgraph.ll is a reproducer
for some asserts that started to trigger after changes in the
inliner in r305245. With this patch the test case passes again.
Reviewers: sanjoy, craig.topper, dblaikie
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34346
llvm-svn: 305725
Summary:
These 4 patterns have the same one use check repeated twice for each. Once without a cast and one with. But the cast has no effect on what method is called.
For the OR case I believe it is always profitable regardless of the number of uses since we'll never increase the instruction count.
For the AND case I believe it is profitable if the pair of xors has one use such that we'll get rid of it completely. Or if the C value is something freely invertible, in which case the not doesn't cost anything.
Reviewers: spatel, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34308
llvm-svn: 305705
Summary:
When we fold vector constants that are operands of phi's that feed into select,
we need to set the correct insertion point for the *new* selects that get generated.
The correct insertion point is the incoming block for the phi.
Such cases can occur with patch r298845, which fixed folding of
vector constants, but the new selects could be inserted incorrectly (as the added
test case shows).
Reviewers: majnemer, spatel, sanjoy
Reviewed by: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34162
llvm-svn: 305591
Summary:
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic.
Reviewers: reames, sanjoy, efriedma
Reviewed By: reames
Subscribers: mzolotukhin, anna, llvm-commits, skatkov
Differential Revision: https://reviews.llvm.org/D33240
llvm-svn: 305558
Summary: This is the demorganed version of the case we already handle for the OR of iszero.
Reviewers: spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34244
llvm-svn: 305548
Currently we expect A to be on the same side in both Ands but nothing guarantees that.
While there also switch to using matchers for some of the code.
Differential Revision: https://reviews.llvm.org/D34230
llvm-svn: 305487
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.
Reviewers: dberlin, hfinkel, spatel
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33604
llvm-svn: 305049
This was discussed in D33338. We have larger pattern-matching ending in a truncate that
we can reduce or remove by handling these smaller patterns first. Further motivation is
that narrower shift ops are easier for value tracking and zext is better than sext.
http://rise4fun.com/Alive/rhh
Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7
=>
%r = zext i1 %x to i8
Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7
=>
%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8
Differential Revision: https://reviews.llvm.org/D33879
llvm-svn: 304939
I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used.
llvm-svn: 304875
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.
Patch by Karl Hylen.
Differential Revision: https://reviews.llvm.org/D33449
llvm-svn: 304701
Summary:
Fairly straightforward patch to fill in some of the holes in the
attributes API with respect to accessing parameter/argument attributes.
The patch aims to step further towards encapsulating the
idx+FirstArgIndex pattern to access these attributes to within the
AttributeList.
Patch by Daniel Neilson!
Reviewers: rnk, chandlerc, pete, javed.absar, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33355
llvm-svn: 304329
Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent.
Differential Revision: https://reviews.llvm.org/D33567
llvm-svn: 304018
We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo.
llvm-svn: 303924