Commit Graph

1369 Commits

Author SHA1 Message Date
Tilmann Scheller 2bc5cb687b [InstCombine] Reformat if statements to comply with LLVM Coding Standards.
Patch by Sonam Kumari!

Differential Revision: http://reviews.llvm.org/D5643

llvm-svn: 219198
2014-10-07 10:19:34 +00:00
Gerolf Hoflehner c0b4c20e5e [InstCombine] re-commit r218721 icmp-select-icmp optimization
Takes care of the assert that caused build fails.
Rather than asserting the code checks now that the definition
and use are in the same block, and does not attempt
to optimize when that is not the case.

llvm-svn: 219175
2014-10-07 00:16:12 +00:00
Hal Finkel 4564688806 [InstCombine] Simplify the logic from r219067 using ValueTracking
Joerg suggested on IRC that I look at generalizing the logic from r219067 to
handle more general redundancies (like removing an assume(x > 3) dominated by
an assume(x > 5)). The way to do this would be to ask ValueTracking to
determine the value of the i1 argument. It turns out that ValueTracking is not
very good at this right now (although it does get the trivial redundancy case)
because it does not understand ICmps. Nevertheless, the resulting code in
InstCombine is simpler than r219067, so we might as well do it now.

llvm-svn: 219070
2014-10-05 00:53:02 +00:00
Hal Finkel 04a156139e [InstCombine] Remove redundant @llvm.assume intrinsics
For any @llvm.assume intrinsic, if there is another which dominates it and uses
the same condition, then it is redundant and can be removed. While this does
not alter the semantics of the @llvm.assume intrinsics, it makes subsequent
handling more efficient (and the resulting IR easier to read).

llvm-svn: 219067
2014-10-04 21:27:06 +00:00
Sanjay Patel 12d1ce5408 Optimize square root squared (PR21126).
When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X.

This can happen in the real world when calculating x ** 3/2. This occurs
in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c.

Differential Revision: http://reviews.llvm.org/D5584

llvm-svn: 218906
2014-10-02 21:10:54 +00:00
Sanjay Patel b41d46118a Use the local variable that other clauses around here are already using.
llvm-svn: 218876
2014-10-02 15:20:45 +00:00
Evgeniy Stepanov 815f2869ad Revert r218721, r218735.
Failing bootstrap on Linux (arm, x86).

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13139/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/470
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/8518

llvm-svn: 218752
2014-10-01 10:07:28 +00:00
Gerolf Hoflehner 19fc3dafc8 [InstCombine] Fix for assert build failures caused by r218721
The icmp-select-icmp optimization made the implicit assumption
that the select-icmp instructions are in the same block and asserted on it.
The fix explicitly checks for that condition and conservatively suppresses
the optimization when it is violated.

llvm-svn: 218735
2014-10-01 03:24:39 +00:00
Gerolf Hoflehner 08cc4b950c [InstCombine] Optimize icmp-select-icmp
In special cases select instructions can be eliminated by
replacing them with a cheaper bitwise operation even when the
select result is used outside its home block. The instances implemented
are patterns like
    %x=icmp.eq
    %y=select %x,%r, null
    %z=icmp.eq|neq %y, null
    br %z,true, false
==> %x=icmp.ne
    %y=icmp.eq %r,null
    %z=or %x,%y
    br %z,true,false
The optimization is integrated into the instruction
combiner and performed only when all uses of the select result can
be replaced by the select operand proper. For this dominator information
is used and dominance is now a required analysis pass in the combiner.
The optimization itself is iterative. The critical step is to replace the
select result with the non-constant select operand. So the select becomes
local and the combiner iteratively works out simpler code pattern and
eventually eliminates the select.

rdar://17853760

llvm-svn: 218721
2014-10-01 00:13:22 +00:00
David Blaikie dba94ec3c7 Reapply fix in r217988 (reverted in r217989) and remove the alternative fix committed in r217987.
This type isn't owned polymorphically (as demonstrated by making the
dtor protected and everything still compiling) so just address the
warning by protecting the base dtor and making the derived class final.

llvm-svn: 217990
2014-09-17 22:27:36 +00:00
David Blaikie d8978ec085 Revert "Fix -Wnon-virtual-dtor warning introduced in r217982."
An alternative fix was already committed.

This reverts commit r217988.

llvm-svn: 217989
2014-09-17 22:17:59 +00:00
David Blaikie 20dd05ccfd Fix -Wnon-virtual-dtor warning introduced in r217982.
llvm-svn: 217988
2014-09-17 22:15:40 +00:00
Chris Bieneman ad070d0588 Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code.
Summary: This eliminates ~200 lines of code mostly file scoped struct definitions that were unnecessary.

Reviewers: chandlerc, resistor

Reviewed By: resistor

Subscribers: morisset, resistor, llvm-commits

Differential Revision: http://reviews.llvm.org/D5364

llvm-svn: 217982
2014-09-17 20:55:46 +00:00
Andrea Di Biagio 5b92b4971a [InstCombine] Fix wrong folding of constant comparison involving ahsr and negative quantities (PR20945).
Example:
define i1 @foo(i32 %a) {
  %shr = ashr i32 -9, %a
  %cmp = icmp ne i32 %shr, -5
  ret i1 %cmp
}

Before this fix, the instruction combiner wrongly thought that %shr
could have never been equal to -5. Therefore, %cmp was always folded to 'true'.
However, when %a is equal to 1, then %cmp evaluates to 'false'. Therefore,
in this example, it is not valid to fold %cmp to 'true'.
The problem was only affecting the case where the comparison was between
negative quantities where one of the quantities was obtained from arithmetic
shift of a negative constant.

This patch fixes the problem with the wrong folding (fixes PR20945).
With this patch, the 'icmp' from the example is now simplified to a
comparison between %a and 1. This still allows us to get rid of the arithmetic
shift (%shr).

llvm-svn: 217950
2014-09-17 11:32:31 +00:00
Hal Finkel 93873cc10e Check for all known bits on ret in InstCombine
From a combination of @llvm.assume calls (and perhaps through other means, such
as range metadata), it is possible that all bits of a return value might be
known. Previously, InstCombine did not check for this (which is understandable
given assumptions of constant propagation), but means that we'd miss simple
cases where assumptions are involved.

llvm-svn: 217346
2014-09-07 21:28:34 +00:00
Hal Finkel 15aeaaf24a Add additional patterns for @llvm.assume in ValueTracking
This builds on r217342, which added the infrastructure to compute known bits
using assumptions (@llvm.assume calls). That original commit added only a few
patterns (to catch common cases related to determining pointer alignment); this
change adds several other patterns for simple cases.

r217342 contained that, for assume(v & b = a), bits in the mask
that are known to be one, we can propagate known bits from the a to v. It also
had a known-bits transfer for assume(a = b). This patch adds:

assume(~(v & b) = a) : For those bits in the mask that are known to be one, we
                       can propagate inverted known bits from the a to v.

assume(v | b = a) :    For those bits in b that are known to be zero, we can
                       propagate known bits from the a to v.

assume(~(v | b) = a):  For those bits in b that are known to be zero, we can
                       propagate inverted known bits from the a to v.

assume(v ^ b = a) :    For those bits in b that are known to be zero, we can
		       propagate known bits from the a to v. For those bits in
		       b that are known to be one, we can propagate inverted
                       known bits from the a to v.

assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can
		       propagate inverted known bits from the a to v. For those
		       bits in b that are known to be one, we can propagate
                       known bits from the a to v.

assume(v << c = a) :   For those bits in a that are known, we can propagate them
                       to known bits in v shifted to the right by c.

assume(~(v << c) = a) : For those bits in a that are known, we can propagate
                        them inverted to known bits in v shifted to the right by c.

assume(v >> c = a) :   For those bits in a that are known, we can propagate them
                       to known bits in v shifted to the right by c.

assume(~(v >> c) = a) : For those bits in a that are known, we can propagate
                        them inverted to known bits in v shifted to the right by c.

assume(v >=_s c) where c is non-negative: The sign bit of v is zero

assume(v >_s c) where c is at least -1: The sign bit of v is zero

assume(v <=_s c) where c is negative: The sign bit of v is one

assume(v <_s c) where c is non-positive: The sign bit of v is one

assume(v <=_u c): Transfer the known high zero bits

assume(v <_u c): Transfer the known high zero bits (if c is know to be a power
                 of 2, transfer one more)

A small addition to InstCombine was necessary for some of the test cases. The
problem is that when InstCombine was simplifying and, or, etc. it would fail to
check the 'do I know all of the bits' condition before checking less specific
conditions and would not fully constant-fold the result. I'm not sure how to
trigger this aside from using assumptions, so I've just included the change
here.

llvm-svn: 217343
2014-09-07 19:21:07 +00:00
Hal Finkel 60db05896a Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.

As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.

The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.

Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.

This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).

llvm-svn: 217342
2014-09-07 18:57:58 +00:00
Hal Finkel 74c2f355d2 Add an Assumption-Tracking Pass
This adds an immutable pass, AssumptionTracker, which keeps a cache of
@llvm.assume call instructions within a module. It uses callback value handles
to keep stale functions and intrinsics out of the map, and it relies on any
code that creates new @llvm.assume calls to notify it of the new instructions.
The benefit is that code needing to find @llvm.assume intrinsics can do so
directly, without scanning the function, thus allowing the cost of @llvm.assume
handling to be negligible when none are present.

The current design is intended to be lightweight. We don't keep track of
anything until we need a list of assumptions in some function. The first time
this happens, we scan the function. After that, we add/remove @llvm.assume
calls from the cache in response to registration calls and ValueHandle
callbacks.

There are no new direct test cases for this pass, but because it calls it
validation function upon module finalization, we'll pick up detectable
inconsistencies from the other tests that touch @llvm.assume calls.

This pass will be used by follow-up commits that make use of @llvm.assume.

llvm-svn: 217334
2014-09-07 12:44:26 +00:00
David Majnemer 6fe6ea740c InstCombine: Remove a special case pattern
The special case did not work when run under -reassociate and can easily
be expressed by a further generalization of an existing pattern.

llvm-svn: 217227
2014-09-05 06:09:24 +00:00
David Majnemer d2df50196f Revert "Revert two GEP-related InstCombine commits"
This reverts commit r216698 which reverted r216523 and r216598.

We would attempt to perform the transformation even if the match()
failed because, as a side effect, it would set V.  This would trick us
into believing that we correctly found a place to correctly apply the
transform.

An additional test case was added to getelementptr.ll so that we might
not regress in the future.

llvm-svn: 216890
2014-09-01 21:10:02 +00:00
David Majnemer 492e612e01 InstCombine: Respect recursion depth in visitUDivOperand
llvm-svn: 216817
2014-08-30 09:19:05 +00:00
David Majnemer 5e96f1b4c8 InstCombine: Try harder to combine icmp instructions
consider: (and (icmp X, Y), (and Z, (icmp A, B)))
It may be possible to combine (icmp X, Y) with (icmp A, B).
If we successfully combine, create an 'and' instruction with Z.

This fixes PR20814.

N.B. There is room for improvement after this change but I'm not
convinced it's worth chasing yet.

llvm-svn: 216814
2014-08-30 06:18:20 +00:00
David Majnemer 400e725bde Revert two GEP-related InstCombine commits
This reverts commit r216523 and r216598; people have reported
regressions.

llvm-svn: 216698
2014-08-29 00:06:43 +00:00
David Majnemer 074052b623 InstCombine: Remove redundant combines
InstSimplify already handles icmp (X+Y), X (and things like it)
appropriately.  The first thing that InstCombine does is run
InstSimplify on the instruction.

llvm-svn: 216659
2014-08-28 10:08:37 +00:00
David Majnemer 76d06bc613 InstSimplify: Move a transform from InstCombine to InstSimplify
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions.  Move them to InstSimplify;
while we are at it, make them more powerful.

llvm-svn: 216642
2014-08-28 03:34:28 +00:00
David Majnemer 22ccfc4484 InstCombine: Combine gep X, (Y-X) to Y
We try to perform this transform in InstSimplify but we aren't always
able to.  Sometimes, we need to insert a bitcast if X and Y don't have
the same time.

llvm-svn: 216598
2014-08-27 20:08:37 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
David Majnemer 54e97d5dc0 InstCombine: Optimize GEP's involving ptrtoint better
We supported transforming:
(gep i8* X, -(ptrtoint Y))

to:
(inttoptr (sub (ptrtoint X), (ptrtoint Y)))

However, this only fired if 'X' had type i8*.  Generalize this to
support various types of different sizes.  This results in much better
CodeGen, especially for pointers to packed structs.

llvm-svn: 216523
2014-08-27 05:16:04 +00:00
Dinesh Dwivedi 4919bbe29d This patch enables SimplifyUsingDistributiveLaws() to handle following pattens.
(X >> Z) & (Y >> Z)  -> (X&Y) >> Z  for all shifts.
(X >> Z) | (Y >> Z)  -> (X|Y) >> Z  for all shifts.
(X >> Z) ^ (Y >> Z)  -> (X^Y) >> Z  for all shifts.

These patterns were previously handled separately in visitAnd()/visitOr()/visitXor().

Differential Revision: http://reviews.llvm.org/D4951

llvm-svn: 216443
2014-08-26 08:53:32 +00:00
David Majnemer 0ffccf7fb5 InstCombine: Properly optimize or'ing bittests together
CFE, with -03, would turn:
bool f(unsigned x) {
  bool a = x & 1;
  bool b = x & 2;
  return a | b;
}

into:
  %1 = lshr i32 %x, 1
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

This sort of thing exposes a nasty pathology in GCC, ICC and LLVM.

Instead, we would rather want:
  %1 = and i32 %x, 3
  %2 = icmp ne i32 %1, 0

Things get a bit more interesting in the following case:
  %1 = lshr i32 %x, %y
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

Replacing it with the following sequence is better:
  %1 = shl nuw i32 1, %y
  %2 = or i32 %1, 1
  %3 = and i32 %2, %x
  %4 = icmp ne i32 %3, 0

This sequence is preferable because %1 doesn't involve %x and could
potentially be hoisted out of loops if it is invariant; only perform
this transform in the non-constant case if we know we won't increase
register pressure.

llvm-svn: 216343
2014-08-24 09:10:57 +00:00
David Majnemer 49775e0173 InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants
Consider:
  %add = add nuw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nuw' from the
instruction.

llvm-svn: 216273
2014-08-22 17:11:04 +00:00
David Majnemer 0e6c986696 InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN
We can preserve nsw during this transform if -C won't overflow.

llvm-svn: 216269
2014-08-22 16:41:23 +00:00
David Majnemer 42b83a5e36 InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants
Consider:
  %add = add nsw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nsw' from the
instruction.

This fixes PR20377.

llvm-svn: 216261
2014-08-22 07:56:32 +00:00
Craig Topper 71b7b68b74 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 216158
2014-08-21 05:55:13 +00:00
David Majnemer 5d1aeba2ea InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1
Adapted from a patch by Richard Smith, test-case written by me.

llvm-svn: 216157
2014-08-21 05:14:48 +00:00
Yi Jiang 1a4e73d7bf New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition
llvm-svn: 216135
2014-08-20 22:55:40 +00:00
David Majnemer 42158f3eea InstCombine: Annotate sub with nuw when we prove it's safe
We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is
negative and the right-hand side is non-negative.

llvm-svn: 216045
2014-08-20 07:17:31 +00:00
David Majnemer 57d5bc8849 InstCombine: Annotate sub with nsw when we prove it's safe
We can prove that a 'sub' can be a 'sub nsw' under certain conditions:
- The sign bits of the operands is the same.
- Both operands have more than 1 sign bit.

The subtraction cannot be a signed overflow in either case.

llvm-svn: 216037
2014-08-19 23:36:30 +00:00
Mayur Pandey 960507beb4 InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Differential Revision: http://reviews.llvm.org/D4898

llvm-svn: 215974
2014-08-19 08:19:19 +00:00
Mayur Pandey 75b76c6a92 test commit (spelling correction)
llvm-svn: 215970
2014-08-19 06:41:55 +00:00
Craig Topper 6230691c91 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

llvm-svn: 215870
2014-08-18 00:24:38 +00:00
Craig Topper 5229cfd163 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 215868
2014-08-17 23:47:00 +00:00
Owen Anderson a4428aa484 Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set.
While this might seem like an obvious canonicalization, there is one subtle problem with it.  The result of the original expression
is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN.  This means that the
new expression is strictly more defined than the original one.  One unfortunate consequence of this is that the transform is not reversible!
It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it.  Thus, targets that prefer the original
form of the expression cannot reverse the transform to recover it.  Another way to think of it is that the transform has lost source-level
information (the fast math flags), which is undesirable.

llvm-svn: 215825
2014-08-17 03:51:29 +00:00
David Majnemer 1a0bbc8a5c InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C)
While *most* (X sdiv 1) operations will get caught by InstSimplify, it
is still possible for a sdiv to appear in the worklist which hasn't been
simplified yet.

This means that it is possible for 0 - (X sdiv 1) to get transformed
into (X sdiv -1); dividing by -1 can make the transform produce undef
values instead of the proper result.

Sorry for the lack of testcase, it's a bit problematic because it relies
on the exact order of operations in the worklist.

llvm-svn: 215818
2014-08-16 09:23:42 +00:00
David Majnemer f9a095d606 InstCombine: Combine mul with div.
We can combne a mul with a div if one of the operands is a multiple of
the other:

%mul = mul nsw nuw %a, C1
%ret = udiv %mul, C2
  =>
%ret = mul nsw %a, (C1 / C2)

This can expose further optimization opportunities if we end up
multiplying or dividing by a power of 2.

Consider this small example:

define i32 @f(i32 %a) {
  %mul = mul nuw i32 %a, 14
  %div = udiv exact i32 %mul, 7
  ret i32 %div
}

which gets CodeGen'd to:

    imull       $14, %edi, %eax
    imulq       $613566757, %rax, %rcx
    shrq        $32, %rcx
    subl        %ecx, %eax
    shrl        %eax
    addl        %ecx, %eax
    shrl        $2, %eax
    retq

We can now transform this into:
define i32 @f(i32 %a) {
  %shl = shl nuw i32 %a, 1
  ret i32 %shl
}

which gets CodeGen'd to:

    leal        (%rdi,%rdi), %eax
    retq

This fixes PR20681.

llvm-svn: 215815
2014-08-16 08:55:06 +00:00
David Majnemer 698dca0b95 InstCombine: ((A | ~B) ^ (~A | B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A | ~B),(~A |B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Patch by Mayur Pandey!

Differential Revision: http://reviews.llvm.org/D4883

llvm-svn: 215621
2014-08-14 06:46:25 +00:00
David Majnemer f1eda23514 Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C)
Transform ((B | C) & A) | B --> B | (A & C)

Z3 Link: http://rise4fun.com/Z3/hP6p

Patch by Sonam Kumari!

Differential Revision: http://reviews.llvm.org/D4865

llvm-svn: 215619
2014-08-14 06:41:38 +00:00
Benjamin Kramer a7c40ef022 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

llvm-svn: 215558
2014-08-13 16:26:38 +00:00
Karthik Bhat a4a4db91be InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (add %a, %b)
Correctness proof of the transform using CVC3-

$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B;

$ cvc3 t.cvc
Valid.

llvm-svn: 215524
2014-08-13 05:13:14 +00:00
Matt Arsenault 4815f09bbe Allwo bitcast + struct GEP transform to work with addrspacecast
llvm-svn: 215467
2014-08-12 19:46:13 +00:00
David Majnemer ab07f00c64 InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b)
What follows bellow is a correctness proof of the transform using CVC3.

$ < t.cvc
A, B : BITVECTOR(32);

QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B);

$ cvc3 < t.cvc
Valid.

llvm-svn: 215400
2014-08-11 22:32:02 +00:00
Suyog Sarda 56c9a87035 This patch implements transform for pattern "(A & ~B) ^ (~A) -> ~(A & B)".
Differential Revision: http://reviews.llvm.org/D4653

llvm-svn: 214479
2014-08-01 05:07:20 +00:00
Suyog Sarda 1c6c2f69f7 This patch implements transform for pattern "(A | B) & ((~A) ^ B) -> (A & B)".
Differential Revision: http://reviews.llvm.org/D4628

llvm-svn: 214478
2014-08-01 04:59:26 +00:00
Suyog Sarda 52324c82cc This patch implements transform for pattern "( A & (~B)) | (A ^ B) -> (A ^ B)"
Differential Revision: http://reviews.llvm.org/D4652

llvm-svn: 214477
2014-08-01 04:50:31 +00:00
Suyog Sarda 16d646594e This patch implements transform for pattern "(A & B) | ((~A) ^ B) -> (~A ^ B)".
Patch Credit to Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4655

llvm-svn: 214476
2014-08-01 04:41:43 +00:00
David Majnemer a92687d636 InstCombine: Correctly propagate NSW/NUW for x-(-A) -> x+A
We can only propagate the nsw bits if both subtraction instructions are
marked with the appropriate bit.

N.B.  We only propagate the nsw bit in InstCombine because the nuw case
is already handled in InstSimplify.

This fixes PR20189.

llvm-svn: 214385
2014-07-31 04:49:29 +00:00
David Majnemer 42af3601c2 InstCombine: Simplify (A ^ B) or/and (A ^ B ^ C)
While we can already transform A | (A ^ B) into A | B, things get bad
once we have (A ^ B) | (A ^ B ^ Cst) because reassociation will morph
this into (A ^ B) | ((A ^ Cst) ^ B).  Our existing patterns fail once
this happens.

To fix this, we add a new pattern which looks through the tree of xor
binary operators to see that, in fact, there exists a redundant xor
operation.

What follows bellow is a correctness proof of the transform using CVC3.

$ cat t.cvc
A, B, C : BITVECTOR(64);

QUERY BVXOR(A, B) | BVXOR(BVXOR(B, C), A) = BVXOR(A, B) | C;
QUERY BVXOR(BVXOR(A, C), B) | BVXOR(A, B) = BVXOR(A, B) | C;

QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C;
QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C;

$ cvc3 < t.cvc
Valid.
Valid.
Valid.
Valid.

llvm-svn: 214342
2014-07-30 21:26:37 +00:00
Hal Finkel f5867a79c5 Canonicalization for @llvm.assume
Adds simple logical canonicalization of assumption intrinsics to instcombine,
currently:
 - invariant(a && b) -> invariant(a); invariant(b)
 - invariant(!(a || b)) -> invariant(!a); invariant(!b)

llvm-svn: 213977
2014-07-25 21:45:17 +00:00
Hal Finkel cc39b67530 AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

llvm-svn: 213859
2014-07-24 12:16:19 +00:00
Suyog Sarda 3a8c2c1e6c This patch implements optimization as mentioned in PR19753: Optimize comparisons with "ashr/lshr exact" of a constanst.
It handles the errors which were seen in PR19958 where wrong code was being emitted due to earlier patch.
Added code for lshr as well as non-exact right shifts.

It implements : 
(icmp eq/ne (ashr/lshr const2, A), const1)" ->
(icmp eq/ne A, Log2(const2/const1)) ->
(icmp eq/ne A, Log2(const2) - Log2(const1))

Differential Revision: http://reviews.llvm.org/D4068
 

llvm-svn: 213678
2014-07-22 19:19:36 +00:00
Suyog Sarda b60ec909ca Added InstCombine transform for pattern "(A & B) ^ (A ^ B) -> (A | B)"
Patch idea by Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4618

llvm-svn: 213677
2014-07-22 18:30:54 +00:00
Suyog Sarda d64faf6cae Added InstCombine Transform for patterns:
"((~A & B) | A) -> (A | B)" and "((A & B) | ~A) -> (~A | B)"

Original Patch credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4591

llvm-svn: 213676
2014-07-22 18:09:41 +00:00
Suyog Sarda 521237cad6 This patch implements transform for pattern "(A | B) ^ (~A) -> (A | ~B)".
Patch Credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4588

llvm-svn: 213662
2014-07-22 15:37:39 +00:00
Sanjay Patel fc3d8f0a78 fixed typo in comment
llvm-svn: 213614
2014-07-22 04:57:06 +00:00
Duncan P. N. Exon Smith 6c99015fe2 Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build.  I'll reply on the list in a moment.

llvm-svn: 213562
2014-07-21 17:06:51 +00:00
Manuel Jacob d11beffef4 [C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges.
Summary: This patch introduces two new iterator ranges and updates existing code to use it.  No functional change intended.

Test Plan: All tests (make check-all) still pass.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4481

llvm-svn: 213474
2014-07-20 09:10:11 +00:00
Suyog Sarda 68862414b5 Move ashr optimization from InstCombineShift to InstSimplify.
Refactor code, no functionality change, test case moved from instcombine to instsimplify.

Differential Revision: http://reviews.llvm.org/D4102
 

llvm-svn: 213231
2014-07-17 06:28:15 +00:00
Suyog Sarda de409fd798 Fix Typo (first commit to test commit access)
llvm-svn: 213228
2014-07-17 06:09:34 +00:00
Manuel Jacob 405fb18213 Utilize CastInst::CreatePointerBitCastOrAddrSpaceCast here.
llvm-svn: 213189
2014-07-16 20:13:45 +00:00
Manuel Jacob b4db99c76a Fix comment in InstCombiner::visitAddrSpaceCast.
In the original version of the patch the behaviour was like described in
the comment.  This behaviour was changed before committing it without
updating the comment.

llvm-svn: 213117
2014-07-16 01:34:21 +00:00
Matt Arsenault d0d6c0b4c9 Use pointer type cast helpers.
llvm-svn: 212963
2014-07-14 17:24:38 +00:00
Aditya Nandakumar 0b5a674243 When we sink an instruction, this can open up opportunity for the operands to be sunk - add them to the worklist
llvm-svn: 212847
2014-07-11 21:49:39 +00:00
Duncan P. N. Exon Smith 04934b0fec InstCombine: Fix a crash in Descale for multiply-by-zero
Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets
created as an argument to a GEP partway through an iteration, causing
-instcombine to optimize the GEP before the multiply.

rdar://problem/17615671

llvm-svn: 212742
2014-07-10 17:13:27 +00:00
Hal Finkel a995f92627 Feeding isSafeToSpeculativelyExecute its DataLayout pointer
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.

This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.

llvm-svn: 212720
2014-07-10 14:41:31 +00:00
Sanjay Patel 58814445d4 Fix for PR20059 (instcombine reorders shufflevector after instruction that may trap)
In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).

This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.

Differential Revision: http://reviews.llvm.org/D4424

llvm-svn: 212629
2014-07-09 16:34:54 +00:00
Sanjay Patel 70af1fdf9d fixed some typos
llvm-svn: 212495
2014-07-07 22:13:58 +00:00
Benjamin Kramer 6cbe670db8 Make helper functions static.
llvm-svn: 212460
2014-07-07 14:47:51 +00:00
Benjamin Kramer d0993e0077 InstCombine: Simplify code, no functionality change.
llvm-svn: 212449
2014-07-07 11:01:16 +00:00
Benjamin Kramer a420df2999 InstCombine: Strength reduce sadd.with.overflow into a regular nsw add if we can prove that it cannot overflow.
PR20194

llvm-svn: 212331
2014-07-04 10:22:21 +00:00
David Majnemer f28e2a4282 InstCombine: Optimize x/INT_MIN to x==INT_MIN
The result of x/INT_MIN is either 0 or 1, we can just use an icmp
instead.

llvm-svn: 212167
2014-07-02 06:42:13 +00:00
David Majnemer bdeef602e9 InstCombine: Don't turn -(x/INT_MIN) -> x/INT_MIN
It is not safe to negate the smallest signed integer, doing so yields
the same number back.

This fixes PR20186.

llvm-svn: 212164
2014-07-02 06:07:09 +00:00
Reid Kleckner 813dab2fc6 Optimize InstCombine stack memory consumption
This patch reduces the stack memory consumption of the InstCombine
function "isOnlyCopiedFromConstantGlobal() ", that in certain conditions
could overflow the stack because of excessive recursiveness.

For example, in a case like this:

%0 = alloca [50025 x i32], align 4
%1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0
store i32 0,                         i32* %1
%2 = getelementptr inbounds          i32* %1, i64 1
store i32 1,                         i32* %2
%3 = getelementptr inbounds          i32* %2, i64 1
store i32 2,                         i32* %3
%4 = getelementptr inbounds          i32* %3, i64 1
store i32 3,                         i32* %4
%5 = getelementptr inbounds          i32* %4, i64 1
store i32 4,                         i32* %5
%6 = getelementptr inbounds          i32* %5, i64 1
store i32 5,                         i32* %6
...

This piece of code crashes llvm when trying to apply instcombine on
desktop. On embedded devices this could happen with a much lower limit
of recursiveness.  Some instructions (getelementptr and bitcasts) make
the function recursively call itself on their uses, which is what makes
the example above consume so much stack (it becomes a recursive
depth-first tree visit with a very big depth).

The patch changes the algorithm to be semantically equivalent, but
iterative instead of recursive and the visiting order to be from a
depth-first visit to a breadth-first visit (visit all the instructions
of the current level before the ones of the next one).

Now if a lot of memory is required a heap allocation is done instead of
the the stack allocation, avoiding the possible crash.

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4355

Patch by Marcello Maggioni!  We don't generally commit large stress test
that look for out of memory conditions, so I didn't request that one be
added to the patch.

llvm-svn: 212133
2014-07-01 21:36:20 +00:00
Dinesh Dwivedi adc07739a9 Added instruction combine to transform few more negative values addition to subtraction (Part 3)
This patch enables transforms for

(x + (~(y | c) + 1) --> x - (y | c) if c is odd

Differential Revision: http://reviews.llvm.org/D4210

llvm-svn: 211881
2014-06-27 07:47:35 +00:00
Dinesh Dwivedi 99281a0615 This patch removed duplicate code for matching patterns
which are now handled in SimplifyUsingDistributiveLaws() 
(after r211261)

Differential Revision: http://reviews.llvm.org/D4253

llvm-svn: 211768
2014-06-26 08:57:33 +00:00
Dinesh Dwivedi a716173581 Added instruction combine to transform few more negative values addition to subtraction (Part 2)
This patch enables transforms for

(x + (~(y | c) + 1)   -->   x - (y | c) if c is even

Differential Revision: http://reviews.llvm.org/D4209

llvm-svn: 211765
2014-06-26 05:40:22 +00:00
Benjamin Kramer c96a7f88b9 InstCombine: Disable umul.with.overflow recognition for vectors.
It doesn't make a lot on most targets and the code isn't ready for it. PR20113.

llvm-svn: 211583
2014-06-24 10:47:52 +00:00
Benjamin Kramer 6de786666a InstCombine: Don't try to reorder shuffles where the mask is a ConstantExpr.
We can't analyze the individual values of a vector expression. PR20114.

llvm-svn: 211581
2014-06-24 10:38:10 +00:00
Dinesh Dwivedi 562fd7534c Added instruction combine to transform few more negative values addition to subtraction (Part 1)
This patch enables transforms for following patterns.
  (x + (~(y & c) + 1)   -->   x - (y & c)
  (x + (~((y >> z) & c) + 1)   -->   x - ((y>>z) & c)

Differential Revision: http://reviews.llvm.org/D3733

llvm-svn: 211266
2014-06-19 10:36:52 +00:00
Dinesh Dwivedi b62e52e1b5 Refactored and updated SimplifyUsingDistributiveLaws() to
* Find factorization opportunities using identity values.
 * Find factorization opportunities by treating shl(X, C) as mul (X, shl(C))
 * Keep NSW flag while simplifying instruction using factorization.

This fixes PR19263.

Differential Revision: http://reviews.llvm.org/D3799

llvm-svn: 211261
2014-06-19 08:29:18 +00:00
David Majnemer 6cf6c05322 InstCombine: Stop two transforms dueling
InstCombineMulDivRem has:
// Canonicalize (X+C1)*CI -> X*CI+C1*CI.

InstCombineAddSub has:
// W*X + Y*Z --> W * (X+Z)  iff W == Y

These two transforms could fight with each other if C1*CI would not fold
away to something simpler than a ConstantExpr mul.

The InstCombineMulDivRem transform only acted on ConstantInts until
r199602 when it was changed to operate on all Constants in order to
let it fire on ConstantVectors.

To fix this, make this transform more careful by checking to see if we
actually folded away C1*CI.

This fixes PR20079.

llvm-svn: 211258
2014-06-19 07:14:33 +00:00
Nick Lewycky 8561a49c27 Move optimization of some cases of (A & C1)|(B & C2) from instcombine to instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs.
llvm-svn: 211252
2014-06-19 03:51:46 +00:00
Nick Lewycky 802df52424 Remove redundant code in InstCombineShift, no functionality change because instsimplify already does this and instcombine calls instsimplify a few lines above. Patch by Suyog Sarda!
llvm-svn: 211250
2014-06-19 03:28:28 +00:00
Matt Arsenault a0050b0961 R600/SI: Add intrinsics for various math instructions.
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.

llvm-svn: 211247
2014-06-19 01:19:19 +00:00
Jingyue Wu 33bd53df7f [InstCombine] mark ADD with nuw if no unsigned overflow
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove "add a, b" do not unsigned
overflow, and thus convert it to "add nuw a, b".

Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll

Test Plan: make check-all

Reviewers: eliben, rafael, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D4144

llvm-svn: 211084
2014-06-17 00:42:07 +00:00
Jingyue Wu baabe5091c Canonicalize addrspacecast ConstExpr between different pointer types
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.

Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
cosntant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.

Piggyback a minor refactor in InstCombineCasts.cpp

Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll and added one new test in
constant-fold-address-space-pointer.ll

llvm-svn: 211004
2014-06-15 21:40:57 +00:00
Dinesh Dwivedi 95f0d51bd3 This removes TODO added in http://reviews.llvm.org/D3658
The patch transforms

ABS(NABS(X)) -> ABS(X)
NABS(ABS(X)) -> NABS(X)

Differential Revision: http://reviews.llvm.org/D4040

llvm-svn: 210782
2014-06-12 14:06:00 +00:00
Matt Arsenault 44f60d0a60 Look through addrspacecasts when turning ptr comparisons into
index comparisons.

llvm-svn: 210488
2014-06-09 19:20:29 +00:00
Rafael Espindola 4ba22f0813 Revert 209903 and 210040.
The messages were

 "PR19753: Optimize comparisons with "ashr exact" of a constanst."
 "Added support to optimize comparisons with "lshr exact" of a constant."

They were not correctly handling signed/unsigned operation differences,
causing pr19958.

llvm-svn: 210393
2014-06-07 04:12:35 +00:00
Jingyue Wu 77145d9410 InstCombine: Canonicalize addrspacecast between different element types
addrspacecast X addrspace(M)* to Y addrspace(N)*

-->

bitcast X addrspace(M)* to Y addrspace(M)*
addrspacecast Y addrspace(M)* to Y addrspace(N)*

Updat all affected tests and add several new tests in addrspacecast.ll.

This patch is based on http://reviews.llvm.org/D2186 (authored by Matt
Arsenault) with fixes and more tests.

llvm-svn: 210375
2014-06-06 21:52:55 +00:00
Dinesh Dwivedi 3217b6c661 Added select flavour for ABS and NEG(ABS)
This patch can identify 
  ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X
  ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X
  NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X
  NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X
  
and can transform
  ABS(ABS(X)) -> ABS(X)
  NABS(NABS(X)) -> NABS(X)
  
Differential Revision: http://reviews.llvm.org/D3658

llvm-svn: 210312
2014-06-06 06:54:45 +00:00
Bill Schmidt a1184635ce [PPC64LE] Correct vperm -> shuffle transform for little endian
As discussed in cfe commit r210279, the correct little-endian
semantics for the vec_perm Altivec interfaces are implemented by
reversing the order of the input vectors and complementing the permute
control vector.  This converts the desired permute from little endian
element order into the big endian element order that the underlying
PowerPC vperm instruction uses.  This is represented with a
ppc_altivec_vperm intrinsic function.

The instruction combining pass contains code to convert a
ppc_altivec_vperm intrinsic into a vector shuffle operation when the
intrinsic has a permute control vector (mask) that is a constant.
However, the vector shuffle operation assumes that vector elements are
in natural order for their endianness, so for little endian code we
will get the wrong result with the existing transformation.

This patch reverses the semantic change to vec_perm that was performed
in altivec.h by once again swapping the input operands and
complementing the permute control vector, returning the element
ordering to little endian.

The correctness of this code is tested by the new perm.c test added in
a previous patch, and by other tests in the test suite that fail
without this patch.

llvm-svn: 210282
2014-06-05 19:46:04 +00:00
Rafael Espindola 78598d9ab5 Add a Constant version of stripPointerCasts.
Thanks to rnk for the suggestion.

llvm-svn: 210205
2014-06-04 19:01:48 +00:00
Rafael Espindola 4dc5dfc56b Clauses in a landingpad are always Constant. Use a stricter type.
llvm-svn: 210203
2014-06-04 18:51:31 +00:00
Rafael Espindola 04c2258624 InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
    signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
    the other operand has a known-zero bit in a more significant place than it
    (not including the sign bit) the ripple may go up to and fill the zero, but
    won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 210186
2014-06-04 15:39:14 +00:00
Rafael Espindola d1a2c2d905 Add back commit r210029.
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid me misunderstaning it
again in the future.

llvm-svn: 210052
2014-06-02 22:01:04 +00:00
Rafael Espindola 582c890fbe Revert "Add the nsw flag when we detect that an add will not signed overflow."
This reverts commit r210029.

It was not correctly handling cases where LHS and RHS had multiple but different
sign bits.

llvm-svn: 210048
2014-06-02 21:12:19 +00:00
Rafael Espindola 6b04ef785e Added support to optimize comparisons with "lshr exact" of a constant.
Patch by Rahul Jain.

llvm-svn: 210040
2014-06-02 19:19:04 +00:00
Rafael Espindola 82899febf0 Add the nsw flag when we detect that an add will not signed overflow.
We already had a function for checking this, we were just using it only in
specialized cases.

llvm-svn: 210029
2014-06-02 14:32:58 +00:00
Dinesh Dwivedi ce5d35a9d0 Added inst combine tarnsform for (1 << X) & C pattrens where C is (some PowerOf2 - 1)
This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt
  "((1 << X) & 7) == 0" ==> "X > 2"
  "((1 << X) & 7) != 0" ==> "X < 3".

Differential Revision: http://reviews.llvm.org/D3678

llvm-svn: 210007
2014-06-02 07:57:24 +00:00
Dinesh Dwivedi 43e127bded Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Differential Revision: http://reviews.llvm.org/D3777

llvm-svn: 210006
2014-06-02 07:24:36 +00:00
Rafael Espindola c323952cb4 PR19753: Optimize comparisons with "ashr exact" of a constanst.
Patch by suyog sarda.

llvm-svn: 209903
2014-05-30 15:54:32 +00:00
Chandler Carruth fdc0e0b478 And fix my fix to sink down through the type at the right time. My
original fix would actually trigger the *exact* same crasher as the
original bug for a different reason. Awesomesauce.

Working on test cases now, but wanted to get bots healthier.

llvm-svn: 209860
2014-05-29 23:21:12 +00:00
Chandler Carruth 3012a1b4cd Fix one bug in the latest incarnation of r209843 -- combining GEPs
across PHI nodes. The code was computing the Idxs from the 'GEP'
variable's indices when what it wanted was Op1's indices. This caused an
ASan heap-overflow for me that pin pointed the issue when Op1 had more
indices than GEP did. =] I'll let Louis add a specific test case for
this if he wants.

llvm-svn: 209857
2014-05-29 23:05:52 +00:00
Louis Gerbarg c6b506a0ae Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209843
2014-05-29 20:29:47 +00:00
Rafael Espindola a248f536b3 Revert "Revert "Revert "InstCombine: Improvement to check if signed addition overflows."""
This reverts commit r209776.

It was miscompiling llvm::SelectionDAGISel::MorphNode.

llvm-svn: 209817
2014-05-29 14:39:16 +00:00
Rafael Espindola 6196b7430e Revert "Revert "InstCombine: Improvement to check if signed addition overflows.""
This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure

llvm-svn: 209776
2014-05-28 21:43:52 +00:00
Rafael Espindola 910528a3eb Revert "Add support for combining GEPs across PHI nodes"
This reverts commit r209755.

it was the real cause of the libc++ build failure.

llvm-svn: 209775
2014-05-28 21:41:21 +00:00
Rafael Espindola fb59b05ca4 Revert "InstCombine: Improvement to check if signed addition overflows."
This reverts commit r209746.

It looks it is causing a crash while building libcxx. I am trying to get a
reduced testcase.

llvm-svn: 209762
2014-05-28 18:48:10 +00:00
Louis Gerbarg 727f1cbb17 Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209755
2014-05-28 17:38:31 +00:00
Rafael Espindola 085b57941f InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
   signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
   the other operand has a known-zero bit in a more significant place than it
   (not including the sign bit) the ripple may go up to and fill the zero, but
   won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 209746
2014-05-28 15:30:40 +00:00
Filipe Cabecinhas e8d6a1e82f Post-commit fixes for r209643
Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio
Fixed the argument order to select (the mask semantics to blendv* are the
inverse of select) and fixed the tests
Added parenthesis to the assert condition
Ran clang-format

llvm-svn: 209667
2014-05-27 16:54:33 +00:00
Daniel Jasper 73458c95ac Fix bad assert.
llvm-svn: 209648
2014-05-27 09:55:37 +00:00
Filipe Cabecinhas 82ac07c283 Convert some X86 blendv* intrinsics into IR.
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.

This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which give up on
sight of the intrinsic.

Both the transformation and the lowering of its result to asm got shiny
new tests.

The transformation is a bit convoluted because of blendvp[sd]'s
definition:

Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

I will send an email to llvm-dev to discuss if we want to change this or
not.

Reviewers: grosbach, delena, nadav

Differential Revision: http://reviews.llvm.org/D3859

llvm-svn: 209643
2014-05-27 03:42:20 +00:00
Tim Northover 3b0846e8f7 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00
Dinesh Dwivedi f82f16e3e6 Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)'
This removes TODO added in r208849 [http://reviews.llvm.org/D3629]

MIN(MIN(A, 97), 23) -> MIN(A, 23)
MAX(MAX(A, 23), 97) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3785

llvm-svn: 209110
2014-05-19 07:08:32 +00:00
NAKAMURA Takumi 7ef81a4f98 Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes"
It broke clang selfhosting even after r209065.

llvm-svn: 209067
2014-05-17 14:39:21 +00:00
Louis Gerbarg 455805694e Fix for sanitizer crash introduced in r209049
This patch fixes 3 issues introduced by r209049 that only showed up in on
the sanitizer buildbots. One was a typo in a compare. The other is a check to
confirm that the single differing value in the two incoming GEPs is the same
type. The final issue was the the IRBuilder under some circumstances would
build PHIs in the middle of the block.

llvm-svn: 209065
2014-05-17 06:51:36 +00:00
Louis Gerbarg 8d2a43e9be Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

rdar://15547484

llvm-svn: 209049
2014-05-16 23:47:24 +00:00
Dinesh Dwivedi 83c11da849 Reverting r208848, reason: build failure: sanitizer-x86_64-linux-bootstrap/builds/3399
llvm-svn: 208852
2014-05-15 08:22:55 +00:00
Dinesh Dwivedi f675f4201b Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)'
MIN(MIN(A, 23), 97) -> MIN(A, 23)
MAX(MAX(A, 97), 23) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3629

llvm-svn: 208849
2014-05-15 06:13:40 +00:00
Dinesh Dwivedi 837c16097e Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Z3 Verifications code for above transform
http://rise4fun.com/Z3/Pmsh

Differential Revision: http://reviews.llvm.org/D3717

llvm-svn: 208848
2014-05-15 06:01:33 +00:00
David Majnemer 186c94244c InstCombine: Optimize -x s< cst
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.

Reviewers: nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3773

llvm-svn: 208827
2014-05-15 00:02:20 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Serge Pavlov e6de9e39a8 Fix the case when reordering shuffle and binop produces a constant.
This resolves PR19737.

llvm-svn: 208762
2014-05-14 09:05:09 +00:00
Nick Lewycky f0cf8fa941 Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/
llvm-svn: 208750
2014-05-14 03:03:05 +00:00
Serge Pavlov b575ee8294 Fix type of shuffle resulted from shuffle merge.
This fix resolves PR19730.

llvm-svn: 208666
2014-05-13 06:07:21 +00:00
Serge Pavlov 02ff620c7b Fix type of shuffle obtained from reordering with binary operation
In transformation:
    BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef)
type of the undef argument must be same as type of BinOp.

llvm-svn: 208531
2014-05-12 10:11:27 +00:00
Serge Pavlov 0581109708 Fix reordering of shuffles and binary operations
Do not apply transformation:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))

if operands v1 and v2 are of different size.
This change fixes PR19717, which was caused by r208488.
    

llvm-svn: 208518
2014-05-12 05:44:53 +00:00
Serge Pavlov 9ef66a8266 Reorder shuffle and binary operation.
This patch enables transformations:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)

They allow to eliminate extra shuffles in some cases.

Differential Revision: http://reviews.llvm.org/D3525

llvm-svn: 208488
2014-05-11 08:46:12 +00:00
Michael Zolotukhin 292d3caa15 [InstCombine] Some cleanup in optimization of redundant insertvalue instructions.
And one more test added.

llvm-svn: 208355
2014-05-08 19:50:24 +00:00
Chandler Carruth d70cc604af Tidy up whitespace with clang-format prior to making significant
changes.

llvm-svn: 208229
2014-05-07 17:36:59 +00:00
Michael Zolotukhin 7d6293a0d3 [InstCombine] Add optimization of redundant insertvalue instructions.
rdar://problem/11861387

llvm-svn: 208214
2014-05-07 14:30:18 +00:00
Rafael Espindola 85f3610222 Also handle ConstantAggregateZero when optimizing vpermilvar*.
llvm-svn: 207582
2014-04-29 22:20:40 +00:00
Rafael Espindola 152ee213a4 Remove tabs.
Sorry, new machine and I forgot to change the editor setting.

llvm-svn: 207578
2014-04-29 21:02:37 +00:00
Rafael Espindola eb7bdbd0ce Two fixes to the vpermilvar optimization.
The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect.
The _256 variants have indexes into the individual 128 bit lanes and in all
cases it also has to mask out unused bits.

llvm-svn: 207577
2014-04-29 20:41:54 +00:00
Hans Wennborg e36e116826 InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)
llvm-svn: 207426
2014-04-28 17:40:03 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Andrea Di Biagio 8cc9059ce8 [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift
right intrinsics.

A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.

llvm-svn: 207299
2014-04-26 01:03:22 +00:00
Craig Topper f40110f4d8 [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Michael J. Spencer dee4b2c379 [InstCombine][x86] Constant fold psll intrinsics.
This excludes avx512 as I don't have hardware to verify. It excludes _dq
variants because they are represented in the IR as <{2,4} x i64> when it's
actually a byte shift of the entire i{128,265}.

This also excludes _dq_bs as they aren't at all supported by the backend.
There are also no corresponding instructions in the ISA. I have no idea why
they exist...

llvm-svn: 207058
2014-04-24 00:58:18 +00:00
Filipe Cabecinhas 1a80595a2b Optimize some special cases for SSE4a insertqi
Summary:
Since the upper 64 bits of the destination register are undefined when
performing this operation, we can substitute it and let the optimizer
figure out that only a copy is needed.

Also added range merging, if an instruction copies a range that can be
merged with a previous copied range.

Added test cases for both optimizations.

Reviewers: grosbach, nadav

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3357

llvm-svn: 207055
2014-04-24 00:38:14 +00:00
Matt Arsenault 60728177fb Handle addrspacecast when looking at memcpys from globals
llvm-svn: 207054
2014-04-24 00:01:09 +00:00
Matt Arsenault 6b4bed4b83 Remove dead code in instcombine.
Don't replace shifts greater than the type with the maximum shift.

This isn't hit anywhere in the tests, and somewhere else is replacing
these with undef.

llvm-svn: 207000
2014-04-23 16:48:40 +00:00
Chandler Carruth 964daaaf19 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
Rafael Espindola bad3f77703 Simplify a vpermil* with constant mask.
With a constant mask a vpermil* is just a shufflevector. This patch implements
that simplification. This allows us to produce denser code. It should also
allow more folding down the line.

llvm-svn: 206801
2014-04-21 22:06:04 +00:00
Chandler Carruth 5f1f26e891 [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of the
header files and into the cpp files.

These files will require more touches as the header files actually use
DEBUG(). Eventually, I'll have to introduce a matched #define and #undef
of DEBUG_TYPE for the header files, but that comes as step N of many to
clean all of this up.

llvm-svn: 206777
2014-04-21 19:51:41 +00:00
Matt Arsenault fed3dc8dc6 Revert "Revert r206045, "Fix shift by constants for vector.""
Fix cases where the Value itself is used, and not the constant value.

llvm-svn: 206214
2014-04-14 21:50:37 +00:00
NAKAMURA Takumi 58ad0c87f8 Whitespace.
llvm-svn: 206154
2014-04-14 07:03:13 +00:00
NAKAMURA Takumi 26afa982ec Revert r206045, "Fix shift by constants for vector."
It broke some builders, at least, i686.

llvm-svn: 206153
2014-04-14 07:02:57 +00:00
Serge Pavlov b5f3ddc7a1 Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that.
llvm-svn: 206144
2014-04-14 02:20:19 +00:00
Serge Pavlov 4bb54d51c8 Recognize test for overflow in integer multiplication.
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:

    %mul32 = trunc i64 %mul64 to i32
    %zext = zext i32 %mul32 to i64
    %overflow = icmp ne i64 %mul64, %zext
or
    %overflow = icmp ugt i64 %mul64 , 0xffffffff

then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.

Differential Revision: http://llvm-reviews.chandlerc.com/D2814

llvm-svn: 206137
2014-04-13 18:23:41 +00:00
Matt Arsenault 173a1e577c Fix shift by constants for vector.
ashr <N x iM>, <N x iM> M -> undef

llvm-svn: 206045
2014-04-11 17:57:53 +00:00
Eli Bendersky 9966b26dac Fix PR19270 - type mismatch caused by invalid optimization.
Patch by Jingyue Wu.

llvm-svn: 205547
2014-04-03 17:51:58 +00:00
Tim Northover 00ed9964c6 ARM64: initial backend import
This adds a second implementation of the AArch64 architecture to LLVM,
accessible in parallel via the "arm64" triple. The plan over the
coming weeks & months is to merge the two into a single backend,
during which time thorough code review should naturally occur.

Everything will be easier with the target in-tree though, hence this
commit.

llvm-svn: 205090
2014-03-29 10:18:08 +00:00
Erik Verbruggen 5e1bac3a38 Revert "InstCombine: merge constants in both operands of icmp."
This reverts commit r204912, and follow-up commit r204948.

This introduced a performance regression, and the fix is not completely
clear yet.

llvm-svn: 205010
2014-03-28 14:50:57 +00:00
Reid Kleckner 3bdf9bc48b InstCombine: Don't combine constants on unsigned icmps
Fixes a miscompile introduced in r204912.  It would miscompile code like
(unsigned)(a + -49) <= 5U.  The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 49], when
it should not.

llvm-svn: 204948
2014-03-27 17:49:27 +00:00
Erik Verbruggen 59a1219846 InstCombine: merge constants in both operands of icmp.
Transform:
    icmp X+Cst2, Cst
into:
    icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.

llvm-svn: 204912
2014-03-27 11:16:05 +00:00
Richard Osborne 0af4aa9a19 [InstCombine] Don't fold bitcast into store if it would need addrspacecast
Summary:
Previously the code didn't check if the before and after types for the
store were pointers to different address spaces. This resulted in
instcombine using a bitcast to convert between pointers to different
address spaces, causing an assertion due to the invalid cast.

It is not be appropriate to use addrspacecast this case because it is
not guaranteed to be a no-op cast. Instead bail out and do not do the
transformation.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3117

llvm-svn: 204733
2014-03-25 17:21:41 +00:00
Richard Osborne 9805ec457d Reuse earlier variables to make it clear the types involved in the cast.
No functionality change.

llvm-svn: 204732
2014-03-25 17:21:35 +00:00
Owen Anderson 9b8f9c3d95 Fix a bug in InstCombine where we would incorrectly attempt to construct a
bitcast between pointers of two different address spaces if they happened to have
the same pointer size.

llvm-svn: 203862
2014-03-13 22:51:43 +00:00
Chandler Carruth cdf4788401 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Tim Northover fad2761ca0 InstCombine: form shuffles from wider range of insert/extractelements
Sequences of insertelement/extractelements are sometimes used to build
vectorsr; this code tries to put them back together into shuffles, but
could only produce a completely uniform shuffle types (<N x T> from two
<N x T> sources).

This should allow shuffles with different numbers of elements on the
input and output sides as well.

llvm-svn: 203229
2014-03-07 10:24:44 +00:00
Chandler Carruth 7da14f1ab9 [Layering] Move InstVisitor.h into the IR library as it is pretty
obviously coupled to the IR.

llvm-svn: 203064
2014-03-06 03:23:41 +00:00
Craig Topper 3e4c697ca1 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202953
2014-03-05 09:10:37 +00:00
Chandler Carruth 8cd041ef19 [Modules] Move the ConstantRange class into the IR library. This is
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.

llvm-svn: 202838
2014-03-04 12:24:34 +00:00
Chandler Carruth 452a00747e [Modules] Move the TargetFolder into the Analysis library. Historically,
this would have been required because of the use of DataLayout, but that
has moved into the IR proper. It is still required because this folder
uses the constant folding in the analysis library (which uses the
datalayout) as the more aggressive basis of its folder.

llvm-svn: 202832
2014-03-04 11:59:06 +00:00
Chandler Carruth 1305dc3351 [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth 4220e9c154 [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth 820a908df7 [Modules] Move the LLVM IR pattern match header into the IR library, it
obviously is coupled to the IR.

llvm-svn: 202818
2014-03-04 11:08:18 +00:00
Chandler Carruth 219b89b987 [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth 03eb0de93d [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Rafael Espindola 935125126c Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

llvm-svn: 202168
2014-02-25 17:30:31 +00:00
Rafael Espindola aeff8a9c05 Make some DataLayout pointers const.
No functionality change. Just reduces the noise of an upcoming patch.

llvm-svn: 202087
2014-02-24 23:12:18 +00:00
Rafael Espindola 37dc9e19f5 Rename many DataLayout variables from TD to DL.
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.

llvm-svn: 201827
2014-02-21 00:06:31 +00:00
Nick Lewycky 75080ff25d Make sure that value handle users see the transformation of an indirect call to a direct call. This is important for the CallGraph iteration. Patch by Björn Steinbrink!
llvm-svn: 201822
2014-02-20 23:00:15 +00:00
Matt Arsenault aa689f5079 Do more addrspacecast transforms that happen for bitcast.
Makes addrspacecast (gep) do addrspacecast (gep) instead.

llvm-svn: 201376
2014-02-14 00:49:12 +00:00
Benjamin Kramer 920409585f InstCombine: Replace custom constant folding code with ConstantExpr.
llvm-svn: 201352
2014-02-13 18:23:24 +00:00
Owen Anderson 883b5add8e Remove a very old instcombine where we would turn sequences of selects into
logical operations on the i1's driving them.  This is a bad idea for every
target I can think of (confirmed with micro tests on all of: x86-64, ARM,
AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into
a general purpose register, whereas consuming it directly into a select generally
allows it to exist only transiently in a predicate or flags register.

Chandler ran a set of performance tests with this change, and reported no
measurable change on x86-64.

llvm-svn: 201275
2014-02-12 23:54:07 +00:00
Benjamin Kramer 94fc18d040 InstCombine: Teach icmp merging about the equivalence of bit tests and UGE/ULT with a power of 2.
This happens in bitfield code. While there reorganize the existing code
a bit.

llvm-svn: 201176
2014-02-11 21:09:03 +00:00
Paul Robinson af4e64d095 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Reid Kleckner 26af2cae05 Update optimization passes to handle inalloca arguments
Summary:
I searched Transforms/ and Analysis/ for 'ByVal' and updated those call
sites to check for inalloca if appropriate.

I added tests for any change that would allow an optimization to fire on
inalloca.

Reviewers: nlewycky

Differential Revision: http://llvm-reviews.chandlerc.com/D2449

llvm-svn: 200281
2014-01-28 02:38:36 +00:00
Benjamin Kramer 09b0f88a7f InstCombine: Don't try to use aggregate elements of ConstantExprs.
PR18600.

llvm-svn: 200028
2014-01-24 19:02:37 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Owen Anderson 1664dc8973 Fix all the remaining lost-fast-math-flags bugs I've been able to find. The most important of these are cases in the generic logic for combining BinaryOperators.
This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd.

llvm-svn: 199629
2014-01-20 07:44:53 +00:00
Benjamin Kramer b80e1699b3 InstCombine: Modernize a bunch of cast combines.
Also make them vector-aware.

llvm-svn: 199608
2014-01-19 20:05:13 +00:00
Benjamin Kramer 970f4959d4 InstCombine: Hoist 3 copies of AddOne/SubOne into a header.
llvm-svn: 199605
2014-01-19 16:56:10 +00:00
Benjamin Kramer 7a74bd4703 InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with the real thing.
llvm-svn: 199604
2014-01-19 16:48:41 +00:00
Benjamin Kramer 72196f3ae5 InstCombine: Teach most integer add/sub/mul/div combines how to deal with vectors.
llvm-svn: 199602
2014-01-19 15:24:22 +00:00
Benjamin Kramer 76b15d04ff InstCombine: Refactor fmul/fdiv combines to handle vectors.
llvm-svn: 199598
2014-01-19 13:36:27 +00:00
Nick Lewycky a6a17d77d2 Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu!
llvm-svn: 199564
2014-01-18 22:47:12 +00:00
Benjamin Kramer fea9ac99b0 InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle vectors too.
PR18532.

llvm-svn: 199553
2014-01-18 16:43:14 +00:00
Owen Anderson 48b842ef7c Fix more instances of dropped fast math flags when optimizing FADD instructions. All found by inspection (aka grep).
llvm-svn: 199528
2014-01-18 00:48:14 +00:00
Owen Anderson e7321660c1 Fix two cases where we could lose fast math flags when optimizing FADD expressions.
llvm-svn: 199427
2014-01-16 21:26:02 +00:00
Owen Anderson 4557a156e3 Fix an instance where we would drop fast math flags when performing an fdiv to reciprocal multiply transformation.
llvm-svn: 199425
2014-01-16 21:07:52 +00:00
Owen Anderson e8537fc7e0 Fix a bug in InstCombine where we failed to preserve fast math flags when optimizing an FMUL expression.
llvm-svn: 199424
2014-01-16 20:59:41 +00:00
Owen Anderson f74cfe031f Teach InstCombine that (fmul X, -1.0) can be simplified to (fneg X), which LLVM expresses as (fsub -0.0, X).
llvm-svn: 199420
2014-01-16 20:36:42 +00:00
Matt Arsenault 2d353d1a10 Do pointer cast simplifications on addrspacecast
llvm-svn: 199254
2014-01-14 20:00:45 +00:00
Matt Arsenault f08a44f903 Remove a check for an illegal condition.
Bitcasts can't be between address spaces anymore.

llvm-svn: 199253
2014-01-14 19:56:57 +00:00
Hao Liu 26abebbb2c Fix a bug about generating undef operand when optimising shuffle vector and insert element in instruction combine.
llvm-svn: 198730
2014-01-08 03:06:15 +00:00
Kay Tiong Khoo e37d52095e Stay classy (and legal) LLVM. Remove links to 3rd party SMT solver whose links may not be permanent.
llvm-svn: 197713
2013-12-19 18:35:54 +00:00
Kay Tiong Khoo a570b5adb5 Improved fix for PR17827 (instcombine of shift/and/compare).
This change fixes the case of arithmetic shift right - do not attempt to fold that case.
This change also relaxes the conditions when attempting to fold the logical shift right and shift left cases.

No additional IR-level test cases included at this time. See http://llvm.org/bugs/show_bug.cgi?id=17827 for proofs that these are correct transformations.

llvm-svn: 197705
2013-12-19 18:07:17 +00:00
Matt Arsenault bbf18c6958 Fix assert with copy from global through addrspacecast
llvm-svn: 196638
2013-12-07 02:58:45 +00:00
Duncan P. N. Exon Smith ce5f93efd5 Don't use isNullValue to evaluate ConstantExpr
ConstantExpr can evaluate to false even when isNullValue gives false.

Fixes PR18143.

llvm-svn: 196611
2013-12-06 21:48:36 +00:00
Kay Tiong Khoo d7b00cac10 Use local variable for repeated use rather than 'get' method. No functional change intended.
llvm-svn: 196164
2013-12-02 22:23:32 +00:00
Kay Tiong Khoo 64b732005f Move variables to where they are used and give them better names. No functional change intended.
llvm-svn: 196163
2013-12-02 22:20:40 +00:00
Kay Tiong Khoo 564560f911 Rename variables to be consistent (CST -> Cst). No functional change intended.
llvm-svn: 196161
2013-12-02 22:11:56 +00:00
Kay Tiong Khoo 5389f74655 Conservative fix for PR17827 - don't optimize a shift + and + compare sequence where the shift is logical unless the comparison is unsigned
llvm-svn: 196129
2013-12-02 18:43:59 +00:00
Stephen Canon c454964c47 Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)).
llvm-svn: 195934
2013-11-28 21:38:05 +00:00
Hal Finkel 12100bf7e8 Apply the InstCombine fptrunc sqrt optimization to llvm.sqrt
InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls:

  (fptrunc (sqrt (fpext x))) -> (sqrtf x)

but does not apply the same optimization to llvm.sqrt. This is a problem
because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in
fast-math mode, and because this optimization is being applied to sqrt and not
applied to llvm.sqrt, sometimes the fast-math code is slower.

This change makes InstCombine apply this optimization to llvm.sqrt as well.

This fixes the specific problem in PR17758, although the same underlying issue
(optimizations applied to libcalls are not applied to intrinsics) exists for
other optimizations in SimplifyLibCalls.

llvm-svn: 194935
2013-11-16 21:29:08 +00:00
Benjamin Kramer 03f3e248eb InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs.
This is common in bitfield code.

llvm-svn: 194925
2013-11-16 16:00:48 +00:00
Matt Arsenault a9e95abcbf Add instcombine visitor for addrspacecast
llvm-svn: 194786
2013-11-15 05:45:08 +00:00
Nadav Rotem ea186b9515 Update the docs to match the function name.
llvm-svn: 194537
2013-11-13 01:12:01 +00:00
Nadav Rotem 0ed2fdb5af Fold (iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2) if we know that K1 and K2 are 'one-hot' (only one bit is on).
llvm-svn: 194525
2013-11-12 22:38:59 +00:00
Matt Arsenault 243140f2fd Scalarize select vector arguments when extracted.
When the elements are extracted from a select on vectors
or a vector select, do the select on the extracted scalars
from the input if there is only one use.

llvm-svn: 194013
2013-11-04 20:36:06 +00:00
Craig Topper ef9e993eaa Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.
llvm-svn: 192672
2013-10-15 05:20:47 +00:00
Owen Anderson 5797bfd4a3 Pull fptrunc's upwards through selects when one of the select's selectands was a constant. This has a number of benefits, including producing small immediates (easier to materialize, smaller constant pools) as well as being more likely to allow the fptrunc to fuse with a preceding instruction (truncating selects are unusual).
llvm-svn: 191929
2013-10-03 21:08:05 +00:00
Matt Arsenault bfa37e546d Make gep i8* X, -(ptrtoint Y) transform work with address spaces
llvm-svn: 191920
2013-10-03 18:15:57 +00:00
Matt Arsenault 8468062c6e Use right address space size in InstCombineCompares
The test's output doesn't change, but this ensures
this is actually hit with a different address space.

llvm-svn: 191701
2013-09-30 21:11:01 +00:00
Matt Arsenault 06adecabe7 Constant fold ptrtoint + compare with address spaces
llvm-svn: 191699
2013-09-30 21:06:18 +00:00
Benjamin Kramer 6748576a0d InstCombine: Replace manual fast math flag copying with the new IRBuilder RAII helper.
Defines away the issue where cast<Instruction> would fail because constant
folding happened. Also slightly cleaner.

llvm-svn: 191674
2013-09-30 15:39:59 +00:00
Joey Gouly d51a35c6a0 Fix a bug in InstCombine where it attempted to cast a Value* to an Instruction*
when it was actually a Constant*.

There are quite a few other casts to Instruction that might have the same problem,
but this is the only one I have a test case for.

llvm-svn: 191668
2013-09-30 14:18:35 +00:00
Matt Arsenault fa25272db9 Use type helper functions
llvm-svn: 191574
2013-09-27 22:18:51 +00:00
Justin Bogner 4a9ac8cd75 InstCombine: Only foldSelectICmpAndOr for integer types
Currently foldSelectICmpAndOr asserts if the "or" involves a vector
containing several of the same power of two. We can easily avoid this by
only performing the fold on integer types, like foldSelectICmpAnd does.

Fixes <rdar://problem/15012516>

llvm-svn: 191552
2013-09-27 20:35:39 +00:00
Benjamin Kramer 30d249a1b3 Push analysis passes to InstSimplify when they're around anyways.
llvm-svn: 191309
2013-09-24 16:37:40 +00:00
Benjamin Kramer 0e2d162d1e InstCombine: Remove unused argument. No functionality change.
llvm-svn: 191112
2013-09-20 22:12:42 +00:00
Benjamin Kramer e6461e3053 InstCombine: Canonicalize (gep i8* X, -(ptrtoint Y)) to (sub (ptrtoint X), (ptrtoint Y))
The GEP pattern is what SCEV expander emits for "ugly geps". The latter is what
you get for pointer subtraction in C code. The rest of instcombine already
knows how to deal with that so just canonicalize on that.

llvm-svn: 191090
2013-09-20 14:38:44 +00:00
Shuxin Yang 3a7ca6ec87 [Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses.
If "C1/X" were having multiple uses, the only benefit of this
transformation is to potentially shorten critical path. But it is at the
cost of instroducing additional div.

  The additional div may or may not incur cost depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it dosen't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds to be pretty expensive. Let CodeGen to take care 
this transformation.

  This patch sees 6% on a benchmark.

rdar://15032743

llvm-svn: 191037
2013-09-19 21:13:46 +00:00
Benjamin Kramer 0b37cdf9af InstCombine: Don't allow turning vector-of-pointer loads into vector-of-integer.
The code below can't handle any pointers. PR17293.

llvm-svn: 191036
2013-09-19 20:59:04 +00:00
Quentin Colombet 870b662779 Revert the load slicing done in r190870.
To avoid regressions with bitfield optimizations, this slicing should take place
later, like ISel time.

llvm-svn: 190891
2013-09-17 22:01:26 +00:00
Matt Arsenault e6952f28ca Cleanup handling of constant function casts.
Some of this code is no longer necessary since int<->ptr casts are no
longer occur as of r187444.

This also fixes handling vectors of pointers, and adds a bunch of new
testcases for vectors and address spaces.

llvm-svn: 190885
2013-09-17 21:10:14 +00:00
Quentin Colombet b8d672ef5b [InstCombiner] Slice a big load in two loads when the elements are next to each
other in memory.

The motivation was to get rid of truncate and shift right instructions that get
in the way of paired load or floating point load.
E.g.,
Consider the following example:
struct Complex {
  float real;
  float imm;
};

When accessing a complex, llvm was generating a 64-bits load and the imm field
was obtained by a trunc(lshr) sequence, resulting in poor code generation, at
least for x86.

The idea is to declare that two load instructions is the canonical form for
loading two arithmetic type, which are next to each other in memory.

Two scalar loads at a constant offset from each other are pretty
easy to detect for the sorts of passes that like to mess with loads. 

<rdar://problem/14477220>

llvm-svn: 190870
2013-09-17 16:57:34 +00:00
Eli Friedman 77d7fbb924 Get rid of unused isPodLike definitions.
llvm-svn: 190461
2013-09-11 00:36:54 +00:00
Quentin Colombet 5ab555532b [InstCombiner] Expose opportunities to merge subtract and comparison.
Several architectures use the same instruction to perform both a comparison and
a subtract. The instruction selection framework does not allow to consider
different basic blocks to expose such fusion opportunities.

Therefore, these instructions are “merged” by CSE at MI IR level.

To increase the likelihood of CSE to apply in such situation, we reorder the
operands of the comparison, when they have the same complexity, so that they
matches the order of the most frequent subtract.
E.g.,

icmp A, B
...
sub B, A

<rdar://problem/14514580>

llvm-svn: 190352
2013-09-09 20:56:48 +00:00
Matt Arsenault 8227b9f69c Use type helper functions.
llvm-svn: 190113
2013-09-06 00:37:24 +00:00
Matt Arsenault e6db76071c Consistently use dbgs() in debug printing
llvm-svn: 190093
2013-09-05 19:48:28 +00:00
Tim Northover dc647a2603 InstCombine: allow unmasked icmps to be combined with logical ops
"(icmp op i8 A, B)" is equivalent to "(icmp op i8 (A & 0xff), B)" as a
degenerate case. Allowing this as a "masked" comparison when analysing "(icmp)
&/| (icmp)" allows us to combine them in more cases.

rdar://problem/7625728

llvm-svn: 189931
2013-09-04 11:57:17 +00:00
Tim Northover c0756c454c InstCombine: look for masked compares with subset relation
Even in cases which aren't universally optimisable like "(A & B) != 0 && (A &
C) != 0", the masks can make one of the comparisons completely redundant. In
this case, since we've gone to the effort of spotting masked comparisons we
should combine them.

rdar://problem/7625728

llvm-svn: 189930
2013-09-04 11:57:13 +00:00
Matt Arsenault 3dfe54e954 Teach InstCombineLoadCast about address spaces.
This is another one that doesn't matter much,
but uses the right GEP index types in the first
place.

llvm-svn: 189854
2013-09-03 21:05:48 +00:00
Matt Arsenault e38e4cdc46 Use type form of getIntPtrType in alloca visitor.
This doesn't actually matter, since alloca is always
0 address space, but this is more consistent.

llvm-svn: 189853
2013-09-03 21:05:15 +00:00
Benjamin Kramer 010f108382 InstCombine: Check for zero shift amounts before subtracting one causing integer overflow.
PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits
(those are always undef because we can't represent integer types that large).

llvm-svn: 189672
2013-08-30 14:35:35 +00:00