Commit Graph

76 Commits

Author SHA1 Message Date
Benjamin Kramer 03f3e248eb InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs.
This is common in bitfield code.

llvm-svn: 194925
2013-11-16 16:00:48 +00:00
Quentin Colombet 5ab555532b [InstCombiner] Expose opportunities to merge subtract and comparison.
Several architectures use the same instruction to perform both a comparison and
a subtract. The instruction selection framework does not allow to consider
different basic blocks to expose such fusion opportunities.

Therefore, these instructions are “merged” by CSE at MI IR level.

To increase the likelihood of CSE to apply in such situation, we reorder the
operands of the comparison, when they have the same complexity, so that they
matches the order of the most frequent subtract.
E.g.,

icmp A, B
...
sub B, A

<rdar://problem/14514580>

llvm-svn: 190352
2013-09-09 20:56:48 +00:00
Matt Arsenault 745101d666 Teach InstCombine about address spaces
llvm-svn: 188926
2013-08-21 19:53:10 +00:00
Stephen Lin a76289aa1b Catch more CHECK that can be converted to CHECK-LABEL in Transforms for easier debugging. No functionality change.
This conversion was done with the following bash script:

  find test/Transforms -name "*.ll" | \
  while read NAME; do
    echo "$NAME"
    if ! grep -q "^; *RUN: *llc" $NAME; then
      TEMP=`mktemp -t temp`
      cp $NAME $TEMP
      sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
      while read FUNC; do
        sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)define\([^@]*\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3define\4@$FUNC(/g" $TEMP
      done
      mv $TEMP $NAME
    fi
  done

llvm-svn: 186269
2013-07-14 01:50:49 +00:00
Stephen Lin c1c7a1309c Update Transforms tests to use CHECK-LABEL for easier debugging. No functionality change.
This update was done with the following bash script:

  find test/Transforms -name "*.ll" | \
  while read NAME; do
    echo "$NAME"
    if ! grep -q "^; *RUN: *llc" $NAME; then
      TEMP=`mktemp -t temp`
      cp $NAME $TEMP
      sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
      while read FUNC; do
        sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP
      done
      mv $TEMP $NAME
    fi
  done

llvm-svn: 186268
2013-07-14 01:42:54 +00:00
David Majnemer 72d76275ac InstCombine: variations on 0xffffffff - x >= 4
The following transforms are valid if -C is a power of 2:
(icmp ugt (xor X, C), ~C) -> (icmp ult X, C)
(icmp ult (xor X, C), -C) -> (icmp uge X, C)

These are nice, they get rid of the xor.

llvm-svn: 185915
2013-07-09 09:20:58 +00:00
David Majnemer 414d4e58aa InstCombine: X & -C != -C -> X <= u ~C
Tests were added in r185910 somehow.

llvm-svn: 185912
2013-07-09 08:09:32 +00:00
David Majnemer bafa537eb7 Commit r185909 was a misapplied patch, fix it
llvm-svn: 185910
2013-07-09 07:58:32 +00:00
David Majnemer f2a9a513c7 InstCombine: add more transforms
C1-X <u C2 -> (X|(C2-1)) == C1
C1-X >u C2 -> (X|C2) == C1
X-C1 <u C2 -> (X & -C2) == C1
X-C1 >u C2 -> (X & ~C2) == C1

llvm-svn: 185909
2013-07-09 07:50:59 +00:00
David Majnemer fa90a0b325 InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1
Back in r179493 we determined that two transforms collided with each
other.  The fix back then was to reorder the transforms so that the
preferred transform would give it a try and then we would try the
secondary transform.  However, it was noted that the best approach would
canonicalize one transform into the other, removing the collision and
allowing us to optimize IR given to us in that form.

llvm-svn: 185808
2013-07-08 11:53:08 +00:00
David Majnemer 69430609ff InstCombine: typo in or_icmp_eq_B_0_icmp_ult_A_B test
llvm-svn: 185737
2013-07-06 00:54:07 +00:00
David Majnemer c2a990bc00 InstCombine: (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1)
This transform allows us to turn IR that looks like:
  %1 = icmp eq i64 %b, 0
  %2 = icmp ult i64 %a, %b
  %3 = or i1 %1, %2
  ret i1 %3

into:
  %0 = add i64 %b, -1
  %1 = icmp uge i64 %0, %a
  ret i1 %1

which means we go from lowering:
        cmpq    %rsi, %rdi
        setb    %cl
        testq   %rsi, %rsi
        sete    %al
        orb     %cl, %al
        ret

to lowering:
        decq    %rsi
        cmpq    %rdi, %rsi
        setae   %al
        ret

llvm-svn: 185677
2013-07-05 00:31:17 +00:00
David Majnemer b889e405eb InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2)
We may, after other optimizations, find ourselves with IR that looks
like:

  %shl = shl i32 1, %y
  %cmp = icmp ult i32 %shl, 32

Instead, we should just compare the shift count:

  %cmp = icmp ult i32 %y, 5

llvm-svn: 185242
2013-06-28 23:42:03 +00:00
Rafael Espindola 932470bcd9 Add a testcase from pr16244.
llvm-svn: 183433
2013-06-06 19:15:23 +00:00
Benjamin Kramer 21b972ae94 InstCombine: Don't just copy known bits from the first operand of an srem.
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.

llvm-svn: 181518
2013-05-09 16:32:32 +00:00
David Majnemer 1fae195557 Reorders two transforms that collide with each other
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224

llvm-svn: 179493
2013-04-14 21:15:43 +00:00
David Majnemer 1a08accbb7 Simplify (A & ~B) in icmp if A is a power of 2
The transform will execute like so:
(A & ~B) == 0 --> (A & B) != 0
(A & ~B) != 0 --> (A & B) == 0

llvm-svn: 179386
2013-04-12 17:25:07 +00:00
David Majnemer b81cd63c4b Optimize icmp involving addition better
Allows LLVM to optimize sequences like the following:

%add = add nsw i32 %x, 1
%cmp = icmp sgt i32 %add, %y

into:

%cmp = icmp sge i32 %x, %y

as well as:

%add1 = add nsw i32 %x, 20
%add2 = add nsw i32 %y, 57
%cmp = icmp sge i32 %add1, %add2

into:

%add = add nsw i32 %y, 37
%cmp = icmp sle i32 %cmp, %x

llvm-svn: 179316
2013-04-11 20:05:46 +00:00
Arnaud A. de Grandmaison 3ee88e8a77 Address issues found by Duncan during post-commit review of r177856.
llvm-svn: 177863
2013-03-25 11:47:38 +00:00
Arnaud A. de Grandmaison 9c383d68cf InstCombine: simplify comparisons to zero of (shl %x, Cst) or (mul %x, Cst)
This simplification happens at 2 places :
 - using the nsw attribute when the shl / mul is used by a sign test
 - when the shl / mul is compared for (in)equality to zero

llvm-svn: 177856
2013-03-25 09:48:49 +00:00
Arnaud A. de Grandmaison 61c167c62b Teach InstCombine to work with smaller legal types in icmp (shl %v, C1), C2
It enables to work with a smaller constant, which is target friendly for those which can compare to immediates.
It also avoids inserting a shift in favor of a trunc, which can be free on some targets.

This used to work until LLVM-3.1, but regressed with the 3.2 release.

llvm-svn: 175270
2013-02-15 14:35:47 +00:00
Jakub Staszak c48bbe7170 Add extra CHECK to make sure that 'or' instruction was replaced.
Also add an assert to avoid confusion in the code where is known that C1 <= C2.

llvm-svn: 171310
2012-12-31 18:26:42 +00:00
Jakub Staszak ea2b9b9d67 Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1
if C1 and C2 differ only with one bit.
Fixes PR14708.

llvm-svn: 171270
2012-12-31 00:34:55 +00:00
Paul Redmond 5917f4c715 Transform (x&C)>V into (x&C)!=0 where possible
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel

llvm-svn: 170576
2012-12-19 19:47:13 +00:00
NAKAMURA Takumi 38d2b2442f Revert r170020, "Simplify negated bit test", for now.
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.

This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.

llvm-svn: 170128
2012-12-13 14:28:16 +00:00
David Majnemer 5226aa94ce Simplify negated bit test
llvm-svn: 170020
2012-12-12 20:48:54 +00:00
Duncan Sands 1d3acddf0e Fix PR14361: wrong simplification of A+B==B+A. You may think that the old logic
replaced by this patch is equivalent to the new logic, but you'd be wrong, and
that's exactly where the bug was.  There's a similar bug in instsimplify which
manifests itself as instsimplify failing to simplify this, rather than doing it
wrong, see next commit.

llvm-svn: 168181
2012-11-16 18:55:49 +00:00
Benjamin Kramer 8b8a76974f InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.

stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us

where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.

llvm-svn: 158297
2012-06-10 20:35:00 +00:00
Nick Lewycky 3db143ea8c Reinstate the optimization from r151449 with a fix to not turn 'gep %x' into
'gep null' when the icmp predicate is unsigned (or is signed without inbounds).

llvm-svn: 151467
2012-02-26 02:09:49 +00:00
Nick Lewycky 7bbd72da46 Roll these back to r151448 until I figure out how they're breaking
MultiSource/Applications/lua.

llvm-svn: 151463
2012-02-25 23:01:19 +00:00
Nick Lewycky 51f2be8bff Teach instsimplify to be more aggressive when analyzing comparisons of pointers
by using llvm::isIdentifiedObject. Also teach it to handle GEPs that have
the same base pointer and constant operands. Fixes PR11238!

llvm-svn: 151449
2012-02-25 19:07:42 +00:00
Benjamin Kramer 6ee8690aa5 InstCombine: Don't transform a signed icmp of two GEPs into a signed compare of the indices.
This transformation is not safe in some pathological cases (signed icmp of pointers should be an
extremely rare thing, but it's valid IR!). Add an explanatory comment.

Kudos to Duncan for pointing out this edge case (and not giving up explaining it until I finally got it).

llvm-svn: 151055
2012-02-21 13:31:09 +00:00
Benjamin Kramer a00c5c451a Test case for r150978.
llvm-svn: 150979
2012-02-20 19:00:28 +00:00
Benjamin Kramer 7adb189538 InstCombine: When comparing two GEPs that were derived from the same base pointer but use different types, expand the offset calculation and to the compare on the offset if profitable.
This came up in SmallVector code.

llvm-svn: 150962
2012-02-20 15:07:47 +00:00
Rafael Espindola bb893fea6b Add r149110 back with a fix for when the vector and the int have the same
width.

llvm-svn: 149151
2012-01-27 23:33:07 +00:00
Rafael Espindola a4062624d1 Revert r149110 and add a testcase that was crashing since that revision.
Unfortunately I also had to disable constant-pool-sharing.ll the code it tests has been
updated to use the IL logic.

llvm-svn: 149148
2012-01-27 22:42:48 +00:00
Chris Lattner 111d6ee655 enhance constant folding to be able to constant fold bitcast of
ConstantVector's to integer type.

llvm-svn: 149110
2012-01-27 01:44:03 +00:00
Benjamin Kramer aca1885695 FileCheck hygiene.
llvm-svn: 147580
2012-01-05 00:43:34 +00:00
Pete Cooper fdddc27143 Improved fix for abs(val) != 0 to check other similar case. Also fixed style issues and confusing comment
llvm-svn: 145618
2011-12-01 19:13:26 +00:00
Pete Cooper 3b7f35bf08 Removed use of grep from test and moved it to be with other icmp tests
llvm-svn: 145570
2011-12-01 04:35:26 +00:00
Benjamin Kramer 9eca5feff1 PR10267: Don't combine an equality compare with an AND into an inequality compare when the AND has more than one use.
This can pessimize code, inequalities are generally more expensive.

llvm-svn: 134379
2011-07-04 20:16:36 +00:00
Benjamin Kramer c970849ea0 InstCombine: Fold A-b == C --> b == A-C if A and C are constants.
The backend already knew this trick.

llvm-svn: 132915
2011-06-13 15:24:24 +00:00
Benjamin Kramer 91f914ce21 InstCombine: Shrink ((zext X) & C1) == C2 to fold away the cast if the "zext" and the "and" have one use.
llvm-svn: 132897
2011-06-12 22:48:00 +00:00
Eli Friedman 8a20e66926 PR9838: Fix transform introduced in r127064 to not trigger when only one side of the icmp is an exact shift.
llvm-svn: 130954
2011-05-05 21:59:18 +00:00
Chris Lattner 1b06c71668 Transform: "icmp eq (trunc (lshr(X, cst1)), cst" to "icmp (and X, mask), cst"
when X has multiple uses.  This is useful for exposing secondary optimizations,
but the X86 backend isn't ready for this when X has a single use.  For example,
this can disable load folding.

This is inching towards resolving PR6627.

llvm-svn: 130238
2011-04-26 20:18:20 +00:00
Benjamin Kramer 1885d21700 Fix mistyped CHECK lines.
llvm-svn: 127366
2011-03-09 22:07:31 +00:00
Nick Lewycky ac55c79dd6 Tweak this test. We can analyze what happens and show that we still do the
right thing, instead of merely being unable to analyze and the transform
doesn't occur.

llvm-svn: 127149
2011-03-07 02:10:18 +00:00
Nick Lewycky e467979d0a Add more analysis of the sign bit of an srem instruction. If the LHS is negative
then the result could go either way. If it's provably positive then so is the
srem. Fixes PR9343 #7!

llvm-svn: 127146
2011-03-07 01:50:10 +00:00
Nick Lewycky 92db8e8e39 ConstantInt has some getters which return ConstantInt's or ConstantVector's of
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!

llvm-svn: 127116
2011-03-06 03:36:19 +00:00
Nick Lewycky 9719a719c7 Thread comparisons over udiv/sdiv/ashr/lshr exact and lshr nuw/nsw whenever
possible. This goes into instcombine and instsimplify because instsimplify
doesn't need to check hasOneUse since it returns (almost exclusively) constants.

This fixes PR9343 #4 #5 and #8!

llvm-svn: 127064
2011-03-05 05:19:11 +00:00