Commit Graph

7700 Commits

Author SHA1 Message Date
Benjamin Kramer 5b7a4e0195 Move "A | ~(A & ?) -> -1" from InstCombine to InstructionSimplify.
llvm-svn: 126082
2011-02-20 15:20:01 +00:00
Benjamin Kramer d5d7f37beb InstCombine: Add a bunch of combines of the form x | (y ^ z).
We usually catch this kind of optimization through InstSimplify's distributive
magic, but or doesn't distribute over xor in general.

"A | ~(A | B) -> A | ~B" hits 24 times on gcc.c.

llvm-svn: 126081
2011-02-20 13:23:43 +00:00
Nick Lewycky c8a1569950 Teach RecursivelyDeleteDeadPHINodes to handle multiple self-references. Patch
by Andrew Clinton!

llvm-svn: 126077
2011-02-20 08:38:20 +00:00
Nick Lewycky 080ea93779 Instead of keeping two Value*->id# mappings, keep one Value->Value mapping and
one Value set. This is faster because we only need to use the set when there
isn't already an entry in the map. No functionality change!

llvm-svn: 126076
2011-02-20 08:11:03 +00:00
Eli Friedman ef200db4fd PR9218: SimplifyDemandedVectorElts can return a non-null value that is not
the instruction passed in.  Make sure to account for this correctly, instead
of looping infinitely.

llvm-svn: 126058
2011-02-19 22:42:40 +00:00
Chris Lattner 72a35fb974 rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte
constant, including globals.  This makes us generate much more "pretty" pattern
globals as well because it doesn't break it down to an array of bytes all the
time.

This enables us to handle stores of relocatable globals.  This kicks in about
48 times in 254.gap, giving us stuff like this:

@.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct
.TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16

...
  call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8*
), i64 %tmp75) nounwind

llvm-svn: 126044
2011-02-19 19:56:44 +00:00
Chris Lattner 0f4a64011e Implement rdar://9009151, transforming strided loop stores of
unsplatable values into memset_pattern16 when it is available
(recent darwins).  This transforms lots of strided loop stores
of ints for example, like 5 in vpr:

  Formed memset:   call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
    from store to: {%3,+,4}<%11> at:   store i32 3, i32* %scevgep, align 4, !tbaa !4

llvm-svn: 126040
2011-02-19 19:31:39 +00:00
Chris Lattner e6b261fec5 Make loop-idiom use TargetLibraryInfo to determine whether it is allowed
to hack on memset, memcpy etc.

llvm-svn: 125974
2011-02-18 22:22:15 +00:00
Oscar Fuentes 5ed962656c Move library stuff out of the toplevel CMakeLists.txt file.
llvm-svn: 125968
2011-02-18 22:06:14 +00:00
Duncan Sands 84653b3674 Add some transforms of the kind X-Y>X -> 0>Y which are valid when there is no
overflow.  These subsume some existing equality transforms, so zap those.

llvm-svn: 125843
2011-02-18 16:25:37 +00:00
Chris Lattner 1a924e770a prevent jump threading from merging blocks when their address is
taken (and used!).  This prevents merging the blocks (invalidating
the block addresses) in a case like this:

#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

void foo() {
  printf("%p\n", _THIS_IP_);
  printf("%p\n", _THIS_IP_);
  printf("%p\n", _THIS_IP_);
}

which fixes PR4151.

llvm-svn: 125829
2011-02-18 04:43:06 +00:00
Chris Lattner 4a14fbc50c Don't unroll loops whose header block's address is taken.
This is part of a futile attempt to not "break" bizzaro
code like this:

 l1:
  printf("l1: %p\n", &&l1);
  ++x;

  if( x < 3 ) goto l1;


Previously we'd fold &&l1 to 1, which is fine per our semantics
but not helpful to the user.

llvm-svn: 125827
2011-02-18 04:25:21 +00:00
Chris Lattner a8fed47eed have instcombine preserve nsw/nuw/exact when sinking
common operations through a phi. 

llvm-svn: 125790
2011-02-17 23:01:49 +00:00
Chris Lattner 75ae5a45ff fix typo
llvm-svn: 125787
2011-02-17 22:32:54 +00:00
Chris Lattner abb8eb2c63 fix instcombine merging GEPs through a PHI to only make the
result inbounds if all of the inputs are inbounds.

llvm-svn: 125785
2011-02-17 22:21:26 +00:00
Chris Lattner d406764d52 add is always integer, thanks to Frits for noticing this.
llvm-svn: 125774
2011-02-17 20:55:29 +00:00
Duncan Sands e522001171 Transform "A + B >= A + C" into "B >= C" if the adds do not wrap. Likewise for some
variations (some of these were already present so I unified the code).  Spotted by my
auto-simplifier as occurring a lot.

llvm-svn: 125734
2011-02-17 07:46:37 +00:00
Chris Lattner 5592071768 preserve NUW/NSW when transforming add x,x
llvm-svn: 125711
2011-02-17 02:23:02 +00:00
Chris Lattner 3eb0af94c4 fix PR9215, preventing -reassociate from clearing nsw/nuw when
it swaps the LHS/RHS of a single binop.

llvm-svn: 125700
2011-02-17 01:29:24 +00:00
Duncan Sands 75b5d27b84 Spelling fix: consequtive -> consecutive.
llvm-svn: 125563
2011-02-15 09:23:02 +00:00
Nadav Rotem 67d67a0385 Fix 9216 - Endless loop in InstCombine pass.
The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is
actually a xor -1.

llvm-svn: 125557
2011-02-15 07:13:48 +00:00
Devang Patel 8d53ac81ec Do not forget DebugLoc!
llvm-svn: 125547
2011-02-15 02:02:30 +00:00
Chris Lattner 9f0ac0dd8b tidy up a bit.
llvm-svn: 125546
2011-02-15 01:56:08 +00:00
Chris Lattner 69229316aa convert ConstantVector::get to use ArrayRef.
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Devang Patel 3058398655 Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" a value that is modified inside loop.
llvm-svn: 125529
2011-02-14 23:03:23 +00:00
Chris Lattner 34442e6ebf revert my ConstantVector patch, it seems to have made the llvm-gcc
builders unhappy.

llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner d9f5b88548 Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.

llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner 9bd7fdff58 remove a now-unneccesary cast.
llvm-svn: 125464
2011-02-13 18:30:09 +00:00
Chris Lattner 43273affb9 implement instcombine folding for things like (x >> c) < 42.
We were previously simplifying divisions, but not right shifts!

llvm-svn: 125454
2011-02-13 08:07:21 +00:00
Chris Lattner d369f575d7 refactor some code out into a helper method.
llvm-svn: 125451
2011-02-13 07:43:07 +00:00
Daniel Dunbar 210ce0feb5 SimplifyLibCalls: Add missing legalize check on various printf to puts and
putchar transforms, their return values are not compatible.

llvm-svn: 125442
2011-02-12 18:19:57 +00:00
Benjamin Kramer 1800d823de Also fold (A+B) == A -> B == 0 when the add is commuted.
llvm-svn: 125411
2011-02-11 21:46:48 +00:00
Chris Lattner d3c0e05f51 When lowering an inbounds gep, the intermediate adds can have
unsigned overflow (e.g. due to a negative array index), but
the scales on array size multiplications are known to not
sign wrap.

llvm-svn: 125409
2011-02-11 21:37:43 +00:00
Cameron Zwarich 99de19b3cb Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about
a loop when unswitching it. It only does this in the complex case, because
everything should be fine already in the simple case.

llvm-svn: 125369
2011-02-11 06:08:28 +00:00
Cameron Zwarich 25cb63c791 LoopInstSimplify preserves ScalarEvolution.
llvm-svn: 125368
2011-02-11 06:08:25 +00:00
Cameron Zwarich 97dae4d361 If we can't avoid running loop-simplify twice for now, at least avoid running
iv-users twice.

llvm-svn: 125318
2011-02-10 23:53:14 +00:00
Cameron Zwarich d8e66038f4 Rename 'loopsimplify' to 'loop-simplify'.
llvm-svn: 125317
2011-02-10 23:38:10 +00:00
Chris Lattner d86ded17ad implement the first part of PR8882: when lowering an inbounds
gep to explicit addressing, we know that none of the intermediate
computation overflows.

This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a 
negative index?

Previously the testcase would instcombine to:

define i1 @test(i64 %i) {
  %p1.idx.mask = and i64 %i, 4611686018427387903
  %cmp = icmp eq i64 %p1.idx.mask, 1000
  ret i1 %cmp
}

now we get:

define i1 @test(i64 %i) {
  %cmp = icmp eq i64 %i, 1000
  ret i1 %cmp
}

llvm-svn: 125271
2011-02-10 07:11:16 +00:00
Chris Lattner 6b657aed33 Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267
2011-02-10 05:36:31 +00:00
Chris Lattner 98457101fc Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts.  For example, these (which
are the same except the first is 'exact' sdiv:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %A = sdiv exact i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %A = sdiv i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

compile down to:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %1 = icmp eq i64 %X, 0
  ret i1 %1
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %X.off = add i64 %X, 4
  %1 = icmp ult i64 %X.off, 9
  ret i1 %1
}

This happens when you do something like:
  (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266
2011-02-10 05:23:05 +00:00
Chris Lattner dcef03fba2 more cleanups, notably bitcast isn't used for "signed to unsigned type
conversions". :)

llvm-svn: 125265
2011-02-10 05:17:27 +00:00
Chris Lattner 7d0e43ff8b A bunch of cleanups and simplifications using the new PatternMatch predicates
and generally tidying things up.  Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.

 InstCombineAddSub.cpp |  296 +++++++++++++++++++++-----------------------------
 1 file changed, 126 insertions(+), 170 deletions(-)

llvm-svn: 125264
2011-02-10 05:14:58 +00:00
Chris Lattner 768003c59e teach SimplifyDemandedBits that exact shifts demand the bits they
are shifting out since they do require them to be zeros.  Similarly
for NUW/NSW bits of shl

llvm-svn: 125263
2011-02-10 05:09:34 +00:00
Eric Christopher da6bd45088 Revert this in an attempt to bring the builders back.
llvm-svn: 125257
2011-02-10 01:48:24 +00:00
Cameron Zwarich 58c8670ab2 Turn this pass ordering:
Natural Loop Information
 Loop Pass Manager
   Canonicalize natural loops
 Scalar Evolution Analysis
 Loop Pass Manager
   Induction Variable Users
   Canonicalize natural loops
   Induction Variable Users
   Loop Strength Reduction

into this:

Scalar Evolution Analysis
Loop Pass Manager
  Canonicalize natural loops
  Induction Variable Users
  Loop Strength Reduction

This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of
thing automatically, but it seems easier to just change the ordering of the
passes if this is the only case.

llvm-svn: 125254
2011-02-10 01:07:54 +00:00
Chris Lattner 9e4aa0259f Teach instsimplify some tricks about exact/nuw/nsw shifts.
improve interfaces to instsimplify to take this info.

llvm-svn: 125196
2011-02-09 17:15:04 +00:00
Chris Lattner b940091388 Rework InstrTypes.h so to reduce the repetition around the NSW/NUW/Exact
versions of creation functions.  Eventually, the "insertion point" versions
of these should just be removed, we do have IRBuilder afterall.

Do a massive rewrite of much of pattern match.  It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches 
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.

Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRbuilder by having these methods chain to
each other more instead of duplicating code.

llvm-svn: 125194
2011-02-09 17:00:45 +00:00
Nick Lewycky 292e78c3cd When removing a function from the function set and adding it to deferred, we
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.

Also add some debugging statements.

llvm-svn: 125180
2011-02-09 06:32:02 +00:00
Dan Gohman de7f699754 Don't split any loop backedges, including backedges of loops other than
the active loop. This is generally desirable, and it avoids trouble
in situations such as the testcase in PR9123, though the failure
mode depends on use-list order, so it is infeasible to test.

llvm-svn: 125065
2011-02-08 00:55:13 +00:00
Benjamin Kramer 8d6a8c130b SimplifyCFG: Track the number of used icmps when turning a icmp chain into a switch. If we used only one icmp, don't turn it into a switch.
Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler.

llvm-svn: 125056
2011-02-07 22:37:28 +00:00