Commit Graph

7891 Commits

Author SHA1 Message Date
Cameron Zwarich 47e7175fe9 Check for TLI so that -codegenprepare can be used from opt.
llvm-svn: 128194
2011-03-24 04:51:51 +00:00
Cameron Zwarich 10ebc189ee Fix PR9464 by correcting some math that just happened to be right in most cases
that were hit in practice.

llvm-svn: 128146
2011-03-23 05:25:55 +00:00
Anders Carlsson 1cc8073bb3 Handle another case that Frits suggested.
llvm-svn: 128068
2011-03-22 03:21:01 +00:00
Devang Patel 17bbd7f495 Simplify.
llvm-svn: 128030
2011-03-21 22:04:45 +00:00
Anders Carlsson 4dd420f193 More cleanups to the OptimizeEmptyGlobalCXXDtors GlobalOpt function.
llvm-svn: 127997
2011-03-21 14:54:40 +00:00
Anders Carlsson 701822a48e As suggested by Nick Lewycky, ignore debugging intrinsics when trying to decide whether a destructor is empty or not.
llvm-svn: 127985
2011-03-21 02:42:27 +00:00
Nick Lewycky d078183725 Fix comments
llvm-svn: 127984
2011-03-21 02:26:01 +00:00
Evan Cheng 0663f23bd8 Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified.
llvm-svn: 127981
2011-03-21 01:19:09 +00:00
Anders Carlsson 336fd90f4d Don't try to eliminate invokes to __cxa_atexit.
llvm-svn: 127976
2011-03-20 20:21:33 +00:00
Anders Carlsson fcec2f519a Don't segfault on mutual recursion, as pointed out by Frits.
llvm-svn: 127975
2011-03-20 20:16:43 +00:00
Anders Carlsson 48a44911d3 Address comments from Frits van Bommel.
llvm-svn: 127974
2011-03-20 19:51:13 +00:00
Anders Carlsson ee6bc70d2f Add an optimization to GlobalOpt that eliminates calls to __cxa_atexit, if the function passed is empty.
llvm-svn: 127970
2011-03-20 17:59:11 +00:00
Daniel Dunbar 327cd36f74 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

llvm-svn: 127954
2011-03-19 21:47:14 +00:00
Evan Cheng 824a711305 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433

llvm-svn: 127953
2011-03-19 17:17:39 +00:00
Devang Patel 2c7ee2700c If an AllocaInst referred by DbgDeclareInst is used by a LoadInst then the LoadInst should also get a corresponding llvm.dbg.value intrinsic.
llvm-svn: 127924
2011-03-18 23:45:43 +00:00
Devang Patel 3ac171d49a Remove dead code.
llvm-svn: 127923
2011-03-18 23:33:58 +00:00
Devang Patel c1431e6e84 Consider debug info intrinsics pointing to null value as dead instructions.
llvm-svn: 127922
2011-03-18 23:28:02 +00:00
Andrew Trick f8f67f0188 Remove TargetData and ValueTracking includes. I didn't mean for them to sneak in my last checkin.
llvm-svn: 127842
2011-03-18 00:36:39 +00:00
Andrew Trick 87716c93c2 Added isValidRewrite() to check the result of ScalarEvolutionExpander.
SCEV may generate expressions composed of multiple pointers, which can
lead to invalid GEP expansion. Until we can teach SCEV to follow strict
pointer rules, make sure no bad GEPs creep into IR.
Fixes rdar://problem/9038671.

llvm-svn: 127839
2011-03-17 23:51:11 +00:00
Andrew Trick e44f0d94f6 whitespace
llvm-svn: 127837
2011-03-17 23:46:48 +00:00
Devang Patel aad34d882d Try to not lose variable's debug info during instcombine.
This is done by lowering dbg.declare intrinsic into dbg.value intrinsic.
Radar 9143931.

llvm-svn: 127834
2011-03-17 22:18:16 +00:00
Devang Patel 8c0b16b0aa Refactor into a separate utility function.
llvm-svn: 127832
2011-03-17 21:58:19 +00:00
Cameron Zwarich 7599b106b7 Fix a comment.
llvm-svn: 127728
2011-03-16 08:13:42 +00:00
Cameron Zwarich 0454253d7a Only convert allocas to scalars if it is profitable. The profitability metric I
chose is having a non-memcpy/memset use and being larger than any native integer
type. Originally I chose having an access of a size smaller than the total size
of the alloca, but this caused some minor issues on the spirit benchmark where
SRoA runs again after some inlining.

This fixes <rdar://problem/8613163>.

llvm-svn: 127718
2011-03-16 00:13:44 +00:00
Cameron Zwarich b51c830f7c Better use initializer lists.
llvm-svn: 127716
2011-03-16 00:13:37 +00:00
Cameron Zwarich 63062ccf85 Add a clarifying comment.
llvm-svn: 127715
2011-03-16 00:13:35 +00:00
Cameron Zwarich dbb27393cc Clean up something noticed by Fritz.
llvm-svn: 127684
2011-03-15 18:42:33 +00:00
Cameron Zwarich 0b8cdfb6ec Do not add PHIs with no users when creating LCSSA form. Patch by Andrew Clinton.
llvm-svn: 127674
2011-03-15 07:41:25 +00:00
Eli Friedman c4414c6e92 PR9450: Make switch optimization in SimplifyCFG not dependent on the ordering
of pointers in an std::map.

llvm-svn: 127650
2011-03-15 02:23:35 +00:00
Eric Christopher 2139d3148f If we don't know how long a string is we can't fold an _chk version to the
normal version.

Fixes rdar://9123638

llvm-svn: 127636
2011-03-15 00:25:41 +00:00
Andrew Trick 8b55b736b1 Added SCEV::NoWrapFlags to manage unsigned, signed, and self wrap
properties.
Added the self-wrap flag for SCEV::AddRecExpr.
A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag
without changing behavior in this revision.

llvm-svn: 127590
2011-03-14 16:50:06 +00:00
Andrew Trick 328b223bb1 whitespace
llvm-svn: 127589
2011-03-14 16:48:10 +00:00
Jin-Gu Kang b452db02f0 This case is solved by Scalar Replacement of Aggregates (DT) and
Early CSE pass so this patch reverts it to original source code.

llvm-svn: 127574
2011-03-14 01:21:00 +00:00
Jin-Gu Kang b7538c71e1 Add comment as following:
load and store reference same memory location, the memory location
is represented by getelementptr with two uses (load and store) and
the getelementptr's base is alloca with single use. At this point,
instructions from alloca to store can be removed.
(this pattern is generated when bitfield is accessed.)
For example,
%u = alloca %struct.test, align 4               ; [#uses=1]
%0 = getelementptr inbounds %struct.test* %u, i32 0, i32 0;[#uses=2]
%1 = load i8* %0, align 4                       ; [#uses=1]
%2 = and i8 %1, -16                             ; [#uses=1]
%3 = or i8 %2, 5                                ; [#uses=1]
store i8 %3, i8* %0, align 4

llvm-svn: 127565
2011-03-13 14:05:51 +00:00
Jin-Gu Kang 2e939f7c3c This patch removes some of useless instructions generated by bitfield access.
llvm-svn: 127539
2011-03-12 12:18:44 +00:00
Cameron Zwarich 338d362200 Roll r127459 back in:
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

llvm-svn: 127498
2011-03-11 21:52:04 +00:00
Daniel Dunbar 94ccb27b43 Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get
created from the", it broke some GCC test suite tests.

llvm-svn: 127477
2011-03-11 19:30:30 +00:00
Benjamin Kramer 51897bcd3e InstCombine: Fix a thinko where transform an icmp under the assumption that it's a zero comparison when it's not.
Fixes PR9454.

llvm-svn: 127464
2011-03-11 11:37:40 +00:00
Cameron Zwarich cc27b3acc4 Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

llvm-svn: 127459
2011-03-11 04:54:27 +00:00
Dan Gohman affbc66f60 RecursivelyDeleteTriviallyDeadInstructions only needs a
Value, not an Instruction, so casting is not necessary. Also,
it's theoretically possible that the Value is not an
Instruction, since WeakVH follows RAUWs.

llvm-svn: 127427
2011-03-10 20:57:44 +00:00
Dan Gohman 154ed49784 Fix reassociate to postpone certain instruction deletions until
after it has finished all of its reassociations, because its
habit of unlinking operands and holding them in a datastructure
while working means that it's not easy to determine when an
instruction is really dead until after all its regular work is
done. rdar://9096268.

llvm-svn: 127424
2011-03-10 19:51:54 +00:00
Benjamin Kramer b49b964b98 InstCombine: Turn umul_with_overflow into mul nuw if we can prove that it cannot overflow.
This happens a lot in clang-compiled C++ code because it adds overflow checks to operator new[]:
  unsigned *foo(unsigned n) { return new unsigned[n]; }
We can optimize away the overflow check on 64 bit targets because (uint64_t)n*4 cannot overflow.

llvm-svn: 127418
2011-03-10 18:40:14 +00:00
Devang Patel 13f8c7d48e Preserve line number information while simplifying libcalls.
llvm-svn: 127362
2011-03-09 21:27:52 +00:00
Devang Patel a10794ab7b These llvm.dbg.* constants are not used anymore.
llvm-svn: 127352
2011-03-09 19:41:33 +00:00
Cameron Zwarich 19f2b3c652 Fix a crasher introduced by r127317 that is seen on the bots when using an
alloca as both integer and floating-point vectors of the same size. Bugpoint is
not cooperating with me, but I'll try to find a manual testcase tomorrow.

llvm-svn: 127320
2011-03-09 07:34:11 +00:00
Cameron Zwarich 3b649f4d01 Add support to scalar replacement for partial vector accesses of an alloca, e.g.
a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the
use of vector intrinsics, especially in NEON when programmers know the layout of
the register file. This enables codegen to eliminate a lot of the subregister
traffic it would otherwise generate.

This commit only enables this for a small number of floating-point cases, but a
lot more integer cases. I assume this is okay for all ports, but I did not do
extensive testing of the quality of code involving i512 vectors and the like. If
there is a use case where this generates worse code than before, let me know and
we can scale it back.

This fixes <rdar://problem/9036264>.

llvm-svn: 127317
2011-03-09 05:43:05 +00:00
Cameron Zwarich 43a241fa06 Move vector type merging to a separate function in preparation for it getting
more complicated.

llvm-svn: 127316
2011-03-09 05:43:01 +00:00
Eli Friedman a81a82dcaf PR9346: Prevent SimplifyDemandedBits from incorrectly introducing
INT_MIN % -1.

llvm-svn: 127306
2011-03-09 01:28:35 +00:00
Eli Friedman aac35b3fbb PR9420; an instruction before an unreachable is guaranteed not to have any
reachable uses, but there still might be uses in dead blocks.  Use the
standard solution of replacing all the uses with undef.  This is
a rare case because it's very sensitive to phase ordering in SimplifyCFG.

llvm-svn: 127299
2011-03-09 00:48:33 +00:00
Devang Patel fbb482b314 llvm.dbg.declare intrinsic does not use any llvm::Values. It's magic!
llvm-svn: 127282
2011-03-08 22:12:11 +00:00
Nick Lewycky afc8098c9e Reorder comments to put them the right way around.
llvm-svn: 127220
2011-03-08 06:29:47 +00:00
Devang Patel 97d0be8ee1 While sinking an instruction, do not lose llvm.dbg.value intrinsic.
llvm-svn: 127214
2011-03-08 03:06:19 +00:00
Devang Patel d00c628f8f Preserve line no. info.
Radar 9097659

llvm-svn: 127182
2011-03-07 22:43:45 +00:00
Nick Lewycky e467979d0a Add more analysis of the sign bit of an srem instruction. If the LHS is negative
then the result could go either way. If it's provably positive then so is the
srem. Fixes PR9343 #7!

llvm-svn: 127146
2011-03-07 01:50:10 +00:00
Rafael Espindola 871cfde1c2 Don't internalize available_externally functions. We already did the right
thing for variables.

llvm-svn: 127138
2011-03-06 23:41:34 +00:00
Nick Lewycky 92db8e8e39 ConstantInt has some getters which return ConstantInt's or ConstantVector's of
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!

llvm-svn: 127116
2011-03-06 03:36:19 +00:00
Benjamin Kramer 08c913b6e6 InstCombine: We know the number of items initially added to the worklist map, reserve space early to avoid rehashing.
llvm-svn: 127089
2011-03-05 16:43:46 +00:00
Cameron Zwarich 13c885d193 Fix PR9398 - 10% of llc compile time is spent in Value::getNumUses. This reduces
the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to
1.8% of total llc time.

llvm-svn: 127069
2011-03-05 08:12:26 +00:00
Nick Lewycky 9719a719c7 Thread comparisons over udiv/sdiv/ashr/lshr exact and lshr nuw/nsw whenever
possible. This goes into instcombine and instsimplify because instsimplify
doesn't need to check hasOneUse since it returns (almost exclusively) constants.

This fixes PR9343 #4 #5 and #8!

llvm-svn: 127064
2011-03-05 05:19:11 +00:00
Nick Lewycky 25cc338d88 Try once again to optimize "icmp (srem X, Y), Y" by turning the comparison into
true/false or "icmp slt/sge Y, 0".

llvm-svn: 127063
2011-03-05 04:28:48 +00:00
Jakob Stoklund Olesen e2017b6f2e DenseMap<uintptr_t,...> doesn't allow all values as keys.
Avoid colliding with the sentinels, hopefully unbreaking
llvm-gcc-x86_64-linux-selfhost.

llvm-svn: 126982
2011-03-04 02:48:56 +00:00
Richard Osborne 5003782293 Fix typo in comment.
llvm-svn: 126941
2011-03-03 14:21:22 +00:00
Richard Osborne af52c52569 Optimize fprintf -> iprintf if there are no floating point arguments
and siprintf is available on the target.

llvm-svn: 126940
2011-03-03 14:20:22 +00:00
Richard Osborne 2dfb888392 Optimize sprintf -> siprintf if there are no floating point arguments
and siprintf is available on the target.

llvm-svn: 126937
2011-03-03 14:09:28 +00:00
Richard Osborne 815de536e5 Optimize printf -> iprintf if there are no floating point arguments
and iprintf is available on the target. Currently iprintf is only
marked as being available on the XCore.

llvm-svn: 126935
2011-03-03 13:17:51 +00:00
Cameron Zwarich 86ade9510f Remove some more unused code that I missed.
llvm-svn: 126826
2011-03-02 03:48:29 +00:00
Cameron Zwarich 5dd2aa2615 Eliminate the unused CodeGenPrepare option to split critical edges.
llvm-svn: 126825
2011-03-02 03:31:46 +00:00
Cameron Zwarich b7f8eaafa3 Stop computing the number of uses twice per value in CodeGenPrepare's sinking of
addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces
total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function
in llc.

llvm-svn: 126782
2011-03-01 21:13:53 +00:00
Anders Carlsson da80afef99 Make InstCombiner::FoldAndOfICmps create a ConstantRange that's the
intersection of the LHS and RHS ConstantRanges and return "false" when
the range is empty.

This simplifies some code and catches some extra cases.

llvm-svn: 126744
2011-03-01 15:05:01 +00:00
Eli Friedman 683bbc16c4 Add an obvious missing safety check to DAE::RemoveDeadArgumentsFromCallers.
llvm-svn: 126720
2011-03-01 00:33:47 +00:00
Ted Kremenek 20164dcc68 Unbreak CMake build.
llvm-svn: 126715
2011-02-28 23:56:33 +00:00
Chris Lattner 1ac5e0c5c6 update cmake
llvm-svn: 126694
2011-02-28 22:45:25 +00:00
Dan Gohman 06d70015ce Delete the GEPSplitter experiment.
llvm-svn: 126671
2011-02-28 19:47:47 +00:00
Dan Gohman b8a25f49f3 Delete the SimplifyHalfPowrLibCalls pass, which was unused, and
only existed as the result of a misunderstanding.

llvm-svn: 126669
2011-02-28 19:41:14 +00:00
Frits van Bommel 8ae07996c9 Teach SimplifyCFG that (switch (select cond, X, Y)) is better expressed as a branch.
Based on a patch by Alistair Lynn.

llvm-svn: 126647
2011-02-28 09:44:07 +00:00
Nick Lewycky 66f4f22f7b srem doesn't actually have the same resulting sign as its numerator, you could
also have a zero when numerator = denominator. Reverts parts of r126635 and
r126637.

llvm-svn: 126644
2011-02-28 09:17:39 +00:00
Nick Lewycky 174a705497 Teach InstCombine to fold "(shr exact X, Y) == 0" --> X == 0, fixing #1 from
PR9343.

llvm-svn: 126643
2011-02-28 08:31:40 +00:00
Nick Lewycky 6b445419b0 The sign of an srem instruction is the sign of its dividend (the first
argument), regardless of the divisor. Teach instcombine about this and fix
test7 in PR9343!

llvm-svn: 126635
2011-02-28 06:20:05 +00:00
Benjamin Kramer ceb5daa567 Revert "SimplifyCFG: GEPs with just one non-constant index are also cheap."
Yes, there are other types than i8* and GEPs on them can produce an add+multiply.
We don't consider that cheap enough to be speculatively executed.

llvm-svn: 126481
2011-02-25 10:33:33 +00:00
Benjamin Kramer dfdca1a14d SimplifyCFG: GEPs with just one non-constant index are also cheap.
llvm-svn: 126452
2011-02-24 23:26:09 +00:00
Benjamin Kramer 27361a7124 SimplifyCFG: GEPs with constant indices are cheap enough to be executed unconditionally.
llvm-svn: 126445
2011-02-24 22:46:11 +00:00
Devang Patel cedf928743 Do not use DIFactory. Use DIBuilder.
llvm-svn: 126398
2011-02-24 18:49:55 +00:00
Chris Lattner eddb33ebd0 wire TargetLibraryInfo into simplify libcalls and use it in a couple of
trivial places.  This pass needs a lot of work.

llvm-svn: 126367
2011-02-24 07:16:14 +00:00
Chris Lattner 2e56e20662 move a massive amount of code out into its own helper function
to reduce nesting.  This needs to be turned into a table.

llvm-svn: 126366
2011-02-24 07:12:12 +00:00
Chris Lattner adf38b3e09 change instcombine to not turn a call to non-varargs bitcast of
function prototype into a call to a varargs prototype.  We do
allow the xform if we have a definition, but otherwise we don't
want to risk that we're changing the abi in a subtle way.  On
X86-64, for example, varargs require passing stuff in %al.

llvm-svn: 126363
2011-02-24 05:10:56 +00:00
Cameron Zwarich 826308586c Make LoopDeletion work on loops with multiple edges, as long as the incoming
values from all of the loop's exiting blocks are equal. Patch by Andrew Clinton.

llvm-svn: 126253
2011-02-22 22:25:39 +00:00
Duncan Sands ecbbf0825b If the phi node was used by an unreachable instruction that ends up using
itself without going via a phi node then we could return false here in
spite of making a change.  Also, tweak the comment because this method
can (and always could) return true without deleting the original phi node.
For example, if the phi node was used by a read-only invoke instruction
which is used by another phi node phi2 which is only used by and only uses
the invoke, then phi2 would be deleted but not the invoke instruction and
not the original phi node.

llvm-svn: 126129
2011-02-21 17:32:05 +00:00
Chris Lattner 2333ac279f fix a crasher in disabled code (on variable stride loops)
llvm-svn: 126125
2011-02-21 17:02:55 +00:00
Duncan Sands 6dcd49bc2b Simplify RecursivelyDeleteDeadPHINode. The only functionality change
should be that if the phi is used by a side-effect free instruction with
no uses then the phi and the instruction now get zapped (checked by the
unittest).

llvm-svn: 126124
2011-02-21 16:27:36 +00:00
Chris Lattner bc661d6686 Add some (disabled code) to print out negative strides.
llvm-svn: 126102
2011-02-21 02:08:54 +00:00
Nick Lewycky 183c24c51b Make RecursivelyDeleteDeadPHINode delete a phi node that has no users and add a
test for that. With this change, test/CodeGen/X86/codegen-dce.ll no longer finds
any instructions to DCE, so delete the test.

Also renamed J and JP to I and IP in RecursivelyDeleteDeadPHINode.

llvm-svn: 126088
2011-02-20 18:05:56 +00:00
Benjamin Kramer 5b7a4e0195 Move "A | ~(A & ?) -> -1" from InstCombine to InstructionSimplify.
llvm-svn: 126082
2011-02-20 15:20:01 +00:00
Benjamin Kramer d5d7f37beb InstCombine: Add a bunch of combines of the form x | (y ^ z).
We usually catch this kind of optimization through InstSimplify's distributive
magic, but or doesn't distribute over xor in general.

"A | ~(A | B) -> A | ~B" hits 24 times on gcc.c.

llvm-svn: 126081
2011-02-20 13:23:43 +00:00
Nick Lewycky c8a1569950 Teach RecursivelyDeleteDeadPHINodes to handle multiple self-references. Patch
by Andrew Clinton!

llvm-svn: 126077
2011-02-20 08:38:20 +00:00
Nick Lewycky 080ea93779 Instead of keeping two Value*->id# mappings, keep one Value->Value mapping and
one Value set. This is faster because we only need to use the set when there
isn't already an entry in the map. No functionality change!

llvm-svn: 126076
2011-02-20 08:11:03 +00:00
Eli Friedman ef200db4fd PR9218: SimplifyDemandedVectorElts can return a non-null value that is not
the instruction passed in.  Make sure to account for this correctly, instead
of looping infinitely.

llvm-svn: 126058
2011-02-19 22:42:40 +00:00
Chris Lattner 72a35fb974 rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte
constant, including globals.  This makes us generate much more "pretty" pattern
globals as well because it doesn't break it down to an array of bytes all the
time.

This enables us to handle stores of relocatable globals.  This kicks in about
48 times in 254.gap, giving us stuff like this:

@.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct
.TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16

...
  call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8*
), i64 %tmp75) nounwind

llvm-svn: 126044
2011-02-19 19:56:44 +00:00
Chris Lattner 0f4a64011e Implement rdar://9009151, transforming strided loop stores of
unsplatable values into memset_pattern16 when it is available
(recent darwins).  This transforms lots of strided loop stores
of ints for example, like 5 in vpr:

  Formed memset:   call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
    from store to: {%3,+,4}<%11> at:   store i32 3, i32* %scevgep, align 4, !tbaa !4

llvm-svn: 126040
2011-02-19 19:31:39 +00:00
Chris Lattner e6b261fec5 Make loop-idiom use TargetLibraryInfo to determine whether it is allowed
to hack on memset, memcpy etc.

llvm-svn: 125974
2011-02-18 22:22:15 +00:00
Oscar Fuentes 5ed962656c Move library stuff out of the toplevel CMakeLists.txt file.
llvm-svn: 125968
2011-02-18 22:06:14 +00:00
Duncan Sands 84653b3674 Add some transforms of the kind X-Y>X -> 0>Y which are valid when there is no
overflow.  These subsume some existing equality transforms, so zap those.

llvm-svn: 125843
2011-02-18 16:25:37 +00:00
Chris Lattner 1a924e770a prevent jump threading from merging blocks when their address is
taken (and used!).  This prevents merging the blocks (invalidating
the block addresses) in a case like this:

#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

void foo() {
  printf("%p\n", _THIS_IP_);
  printf("%p\n", _THIS_IP_);
  printf("%p\n", _THIS_IP_);
}

which fixes PR4151.

llvm-svn: 125829
2011-02-18 04:43:06 +00:00
Chris Lattner 4a14fbc50c Don't unroll loops whose header block's address is taken.
This is part of a futile attempt to not "break" bizzaro
code like this:

 l1:
  printf("l1: %p\n", &&l1);
  ++x;

  if( x < 3 ) goto l1;


Previously we'd fold &&l1 to 1, which is fine per our semantics
but not helpful to the user.

llvm-svn: 125827
2011-02-18 04:25:21 +00:00
Chris Lattner a8fed47eed have instcombine preserve nsw/nuw/exact when sinking
common operations through a phi. 

llvm-svn: 125790
2011-02-17 23:01:49 +00:00
Chris Lattner 75ae5a45ff fix typo
llvm-svn: 125787
2011-02-17 22:32:54 +00:00
Chris Lattner abb8eb2c63 fix instcombine merging GEPs through a PHI to only make the
result inbounds if all of the inputs are inbounds.

llvm-svn: 125785
2011-02-17 22:21:26 +00:00
Chris Lattner d406764d52 add is always integer, thanks to Frits for noticing this.
llvm-svn: 125774
2011-02-17 20:55:29 +00:00
Duncan Sands e522001171 Transform "A + B >= A + C" into "B >= C" if the adds do not wrap. Likewise for some
variations (some of these were already present so I unified the code).  Spotted by my
auto-simplifier as occurring a lot.

llvm-svn: 125734
2011-02-17 07:46:37 +00:00
Chris Lattner 5592071768 preserve NUW/NSW when transforming add x,x
llvm-svn: 125711
2011-02-17 02:23:02 +00:00
Chris Lattner 3eb0af94c4 fix PR9215, preventing -reassociate from clearing nsw/nuw when
it swaps the LHS/RHS of a single binop.

llvm-svn: 125700
2011-02-17 01:29:24 +00:00
Duncan Sands 75b5d27b84 Spelling fix: consequtive -> consecutive.
llvm-svn: 125563
2011-02-15 09:23:02 +00:00
Nadav Rotem 67d67a0385 Fix 9216 - Endless loop in InstCombine pass.
The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is
actually a xor -1.

llvm-svn: 125557
2011-02-15 07:13:48 +00:00
Devang Patel 8d53ac81ec Do not forget DebugLoc!
llvm-svn: 125547
2011-02-15 02:02:30 +00:00
Chris Lattner 9f0ac0dd8b tidy up a bit.
llvm-svn: 125546
2011-02-15 01:56:08 +00:00
Chris Lattner 69229316aa convert ConstantVector::get to use ArrayRef.
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Devang Patel 3058398655 Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" a value that is modified inside loop.
llvm-svn: 125529
2011-02-14 23:03:23 +00:00
Chris Lattner 34442e6ebf revert my ConstantVector patch, it seems to have made the llvm-gcc
builders unhappy.

llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner d9f5b88548 Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.

llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner 9bd7fdff58 remove a now-unneccesary cast.
llvm-svn: 125464
2011-02-13 18:30:09 +00:00
Chris Lattner 43273affb9 implement instcombine folding for things like (x >> c) < 42.
We were previously simplifying divisions, but not right shifts!

llvm-svn: 125454
2011-02-13 08:07:21 +00:00
Chris Lattner d369f575d7 refactor some code out into a helper method.
llvm-svn: 125451
2011-02-13 07:43:07 +00:00
Daniel Dunbar 210ce0feb5 SimplifyLibCalls: Add missing legalize check on various printf to puts and
putchar transforms, their return values are not compatible.

llvm-svn: 125442
2011-02-12 18:19:57 +00:00
Benjamin Kramer 1800d823de Also fold (A+B) == A -> B == 0 when the add is commuted.
llvm-svn: 125411
2011-02-11 21:46:48 +00:00
Chris Lattner d3c0e05f51 When lowering an inbounds gep, the intermediate adds can have
unsigned overflow (e.g. due to a negative array index), but
the scales on array size multiplications are known to not
sign wrap.

llvm-svn: 125409
2011-02-11 21:37:43 +00:00
Cameron Zwarich 99de19b3cb Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about
a loop when unswitching it. It only does this in the complex case, because
everything should be fine already in the simple case.

llvm-svn: 125369
2011-02-11 06:08:28 +00:00
Cameron Zwarich 25cb63c791 LoopInstSimplify preserves ScalarEvolution.
llvm-svn: 125368
2011-02-11 06:08:25 +00:00
Cameron Zwarich 97dae4d361 If we can't avoid running loop-simplify twice for now, at least avoid running
iv-users twice.

llvm-svn: 125318
2011-02-10 23:53:14 +00:00
Cameron Zwarich d8e66038f4 Rename 'loopsimplify' to 'loop-simplify'.
llvm-svn: 125317
2011-02-10 23:38:10 +00:00
Chris Lattner d86ded17ad implement the first part of PR8882: when lowering an inbounds
gep to explicit addressing, we know that none of the intermediate
computation overflows.

This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a 
negative index?

Previously the testcase would instcombine to:

define i1 @test(i64 %i) {
  %p1.idx.mask = and i64 %i, 4611686018427387903
  %cmp = icmp eq i64 %p1.idx.mask, 1000
  ret i1 %cmp
}

now we get:

define i1 @test(i64 %i) {
  %cmp = icmp eq i64 %i, 1000
  ret i1 %cmp
}

llvm-svn: 125271
2011-02-10 07:11:16 +00:00
Chris Lattner 6b657aed33 Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267
2011-02-10 05:36:31 +00:00
Chris Lattner 98457101fc Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts.  For example, these (which
are the same except the first is 'exact' sdiv:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %A = sdiv exact i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %A = sdiv i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

compile down to:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %1 = icmp eq i64 %X, 0
  ret i1 %1
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %X.off = add i64 %X, 4
  %1 = icmp ult i64 %X.off, 9
  ret i1 %1
}

This happens when you do something like:
  (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266
2011-02-10 05:23:05 +00:00
Chris Lattner dcef03fba2 more cleanups, notably bitcast isn't used for "signed to unsigned type
conversions". :)

llvm-svn: 125265
2011-02-10 05:17:27 +00:00
Chris Lattner 7d0e43ff8b A bunch of cleanups and simplifications using the new PatternMatch predicates
and generally tidying things up.  Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.

 InstCombineAddSub.cpp |  296 +++++++++++++++++++++-----------------------------
 1 file changed, 126 insertions(+), 170 deletions(-)

llvm-svn: 125264
2011-02-10 05:14:58 +00:00
Chris Lattner 768003c59e teach SimplifyDemandedBits that exact shifts demand the bits they
are shifting out since they do require them to be zeros.  Similarly
for NUW/NSW bits of shl

llvm-svn: 125263
2011-02-10 05:09:34 +00:00
Eric Christopher da6bd45088 Revert this in an attempt to bring the builders back.
llvm-svn: 125257
2011-02-10 01:48:24 +00:00
Cameron Zwarich 58c8670ab2 Turn this pass ordering:
Natural Loop Information
 Loop Pass Manager
   Canonicalize natural loops
 Scalar Evolution Analysis
 Loop Pass Manager
   Induction Variable Users
   Canonicalize natural loops
   Induction Variable Users
   Loop Strength Reduction

into this:

Scalar Evolution Analysis
Loop Pass Manager
  Canonicalize natural loops
  Induction Variable Users
  Loop Strength Reduction

This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of
thing automatically, but it seems easier to just change the ordering of the
passes if this is the only case.

llvm-svn: 125254
2011-02-10 01:07:54 +00:00
Chris Lattner 9e4aa0259f Teach instsimplify some tricks about exact/nuw/nsw shifts.
improve interfaces to instsimplify to take this info.

llvm-svn: 125196
2011-02-09 17:15:04 +00:00
Chris Lattner b940091388 Rework InstrTypes.h so to reduce the repetition around the NSW/NUW/Exact
versions of creation functions.  Eventually, the "insertion point" versions
of these should just be removed, we do have IRBuilder afterall.

Do a massive rewrite of much of pattern match.  It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches 
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.

Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRbuilder by having these methods chain to
each other more instead of duplicating code.

llvm-svn: 125194
2011-02-09 17:00:45 +00:00
Nick Lewycky 292e78c3cd When removing a function from the function set and adding it to deferred, we
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.

Also add some debugging statements.

llvm-svn: 125180
2011-02-09 06:32:02 +00:00
Dan Gohman de7f699754 Don't split any loop backedges, including backedges of loops other than
the active loop. This is generally desirable, and it avoids trouble
in situations such as the testcase in PR9123, though the failure
mode depends on use-list order, so it is infeasible to test.

llvm-svn: 125065
2011-02-08 00:55:13 +00:00
Benjamin Kramer 8d6a8c130b SimplifyCFG: Track the number of used icmps when turning a icmp chain into a switch. If we used only one icmp, don't turn it into a switch.
Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler.

llvm-svn: 125056
2011-02-07 22:37:28 +00:00
Chris Lattner 35315d065b enhance vmcore to know that udiv's can be exact, and add a trivial
instcombine xform to exercise this.

Nothing forms exact udivs yet though.  This is progress on PR8862

llvm-svn: 124992
2011-02-06 21:44:57 +00:00
Nick Lewycky cb1a4c26ee Simplify away redundant test, and document what's going on.
llvm-svn: 124977
2011-02-06 05:04:00 +00:00
Nick Lewycky f8797fda44 Remove specialized comparison of InlineAsm objects. They're uniqued on creation
now, and this wasn't comparing some of their relevant bits anyhow.

llvm-svn: 124976
2011-02-06 04:33:50 +00:00
Benjamin Kramer 62aa46b852 SimplifyCFG: Also transform switches that represent a range comparison but are not sorted into sub+icmp.
This transforms another 1000 switches in gcc.c.

llvm-svn: 124826
2011-02-03 22:51:41 +00:00
Benjamin Kramer f4ea1d5f79 SimplifyCFG: Turn switches into sub+icmp+branch if possible.
This makes the job of the later optzn passes easier, allowing the vast amount of
icmp transforms to chew on it.

We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting
binary on i386-linux.

The testcase from README.txt now compiles into
  decl  %edi
  cmpl  $3, %edi
  sbbl  %eax, %eax
  andl  $1, %eax
  ret

llvm-svn: 124724
2011-02-02 15:56:22 +00:00
Nick Lewycky a46c898314 Remove wasteful caching. This isn't needed for correctness because any function
that might have changed been affected by a merge elsewhere will have been
removed from the function set, and it isn't needed for performance because we
call grow() ahead of time to prevent reallocations.

llvm-svn: 124717
2011-02-02 05:31:01 +00:00
Dan Gohman c6f0bda839 Conservatively, clear optional flags, such as nsw, when performing
reassociation. No testcase, because I wasn't able to create a testcase
which actually demonstrates a problem.

llvm-svn: 124713
2011-02-02 02:05:46 +00:00
Dan Gohman 08d2c98c23 Fix reassociate to clear optional flags, such as nsw.
llvm-svn: 124712
2011-02-02 02:02:34 +00:00
Anders Carlsson f23a6da271 Recognize and simplify
(A+B) == A  ->  B == 0
A == (A+B)  ->  B == 0

llvm-svn: 124567
2011-01-30 22:01:13 +00:00