Commit Graph

2916 Commits

Author SHA1 Message Date
Bill Wendling 7f9f5680ca Testcase for r151691.
llvm-svn: 151694
2012-02-29 01:53:13 +00:00
Pete Cooper 39b5255df4 Reverted r152620 - DSE: Shorten memset when a later store overwrites the start of it. There were all sorts of buildbot issues
llvm-svn: 151621
2012-02-28 05:06:24 +00:00
Pete Cooper f3862f91de DSE: Shorten memset when a later store overwrites the start of it
llvm-svn: 151620
2012-02-28 04:27:10 +00:00
Duncan Sands 27f459519d When performing a conditional branch depending on the value of a comparison
%cmp (eg: A==B) we already replace %cmp with "true" under the true edge, and
with "false" under the false edge.  This change enhances this to replace the
negated compare (A!=B) with "false" under the true edge and "true" under the
false edge.  Reported to improve perlbench results by 1%.

llvm-svn: 151517
2012-02-27 08:14:30 +00:00
Rafael Espindola 09a4201d3c Fix this assert. IP can point to an instruction with strange dominance
properties (invoke). Just assert that the instruction we return dominates
the insertion point.

llvm-svn: 151511
2012-02-27 02:13:03 +00:00
Rafael Espindola a640db900a Add testcase for the previous commit.
llvm-svn: 151475
2012-02-26 05:49:57 +00:00
Rafael Espindola 94df267db3 Change the implementation of dominates(inst, inst) to one based on what the
verifier does. This correctly handles invoke.
Thanks to Duncan, Andrew and Chris for the comments.
Thanks to Joerg for the early testing.

llvm-svn: 151469
2012-02-26 02:19:19 +00:00
Nick Lewycky 3db143ea8c Reinstate the optimization from r151449 with a fix to not turn 'gep %x' into
'gep null' when the icmp predicate is unsigned (or is signed without inbounds).

llvm-svn: 151467
2012-02-26 02:09:49 +00:00
Nick Lewycky 7bbd72da46 Roll these back to r151448 until I figure out how they're breaking
MultiSource/Applications/lua.

llvm-svn: 151463
2012-02-25 23:01:19 +00:00
Nick Lewycky eeeffbb497 An argument and a local identified object (eg. a noalias call) could turn out
equal if both are null. In the test, scope type %t and global @y by adding a
'gep' prefix to them.

llvm-svn: 151452
2012-02-25 20:19:07 +00:00
Nick Lewycky 51f2be8bff Teach instsimplify to be more aggressive when analyzing comparisons of pointers
by using llvm::isIdentifiedObject. Also teach it to handle GEPs that have
the same base pointer and constant operands. Fixes PR11238!

llvm-svn: 151449
2012-02-25 19:07:42 +00:00
Chris Lattner 01990f0e1c fix PR12075, a regression in a recent transform I added. In unreachable code, gep chains can be infinite. Just like "stripPointerCasts", use a set to keep track of visited instructions so we don't recurse infinitely.
llvm-svn: 151383
2012-02-24 19:01:58 +00:00
Duncan Sands 926d101640 Teach GVN that x+y is the same as y+x and that x<y is the same as y>x.
llvm-svn: 151365
2012-02-24 15:16:31 +00:00
Rafael Espindola cd06b482d2 Semantically revert 151015. Add a comment on why we should be able to assert
the dominance once the dominates method is fixed and why we can use the builder's
insertion point.
Fixes pr12048.

llvm-svn: 151125
2012-02-22 03:21:39 +00:00
Nick Lewycky 9d0da18597 Use the target-aware constant folder on expressions to improve the chance
they'll be simple enough to simulate, and to reduce the chance we'll encounter
equal but different simple pointer constants.

This removes the symptoms from PR11352 but is not a full fix. A proper fix would
either require a guarantee that two constant objects we simulate are folded
when equal, or a different way of handling equal pointers (ie., trying a
constantexpr icmp on them to see whether we know they're equal or non-equal or
unsure).

llvm-svn: 151093
2012-02-21 22:08:06 +00:00
Benjamin Kramer 6ee8690aa5 InstCombine: Don't transform a signed icmp of two GEPs into a signed compare of the indices.
This transformation is not safe in some pathological cases (signed icmp of pointers should be an
extremely rare thing, but it's valid IR!). Add an explanatory comment.

Kudos to Duncan for pointing out this edge case (and not giving up explaining it until I finally got it).

llvm-svn: 151055
2012-02-21 13:31:09 +00:00
Nick Lewycky 519561f418 Check for the correct size in the invariant marker.
llvm-svn: 151003
2012-02-20 23:32:26 +00:00
Benjamin Kramer a00c5c451a Test case for r150978.
llvm-svn: 150979
2012-02-20 19:00:28 +00:00
Benjamin Kramer 7adb189538 InstCombine: When comparing two GEPs that were derived from the same base pointer but use different types, expand the offset calculation and to the compare on the offset if profitable.
This came up in SmallVector code.

llvm-svn: 150962
2012-02-20 15:07:47 +00:00
Benjamin Kramer 7746eb62fb InstCombine: Make OptimizePointerDifference more aggressive.
- Ignore pointer casts.
- Also expand GEPs that aren't constantexprs when they have one use or only constant indices.

- We now compile "&foo[i] - &foo[j]" into "i - j".

llvm-svn: 150961
2012-02-20 14:34:57 +00:00
Chris Lattner 445d8c6b50 fold comparisons of gep'd alloca points with null to false,
implementing PR12013.  We now compile the testcase to:

__Z4testv:                              ## @_Z4testv
## BB#0:                                ## %_ZN4llvm15SmallVectorImplIiE9push_backERKi.exit
	pushq	%rbx
	subq	$64, %rsp
	leaq	32(%rsp), %rbx
	movq	%rbx, (%rsp)
	leaq	64(%rsp), %rax
	movq	%rax, 16(%rsp)
	movl	$1, 32(%rsp)
	leaq	36(%rsp), %rax
	movq	%rax, 8(%rsp)
	leaq	(%rsp), %rdi
	callq	__Z1gRN4llvm11SmallVectorIiLj8EEE
	movq	(%rsp), %rdi
	cmpq	%rbx, %rdi
	je	LBB0_2
## BB#1:
	callq	_free
LBB0_2:                                 ## %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit
	addq	$64, %rsp
	popq	%rbx
	ret

instead of:

__Z4testv:                              ## @_Z4testv
## BB#0:
	pushq	%rbx
	subq	$64, %rsp
	xorl	%eax, %eax
	leaq	(%rsp), %rbx
	addq	$32, %rbx
	movq	%rbx, (%rsp)
	movq	%rbx, 8(%rsp)
	leaq	64(%rsp), %rcx
	movq	%rcx, 16(%rsp)
	je	LBB0_2
## BB#1:
	movl	$1, 32(%rsp)
	movq	%rbx, %rax
LBB0_2:                                 ## %_ZN4llvm15SmallVectorImplIiE9push_backERKi.exit
	addq	$4, %rax
	movq	%rax, 8(%rsp)
	leaq	(%rsp), %rdi
	callq	__Z1gRN4llvm11SmallVectorIiLj8EEE
	movq	(%rsp), %rdi
	cmpq	%rbx, %rdi
	je	LBB0_4
## BB#3:
	callq	_free
LBB0_4:                                 ## %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit
	addq	$64, %rsp
	popq	%rbx
	ret

This doesn't shrink clang noticably though.

llvm-svn: 150944
2012-02-20 00:42:49 +00:00
Rafael Espindola 82d957593e Don't skip debug instructions when looking for the insertion point of
the cast. If we do, we can end up with

   inst1
   ---------------  < Insertion point
   dbg inst
   new inst

instead of the desired

   inst1
   new inst
   ---------------  < Insertion point
   dbg inst

Another option would be for InsertNoopCastOfTo (or its callers) to move the
insertion point and we would end up with

   inst1
   dbg inst
   new inst
   ---------------  < Insertion point

but that complicates the callers. This fixes PR12018 (and firefox's build).

llvm-svn: 150884
2012-02-18 17:22:58 +00:00
Eli Friedman 952d1f9f40 Fix a rather nasty regression from r150690: LHS != RHS does not imply LHS->stripPointerCasts() != RHS->stripPointerCasts().
llvm-svn: 150863
2012-02-18 03:29:25 +00:00
Dan Gohman 0155f30a9c Calls and invokes with the new clang.arc.no_objc_arc_exceptions
metadata may still unwind, but only in ways that the ARC
optimizer doesn't need to consider. This permits more
aggressive optimization.

llvm-svn: 150829
2012-02-17 18:59:53 +00:00
Nick Lewycky aed0553360 Remove question.
llvm-svn: 150809
2012-02-17 09:55:20 +00:00
Nick Lewycky 68f9f9d9c8 Add support for invariant.start inside the static constructor evaluator. This is
useful to represent a variable that is const in the source but can't be constant
in the IR because of a non-trivial constructor. If globalopt evaluates the
constructor, and there was an invariant.start with no matching invariant.end
possible, it will mark the global constant afterwards.

llvm-svn: 150794
2012-02-17 06:59:21 +00:00
Benjamin Kramer ea51f62e4b InstSimplify: Ignore pointer casts when constant folding compares between pointers.
llvm-svn: 150690
2012-02-16 13:49:39 +00:00
Eli Bendersky 924f9a671d Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and now Dejagnu. dg.exp is no longer needed.
Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches.

llvm-svn: 150664
2012-02-16 06:28:33 +00:00
Eli Friedman c458885c58 loop-rotate shouldn't hoist alloca instructions out of a loop. Patch by Patrik Hägglund, with slightly modified test. Issue reported by Patrik Hägglund on llvmdev.
llvm-svn: 150642
2012-02-16 00:41:10 +00:00
Andrew Trick 10cc45336d Add simplifyLoopLatch to LoopRotate pass.
This folds a simple loop tail into a loop latch. It covers the common (in fortran) case of postincrement loops. It's a "free" way to expose this type of loop to downstream loop optimizations that bail out on non-canonical loops (getLoopLatch is a heavily used check).

llvm-svn: 150439
2012-02-14 00:00:23 +00:00
Devang Patel 698452bc7e Check against umin while converting fcmp into an icmp.
llvm-svn: 150425
2012-02-13 23:05:18 +00:00
Dan Gohman eb6e01533a Just like in regular escape analysis, loads and stores through
(but not of) a block pointer do not cause the block pointer to
escape. This fixes rdar://10803830.

llvm-svn: 150424
2012-02-13 22:57:02 +00:00
Hal Finkel 1bde3f86d1 Update BBVectorize to use aliasesUnknownInst.
This allows BBVectorize to check the "unknown instruction" list in the
alias sets. This is important to prevent instruction fusing from reordering
function calls. Resolves PR11920.

llvm-svn: 150250
2012-02-10 15:52:40 +00:00
Duncan Sands 26641d7c02 Fix PR11948: the result type of an icmp may be a vector of boolean -
don't assume it is a boolean.

llvm-svn: 150247
2012-02-10 14:31:24 +00:00
Duncan Sands bf48ac622a Revert commit 149912 (lattner) and add a testcase that shows the problem (which
is that patterns no longer match for vectors of booleans, because you only get
ConstantDataVector when the vector element type is i8, i16, etc, not when it is
i1).  Original commit message:
Remove some dead code and tidy things up now that vectors use ConstantDataVector
instead of always using ConstantVector.

llvm-svn: 150246
2012-02-10 14:26:42 +00:00
Benjamin Kramer 487a3962c7 GlobalOpt: Be more aggressive about elminating side-effect free static dtors.
GlobalOpt runs early in the pipeline (before inlining) and complex class
hierarchies often introduce bitcasts or GEPs which weren't optimized away.
Teach it to ignore side-effect free instructions instead of depending on
other passes to remove them.

llvm-svn: 150174
2012-02-09 14:26:06 +00:00
Bill Wendling 9ebc0896e1 The 'unwind' instruction is deprecated and will be removed, making this test
obsolete.

llvm-svn: 149880
2012-02-06 18:18:47 +00:00
Nick Lewycky 52da72b12a Teach GlobalOpt to handle atomic accesses to globals.
* Most of the transforms come through intact by having each transformed load or
store copy the ordering and synchronization scope of the original.
 * The transform that turns a global only accessed in main() into an alloca
(since main is non-recursive) with a store of the initial value uses an
unordered store, since it's guaranteed to be the first thing to happen in main.
(Threads may have started before main (!) but they can't have the address of a
function local before the point in the entry block we insert our code.)
 * The heap-SRoA transforms are disabled in the face of atomic operations. This
can probably be improved; it seems odd to have atomic accesses to an alloca
that doesn't have its address taken.

AnalyzeGlobal keeps track of the strongest ordering found in any use of the
global. This is more information than we need right now, but it's cheap to
compute and likely to be useful.

llvm-svn: 149847
2012-02-05 19:56:38 +00:00
Duncan Sands 4b613497f0 Reduce the number of dom queries made by GVN's conditional propagation
logic by half: isOnlyReachableViaThisEdge was trying to be clever and
handle the case of a branch to a basic block which is contained in a
loop.  This costs a domtree lookup and is completely useless due to
GVN's position in the pass pipeline: all loops have preheaders at this
point, which means it is enough for isOnlyReachableViaThisEdge to check
that Dst has only one predecessor.  (I checked this theoretical argument
by running over the entire nightly testsuite, and indeed it is so!).

llvm-svn: 149838
2012-02-05 18:25:50 +00:00
Hal Finkel 135cac922c Boost the effective chain depth of loads and stores.
By default, boost the chain depth contribution of loads and stores. This will allow a load/store pair to vectorize even when it would not otherwise be long enough to satisfy the chain depth requirement.

llvm-svn: 149761
2012-02-04 04:14:04 +00:00
Dan Gohman 2d54ce841c Fix SSAUpdaterImpl's RecordMatchingPHI to record exactly the
PHI nodes which were matched, rather than climbing up the
original PHI node's operands to rediscover PHI nodes for
recording, since the PHI nodes found that are not
necessarily part of the matched set.
This fixes rdar://10589171.

llvm-svn: 149654
2012-02-03 01:07:01 +00:00
Jim Grosbach 0ab54184d7 Revert "Disable InstCombine unsafe folding bitcasts of calls w/ varargs."
This reverts commit d0e277d272d517ca1cda368267d199f0da7cad95.

llvm-svn: 149647
2012-02-03 00:00:50 +00:00
Hal Finkel c34e51132c Add a basic-block autovectorization pass.
This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).

llvm-svn: 149468
2012-02-01 03:51:43 +00:00
Jim Grosbach 9fa0481569 Disable InstCombine unsafe folding bitcasts of calls w/ varargs.
Changing arguments from being passed as fixed to varargs is unsafe, as
the ABI may require they be handled differently (stack vs. register, for
example).

Remove two tests which rely on the bitcast being folded into the direct
call, which is exactly the transformation that's unsafe.

llvm-svn: 149457
2012-02-01 00:08:17 +00:00
Bill Wendling d7cd9727ee Remove all references to the old EH.
There was always the current EH. -- Ministry of Truth

llvm-svn: 149335
2012-01-31 02:09:07 +00:00
Bill Wendling f98a6f4aab Update test to new EH model.
llvm-svn: 149333
2012-01-31 02:05:13 +00:00
Rafael Espindola bb893fea6b Add r149110 back with a fix for when the vector and the int have the same
width.

llvm-svn: 149151
2012-01-27 23:33:07 +00:00
Rafael Espindola a4062624d1 Revert r149110 and add a testcase that was crashing since that revision.
Unfortunately I also had to disable constant-pool-sharing.ll the code it tests has been
updated to use the IL logic.

llvm-svn: 149148
2012-01-27 22:42:48 +00:00
Chris Lattner 111d6ee655 enhance constant folding to be able to constant fold bitcast of
ConstantVector's to integer type.

llvm-svn: 149110
2012-01-27 01:44:03 +00:00
Nick Lewycky 70d50ee8fb Support pointer comparisons against constants, when looking at the inline-cost
savings from a pointer argument becoming an alloca. Sometimes callees will even
compare a pointer to null and then branch to an otherwise unreachable block!
Detect these cases and compute the number of saved instructions, instead of
bailing out and reporting no savings.

llvm-svn: 148941
2012-01-25 08:27:40 +00:00
Nick Lewycky c31ceda7d9 Make Value::isDereferenceablePointer() handle unreachable code blocks. (This
returns false in the event the computation feeding into the pointer is
unreachable, which maybe ought to be true -- but this is at least consistent
with undef->isDereferenceablePointer().) Fixes PR11825!

llvm-svn: 148671
2012-01-23 00:05:17 +00:00
Andrew Trick b9c822ab0b Handle a corner case with IV chain collection with bailout instead of assert.
Fixes PR11783: bad cast to AddRecExpr.

llvm-svn: 148572
2012-01-20 21:23:40 +00:00
Andrew Trick 16abc8a1e2 Test case comments missing from my previous checkin.
llvm-svn: 148571
2012-01-20 21:21:27 +00:00
Nick Lewycky e8415fea4b Fix CountCodeReductionForAlloca to more accurately represent what SROA can and
can't handle. Also don't produce non-zero results for things which won't be
transformed by SROA at all just because we saw the loads/stores before we saw
the use of the address.

llvm-svn: 148536
2012-01-20 08:35:20 +00:00
Andrew Trick c908b43d9f SCEVExpander fixes. Affects LSR and indvars.
LSR has gradually been improved to more aggressively reuse existing code, particularly existing phi cycles. This exposed problems with the SCEVExpander's sloppy treatment of its insertion point. I applied some rigor to the insertion point problem that will hopefully avoid an endless bug cycle in this area. Changes:

- Always used properlyDominates to check safe code hoisting.

- The insertion point provided to SCEV is now considered a lower bound. This is usually a block terminator or the use itself. Under no cirumstance may SCEVExpander insert below this point.

- LSR is reponsible for finding a "canonical" insertion point across expansion of different expressions.

- Robust logic to determine whether IV increments are in "expanded" form and/or can be safely hoisted above some insertion point.

Fixes PR11783: SCEVExpander assert.

llvm-svn: 148535
2012-01-20 07:41:13 +00:00
Dan Gohman 8ee108bf98 Set the "tail" flag on pattern-matched objc_storeStrong calls.
rdar://10531041.

llvm-svn: 148490
2012-01-19 19:14:36 +00:00
Dan Gohman 82041c2e60 Use llvm.global_ctors to locate global constructors instead
of recognizing them by name.

llvm-svn: 148416
2012-01-18 21:19:38 +00:00
Andrew Trick c193b16ea2 Test case rename
llvm-svn: 148344
2012-01-17 22:27:45 +00:00
Dan Gohman e7a243fea5 Add a new ObjC ARC optimization pass to eliminate unneeded
autorelease push+pop pairs.

llvm-svn: 148330
2012-01-17 20:52:24 +00:00
Andrew Trick 12728f04ca LSR fix: broaden the check for loop preheaders.
It's becoming clear that LoopSimplify needs to unconditionally create loop preheaders. But that is a bigger fix. For now, continuing to hack LSR.
Fixes rdar://10701050 "Cannot split an edge from an IndirectBrInst" assert.

llvm-svn: 148288
2012-01-17 06:45:52 +00:00
Andrew Trick 23ef0d6c40 Fix a corner case hit by redundant phi elimination running after LSR.
Fixes PR11761: bad IR w/ redundant Phi elim

llvm-svn: 148177
2012-01-14 03:17:23 +00:00
Dan Gohman 728db4997a Implement proper ObjC ARC objc_retainBlock "escape" analysis, so that
the optimizer doesn't eliminate objc_retainBlock calls which are needed
for their side effect of copying blocks onto the heap.
This implements rdar://10361249.

llvm-svn: 148076
2012-01-13 00:39:07 +00:00
Duncan Sands 0bf46b5363 Don't try to create a GEP when the pointee type is unsized (such GEPs
are invalid).  Fixes a crash on array1.C from the GCC testsuite when
compiled with dragonegg.

llvm-svn: 147946
2012-01-11 12:20:08 +00:00
Stepan Dyatkovskiy 8216569812 Improved compile time:
1. Size heuristics changed. Now we calculate number of unswitching
branches only once per loop.
2. Some checks was moved from UnswitchIfProfitable to
processCurrentLoop, since it is not changed during processCurrentLoop
iteration. It allows decide to skip some loops at an early stage.
Extended statistics:
- Added total number of instructions analyzed.

llvm-svn: 147935
2012-01-11 08:40:51 +00:00
Bill Wendling c79155192d If the global variable is removed by the linker, then don't constant merge it
with other symbols.

An object in the __cfstring section is suppoed to be filled with CFString
objects, which have a pointer to ___CFConstantStringClassReference followed by a
pointer to a __cstring. If we allow the object in the __cstring section to be
merged with another global, then it could end up in any section. Because the
linker is going to remove these symbols in the final executable, we shouldn't
bother to merge them.
<rdar://problem/10564621>

llvm-svn: 147899
2012-01-11 00:13:08 +00:00
Andrew Trick d5d2db9af9 Enable LSR IV Chains with sufficient heuristics.
These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more target
specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.

Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.

Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.

On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.

llvm-svn: 147826
2012-01-10 01:45:08 +00:00
Andrew Trick 248d410e3e Adding IV chain generation to LSR.
After collecting chains, check if any should be materialized. If so,
hide the chained IV users from the LSR solver. LSR will only solve for
the head of the chain. GenerateIVChains will then materialize the
chained IV users by computing the IV relative to its previous value in
the chain.

In theory, chained IV users could be exposed to LSR's solver. This
would be considerably complicated to implement and I'm not aware of a
case where we need it. In practice it's more important to
intelligently prune the search space of nontrivial loops before
running the solver, otherwise the solver is often forced to prune the
most optimal solutions. Hiding the chained users does this well, so
that LSR is more likely to find the best IV for the chain as a whole.

llvm-svn: 147801
2012-01-09 21:18:52 +00:00
Benjamin Kramer f9d0cc0160 InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests.
This subsumes several other transforms while enabling us to catch more cases.

llvm-svn: 147777
2012-01-09 17:23:27 +00:00
Benjamin Kramer 6609f741b9 Tweak my last commit to be less conservative about uses.
We still save an instruction when just the "and" part is replaced.
Also change the code to match comments more closely.

llvm-svn: 147753
2012-01-08 21:12:51 +00:00
Benjamin Kramer da37e15345 InstCombine: If we have a bit test and a sign test anded/ored together, merge the sign bit into the bit test.
This is common in bit field code, e.g. checking if the first or the last bit of a bit field is set.

llvm-svn: 147749
2012-01-08 18:32:24 +00:00
Andrew Trick 732ad80dbb LSR: Don't optimize loops if an outer loop has no preheader.
LoopSimplify may not run on some outer loops, e.g. because of indirect
branches. SCEVExpander simply cannot handle outer loops with no preheaders.
Fixes rdar://10655343 SCEVExpander segfault.

llvm-svn: 147718
2012-01-07 03:16:50 +00:00
Andrew Trick 5adedf5d47 Extended replaceCongruentPhis to handle mixed phi types.
llvm-svn: 147707
2012-01-07 01:12:09 +00:00
Andrew Trick cbf2fe066a comment typo
llvm-svn: 147701
2012-01-07 00:29:20 +00:00
Dan Gohman 5ab9c0a927 Fix SpeculativelyExecuteBB to either speculate all or none of the phis
present in the bottom of the CFG triangle, as the transformation isn't
ever valuable if the branch can't be eliminated.

Also, unify some heuristics between SimplifyCFG's multiple
if-converters, for consistency.

This fixes rdar://10627242.

llvm-svn: 147630
2012-01-05 23:58:56 +00:00
Eli Friedman 55fa49f32d PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation.
llvm-svn: 147625
2012-01-05 23:03:32 +00:00
Dan Gohman 5267211899 Revert r56315. When the instruction to speculate is a load, this
code can incorrectly move the load across a store. This never
happens in practice today, but only because the current
heuristics accidentally preclude it.

llvm-svn: 147623
2012-01-05 22:54:35 +00:00
Benjamin Kramer aca1885695 FileCheck hygiene.
llvm-svn: 147580
2012-01-05 00:43:34 +00:00
Nick Lewycky 0c48afa0ed Teach instcombine all sorts of great stuff about shifts that have exact, nuw or
nsw bits on them.

llvm-svn: 147528
2012-01-04 09:28:29 +00:00
Andrew Trick cbcc98fb50 Fix SCEVExpander to handle loops with no preheader when LSR gives it a
"phony" insertion point.

Fixes rdar://10619599: "SelectionDAGBuilder shouldn't visit PHI nodes!" assert

llvm-svn: 147439
2012-01-02 21:25:10 +00:00
Nick Lewycky b59008c694 Make use of the exact bit when optimizing '(X >>exact 3) << 1' to eliminate the
'and' that would zero out the trailing bits, and to produce an exact shift
ourselves.

llvm-svn: 147391
2011-12-31 21:30:22 +00:00
Nick Lewycky 4c378a4453 Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.

Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.

llvm-svn: 147327
2011-12-28 23:24:21 +00:00
Nick Lewycky a8e84fb56b Turn cos(-x) into cos(x). Patch by Alexander Malyshev!
llvm-svn: 147291
2011-12-27 18:25:50 +00:00
Nick Lewycky c554a9b58e Teach simplifycfg to recompute branch weights when merging some branches, and
to discard weights when appropriate. Still more to do (and a new TODO), but
it's a start!

llvm-svn: 147286
2011-12-27 04:31:52 +00:00
Nick Lewycky 8d302df4a4 Update the branch weight metadata when reversing the order of a branch.
llvm-svn: 147280
2011-12-26 20:54:14 +00:00
Chandler Carruth 8b7e71ffd6 Add an explicit test that we now fold cttz.i32(..., true) >> 5 -> 0.
This is a result of Benjamin's work on ValueTracking.

llvm-svn: 147259
2011-12-24 22:34:15 +00:00
Benjamin Kramer b16bd77bd2 InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add.
This was intended to undo the sub canonicalization in cases where it's not profitable, but it also
finds some cases on it's own.

llvm-svn: 147256
2011-12-24 17:31:53 +00:00
Benjamin Kramer 4ee5747fdd ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero.
unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }
now compiles into a single "bsrl" instruction on x86.

llvm-svn: 147255
2011-12-24 17:31:46 +00:00
Benjamin Kramer 010337c838 InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n.
This has the obvious advantage of being commutable and is always a win on x86 because
const - x wastes a register there. On less weird architectures this may lead to
a regression because other arithmetic doesn't fuse with it anymore. I'll address that
problem in a followup.

llvm-svn: 147254
2011-12-24 17:31:38 +00:00
Nick Lewycky 854c869c36 Move this test from date-name to feature-name, and port it to FileCheck.
llvm-svn: 147223
2011-12-23 18:41:31 +00:00
Chad Rosier 388769427d Reinstate r146578; it doesn't appear to be the cause of some recent execution-
time regressions.  In general, it is beneficial to compile-time.

Original commit message:
Fix for bug #11429: Wrong behaviour for switches. Small improvement for code
size heuristics.

llvm-svn: 147175
2011-12-22 21:06:36 +00:00
Benjamin Kramer f1fd6e394d Give string constants generated by IRBuilder private linkage.
Fixes PR11640.

llvm-svn: 147144
2011-12-22 14:22:14 +00:00
Chad Rosier 1b7e2baf47 Speculatively revert r146578 to determine if it is the cause of a number of
performance regressions (both execution-time and compile-time) on our
nightly testers.

Original commit message:
Fix for bug #11429: Wrong behaviour for switches. Small improvement for code
size heuristics.

llvm-svn: 147131
2011-12-22 02:40:57 +00:00
Nick Lewycky b4039f633c Make some intrinsics safe to speculatively execute.
llvm-svn: 147036
2011-12-21 05:52:02 +00:00
Andrew Trick a34a8c45b4 Unit test for r146950: LSR postinc expansion, PR11571.
llvm-svn: 146951
2011-12-20 01:43:20 +00:00
Joerg Sonnenberger d6cb7649d8 Allow inlining of functions with returns_twice calls, if they have the
attribute themselve.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Kevin Enderby 8b3deabd2d Revert r146822 at Pete Cooper's request as it broke clang self hosting.
Hope I did this correctly :)

llvm-svn: 146834
2011-12-17 19:48:52 +00:00
Pete Cooper eadf124d2b SimplifyCFG now predicts some conditional branches to true or false depending on previous branch on same comparison operands.
For example, 

if (a == b) {
    if (a > b) // this is false
    
Fixes some of the issues on <rdar://problem/10554090>

llvm-svn: 146822
2011-12-17 06:32:38 +00:00
Pete Cooper b33c297f14 Added InstCombine for "select cond, ~cond, x" type patterns
These can be reduced to "~cond & x" or "~cond | x"

llvm-svn: 146624
2011-12-15 00:56:45 +00:00
Eli Friedman 16ad2905a3 Make loop preheader insertion in LoopSimplify handle the case where the loop header is a landing pad correctly (by splitting the landingpad out of the loop header). Make some adjustments to the rest of LoopSimplify to make it clear that the rest of LoopSimplify isn't making bad assumptions about the presence of landing pads. PR11575.
llvm-svn: 146621
2011-12-15 00:50:34 +00:00
Dan Gohman 75d7d5e988 Move Instruction::isSafeToSpeculativelyExecute out of VMCore and
into Analysis as a standalone function, since there's no need for
it to be in VMCore. Also, update it to use isKnownNonZero and
other goodies available in Analysis, making it more precise,
enabling more aggressive optimization.

llvm-svn: 146610
2011-12-14 23:49:11 +00:00
Andrew Trick e0ced62119 LSR: Fold redundant bitcasts on-the-fly.
llvm-svn: 146597
2011-12-14 22:07:19 +00:00
Stepan Dyatkovskiy d7b2bb3bdd Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics.
llvm-svn: 146578
2011-12-14 19:19:17 +00:00
Dan Gohman bd944b4153 It turns out that clang does use pointer-to-function types to
point to ARC-managed pointers sometimes. This fixes rdar://10551239.

llvm-svn: 146577
2011-12-14 19:10:53 +00:00
Joerg Sonnenberger 45c4164166 Only replace fwrite with fputc, if the return value is unused.
llvm-svn: 146411
2011-12-12 20:18:31 +00:00
Chandler Carruth 6b0e34c445 Manually upgrade the test suite to specify the flag to cttz and ctlz.
I followed three heuristics for deciding whether to set 'true' or
'false':

- Everything target independent got 'true' as that is the expected
  common output of the GCC builtins.
- If the target arch only has one way of implementing this operation,
  set the flag in the way that exercises the most of codegen. For most
  architectures this is also the likely path from a GCC builtin, with
  'true' being set. It will (eventually) require lowering away that
  difference, and then lowering to the architecture's operation.
- Otherwise, set the flag differently dependending on which target
  operation should be tested.

Let me know if anyone has any issue with this pattern or would like
specific tests of another form. This should allow the x86 codegen to
just iteratively improve as I teach the backend how to differentiate
between the two forms, and everything else should remain exactly the
same.

llvm-svn: 146370
2011-12-12 11:59:10 +00:00
Andrew Trick d04d152998 Add -unroll-runtime for unrolling loops with run-time trip counts.
Patch by Brendon Cahoon!

This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.

llvm-svn: 146245
2011-12-09 06:19:40 +00:00
Nick Lewycky fe970725cc Fix infinite loop in DSE when deleting a free in a reachable loop that's also
trivially infinite.

llvm-svn: 146197
2011-12-08 22:36:35 +00:00
Andrew Trick 5df9096584 LSR: prune undesirable formulae early.
It's always good to prune early, but formulae that are unsatisfactory
in their own right need to be removed before running any other pruning
heuristics. We easily avoid generating such formulae, but we need them
as an intermediate basis for forming other good formulae.

llvm-svn: 145906
2011-12-06 03:13:31 +00:00
Chad Rosier 8abf65a130 Probably not a good idea to convert a single vector load into a memcpy. We
don't do this now, but add a test case to prevent this from happening in the
future.
Additional test for rdar://9892684

llvm-svn: 145879
2011-12-06 00:19:08 +00:00
Chad Rosier 19446a07a7 Make the MemCpyOptimizer a bit more aggressive. I can't think of a scenerio
where this would be bad as the backend shouldn't have a problem inlining small
memcpys.
rdar://10510150

llvm-svn: 145865
2011-12-05 22:37:00 +00:00
Nadav Rotem 3924cb0267 Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Pete Cooper e03fe83d98 Fixed deadstoreelimination bug where negative indices were incorrectly causing the optimisation to occur
Turns out long long + unsigned long long is unsigned.  Doh!

Fixes http://llvm.org/bugs/show_bug.cgi?id=11455

llvm-svn: 145731
2011-12-03 00:04:30 +00:00
Chad Rosier 0155a63513 Add support for constant folding the pow intrinsic.
rdar://10514247

llvm-svn: 145730
2011-12-03 00:00:03 +00:00
Chad Rosier 3367123b12 Prevent library calls from being folded if -fno-builtin has been specified.
rdar://10500969

llvm-svn: 145639
2011-12-01 22:14:50 +00:00
Pete Cooper fdddc27143 Improved fix for abs(val) != 0 to check other similar case. Also fixed style issues and confusing comment
llvm-svn: 145618
2011-12-01 19:13:26 +00:00
Pete Cooper 3b7f35bf08 Removed use of grep from test and moved it to be with other icmp tests
llvm-svn: 145570
2011-12-01 04:35:26 +00:00
Pete Cooper bc5c524b71 Added instcombine pattern to spot comparing -val or val against 0.
(val != 0) == (-val != 0) so "abs(val) != 0" becomes "val != 0"

Fixes <rdar://problem/10482509>

llvm-svn: 145563
2011-12-01 03:58:40 +00:00
Andrew Trick 613c67e475 Better test case found in duplicate PR10570.
llvm-svn: 145484
2011-11-30 06:26:42 +00:00
Andrew Trick ceafa2c746 LSR: handle the expansion of phi operands that use postinc forms of the IV.
Fixes PR11431: SCEVExpander::expandAddRecExprLiterally(const llvm::SCEVAddRecExpr*): Assertion `(!isa<Instruction>(Result) || SE.DT->dominates(cast<Instruction>(Result), Builder.GetInsertPoint())) && "postinc expansion does not dominate use"' failed.

llvm-svn: 145482
2011-11-30 06:07:54 +00:00
Chad Rosier 82e1bd8e94 Add support for sqrt, sqrtl, and sqrtf in TargetLibraryInfo. Disable
(fptrunc (sqrt (fpext x))) -> (sqrtf x) transformation if -fno-builtin is 
specified.
rdar://10466410

llvm-svn: 145460
2011-11-29 23:57:10 +00:00
Duncan Sands ca6f8ddbf8 Fix a theoretical problem (not seen in the wild): if different instances of a
weak variable are compiled by different compilers, such as GCC and LLVM, while
LLVM may increase the alignment to the preferred alignment there is no reason to
think that GCC will use anything more than the ABI alignment.  Since it is the
GCC version that might end up in the final program (as the linkage is weak), it
is wrong to increase the alignment of loads from the global up to the preferred
alignment as the alignment might only be the ABI alignment.

Increasing alignment up to the ABI alignment might be OK, but I'm not totally
convinced that it is.  It seems better to just leave the alignment of weak
globals alone.

llvm-svn: 145413
2011-11-29 18:26:38 +00:00
Andrew Trick e756031a62 Reenable this IndVars unit test.
SCEV can't optimize undef in all cases, which is a separate issue from this test case.

llvm-svn: 145343
2011-11-29 00:52:04 +00:00
Eli Friedman b3f9b0676a Add a missing safety check to ProcessUGT_ADDCST_ADD. Fixes PR11438.
llvm-svn: 145316
2011-11-28 23:32:19 +00:00
Eli Friedman e7ab1a2f0f Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Chris Lattner 251d827d2c remove a test that is using old-style llvm.dbg intrinsics, apparently only
fails on ppc and arm hosts.

llvm-svn: 145188
2011-11-27 18:13:47 +00:00
Chris Lattner ee471c484a remove autoupgrade support for old forms of llvm.prefetch and the old
trampoline forms.  Both of these were correct in LLVM 3.0, and we don't
need to support LLVM 2.9 and earlier in mainline.

llvm-svn: 145174
2011-11-27 07:42:04 +00:00
Chris Lattner 6a144a2227 Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
llvm-svn: 145171
2011-11-27 06:54:59 +00:00
Chris Lattner 90ef78c07f remove autoupgrade support for really old-style debug info intrinsics.
I think this is the last of autoupgrade that can be removed in 3.1.
Can the atomic upgrade stuff also go?

llvm-svn: 145169
2011-11-27 06:18:33 +00:00
Chandler Carruth f156f0cf57 FileCheck-ize this test and make it more precise. This is in preparation
for adding other tests.

llvm-svn: 145143
2011-11-26 08:24:25 +00:00
Richard Smith 4f9a8081c3 Correctly byte-swap APInts with bit-widths greater than 64.
llvm-svn: 145111
2011-11-23 21:33:37 +00:00
Duncan Sands 81a2af12d6 Fix a crash in which a multiplication was being reported as being both negative
and positive: positive, because it could be directly computed to be positive;
negative, because the nsw flags means it is either negative or undefined (the
multiplication always overflowed).

llvm-svn: 145104
2011-11-23 16:26:47 +00:00
Nick Lewycky 063ae5897c Fix crasher in GVN due to my recent capture tracking changes.
llvm-svn: 145047
2011-11-21 19:42:56 +00:00
Benjamin Kramer 650c09aa4d XFAIL this test until I figure out what indvars is doing here (or find someone who does)
llvm-svn: 145008
2011-11-20 11:10:03 +00:00
Andrew Trick 6b4d578f54 Fix a corner case in updating LoopInfo after fully unrolling an outer loop.
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.

Fixes PR11335: loop unroll update.

llvm-svn: 144970
2011-11-18 03:42:41 +00:00
Andrew Trick 949045864d Fix an overly general check in SimplifyIndvar to handle useless phi cycles.
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.

Fixes PR11350: runaway iteration assertion.

llvm-svn: 144935
2011-11-17 23:36:35 +00:00
Eli Friedman 489c0ff4a4 Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.

Fixes <rdar://problem/9815881>.

llvm-svn: 144876
2011-11-17 01:27:36 +00:00
Nick Lewycky db408f1857 Fix typo in test.
llvm-svn: 144774
2011-11-16 03:56:38 +00:00
Nick Lewycky c7f1e7993c Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when
looking at the size of the pointee. Fixes PR11390!

llvm-svn: 144773
2011-11-16 03:49:48 +00:00
Andrew Trick 90c7a108ca Fix SCEV overly optimistic back edge taken count for multi-exit loops.
Fixes PR11375: Different results for 'clang++ huh.cpp'...

llvm-svn: 144746
2011-11-16 00:52:40 +00:00
Nick Lewycky 7013a19e8a Refactor capture tracking (which already had a couple flags for whether returns
and stores capture) to permit the caller to see each capture point and decide
whether to continue looking.

Use this inside memdep to do an analysis that basicaa won't do. This lets us
solve another devirtualization case, fixing PR8908!

llvm-svn: 144580
2011-11-14 22:49:42 +00:00
Nick Lewycky d48ab84556 Don't try to loop on iterators that are potentially invalidated inside the loop. Fixes PR11361!
llvm-svn: 144454
2011-11-12 03:09:12 +00:00
Eli Friedman ecb453805d Make sure scalarrepl picks the correct alloca when it rewrites a bitcast. Fixes PR11353.
llvm-svn: 144442
2011-11-12 02:07:50 +00:00
Eli Friedman 0a309292c4 Get rid of an optimization in SCCP which appears to have many issues. Specifically, it doesn't handle many cases involving undef correctly, and it is missing other checks which
lead to it trying to re-mark a value marked as a constant with a different value.  It also appears to trigger very rarely.

Fixes PR11357.

llvm-svn: 144352
2011-11-11 01:16:15 +00:00
Pete Cooper 856977cb15 DeadStoreElimination can now trim the size of a store if the end of the store is dead.
Currently checks alignment and killing stores on a power of 2 boundary as this is likely
to trim the size of the earlier store without breaking large vector stores into scalar ones.

Fixes <rdar://problem/10140300>

llvm-svn: 144239
2011-11-09 23:07:35 +00:00
Eli Friedman 0bae8b2cfb Fix code to match comment. Fixes PR11340, a regression from r143209.
llvm-svn: 144121
2011-11-08 21:08:02 +00:00
Pete Cooper 9ee220915b LICM pass now understands invariant load metadata. Nothing generates this yet so it will currently never get used in real tests
llvm-svn: 144107
2011-11-08 19:30:00 +00:00
Bill Wendling 2a917595d2 Convert to the new EH model.
llvm-svn: 144050
2011-11-08 00:23:01 +00:00
Nick Lewycky f2905afe62 Do simple cross-block DSE when we encounter a free statement. Fixes PR11240.
llvm-svn: 143808
2011-11-05 10:48:42 +00:00
Dan Gohman ce3d6248b2 Add tests for existing InstSimplify features.
llvm-svn: 143721
2011-11-04 18:39:16 +00:00
Dan Gohman 85977e6ab4 Teach instsimplify to simplify calls to undef.
llvm-svn: 143719
2011-11-04 18:32:42 +00:00
Daniel Dunbar e6d40de414 Speculatively revert "DeadStoreElimination can now trim the size of a store if
the end of it is dead.", which appears to break bootstrapping LLVM.

llvm-svn: 143668
2011-11-04 00:48:26 +00:00
Pete Cooper 8a95aedb5d DeadStoreElimination can now trim the size of a store if the end of it is dead.
Only currently done if the later store is writing to a power of 2 address or 
has the same alignment as the earlier store as then its likely to not break up
large stores into smaller ones

Fixes <rdar://problem/10140300>

llvm-svn: 143630
2011-11-03 18:01:56 +00:00
Andrew Trick c2c79c90f2 Rewrite LinearFunctionTestReplace to handle pointer-type IVs.
We've been hitting asserts in this code due to the many supported
combintions of modes (iv-rewrite/no-iv-rewrite) and IV types. This
second rewrite of the code attempts to deal with these cases systematically.

llvm-svn: 143546
2011-11-02 17:19:57 +00:00
Andrew Trick 0dae890346 Broaden an assert to handle enable-iv-rewrite=true following r143183.
Narrowest possible fix for PR11279.

llvm-svn: 143522
2011-11-02 00:02:45 +00:00
Eli Friedman a49b828f8f Make sure we use the right insertion point when instcombine replaces a PHI with another instruction. (Specifically, don't insert an arbitrary instruction before a PHI.) Fixes PR11275.
llvm-svn: 143437
2011-11-01 04:49:29 +00:00
Duncan Sands 3d5692a475 Reapply commit 143214 with a fix: m_ICmp doesn't match conditions
with the given predicate, it matches any condition and returns the
predicate - d'oh!  Original commit message:
The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false.
Spotted by my super-optimizer in 186.crafty and 450.soplex.  We really
need a proper infrastructure for handling generalizations of this kind
of thing (which occur a lot), however this case is so simple that I decided
to go ahead and implement it directly.

llvm-svn: 143318
2011-10-30 19:56:36 +00:00
Benjamin Kramer 594ee77964 SimplifyLibCalls: Use IRBuilder.CreateGlobalString when creating a string for printf->puts, which correctly sets the unnamed_addr bit on the resulting GlobalVariable.
Fixes PR11264.

llvm-svn: 143289
2011-10-29 19:43:31 +00:00
Eli Friedman 3af3c046a9 Revert r143214; it's breaking a bunch of stuff.
llvm-svn: 143265
2011-10-29 00:56:07 +00:00
Duncan Sands 280bc553b3 The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false.
Spotted by my super-optimizer in 186.crafty and 450.soplex.  We really
need a proper infrastructure for handling generalizations of this kind
of thing (which occur a lot), however this case is so simple that I decided
to go ahead and implement it directly.

llvm-svn: 143214
2011-10-28 19:01:20 +00:00
Duncan Sands 985ba6386d A shift of a power of two is a power of two or zero.
For completeness - not spotted in the wild.

llvm-svn: 143211
2011-10-28 18:30:05 +00:00
Duncan Sands 92af0a8a7f Fold icmp ugt (udiv X, Y), X to false. Spotted by my super-optimizer
in 186.crafty.

llvm-svn: 143209
2011-10-28 18:17:44 +00:00
Andrew Trick effdca9441 LFTR should avoid a type mismatch with null pointer IVs.
Fixes rdar://10359193 Indvar LinearFunctionTestReplace assertion

llvm-svn: 143183
2011-10-28 03:45:11 +00:00
Duncan Sands 7cb61e5a0e Reapply commit 143028 with a fix: the problem was casting a ConstantExpr Mul
using BinaryOperator (which only works for instructions) when it should have
been a cast to OverflowingBinaryOperator (which also works for constants).
While there, correct a few other dubious looking uses of BinaryOperator.
Thanks to Chad Rosier for the testcase.  Original commit message:
My super-optimizer noticed that we weren't folding this expression to
true: (x *nsw x) sgt 0, where x = (y | 1).  This occurs in 464.h264ref.

llvm-svn: 143125
2011-10-27 19:16:21 +00:00
Bob Wilson 1455ce27e4 Revert Duncan's r143028 expression folding which appears to be the culprit
behind a compile failure on 483.xalancbmk.

llvm-svn: 143102
2011-10-27 15:47:25 +00:00
Eli Friedman 73beaf7bbc It is not safe to sink an alloca into a stacksave/stackrestore pair, so don't do that. <rdar://problem/10352360>
llvm-svn: 143093
2011-10-27 01:33:51 +00:00
Duncan Sands ba286d7c73 The maximum power of 2 dividing a power of 2 is itself. This occurs
in 403.gcc and was spotted by my super-optimizer.

llvm-svn: 143054
2011-10-26 20:55:21 +00:00
Duncan Sands 1d2bb9882d My super-optimizer noticed that we weren't folding this expression to
true: (x *nsw x) sgt 0, where x = (y | 1).  This occurs in 464.h264ref.

llvm-svn: 143028
2011-10-26 15:31:51 +00:00
Nick Lewycky dd1d3df524 A dead malloc, a free(NULL) and a free(undef) are all trivially dead
instructions.

This doesn't introduce any optimizations we weren't doing before (except
potentially due to pass ordering issues), now passes will eliminate them sooner
as part of their own cleanups.

llvm-svn: 142787
2011-10-24 04:35:36 +00:00
Cameron Zwarich 057fbb1a10 The element insertion code in scalar replacement doesn't handle incorrect
element types, even though the element extraction code does. It is surprising
that this bug has been here for so long. Fixes <rdar://problem/10318778>.

llvm-svn: 142740
2011-10-23 07:02:10 +00:00
Nick Lewycky 52340ac5f8 Oops! Fix test I forgot to submit as part of r142735.
llvm-svn: 142736
2011-10-22 22:07:31 +00:00
Nick Lewycky 32f8051d66 A non-escaping malloc in the entry block is not unlike an alloca. Do dead-store
elimination on them too.

llvm-svn: 142735
2011-10-22 21:59:35 +00:00
Eli Friedman 688db1d6d0 Remap blockaddress correctly when inlining a function. Fixes PR10162.
llvm-svn: 142684
2011-10-21 20:45:19 +00:00
Eli Friedman ce818277fc Extend instcombine's shufflevector simplification to handle more cases where the input and output vectors have different sizes. Patch by Xiaoyi Guo.
llvm-svn: 142671
2011-10-21 19:06:29 +00:00
Eli Friedman 1923a330e6 Refactor code from inlining and globalopt that checks whether a function definition is unused, and enhance it so it can tell that functions which are only used by a blockaddress are in fact dead. This probably doesn't happen much on most code, but the Linux kernel's _THIS_IP_ can trigger this issue with blockaddress. (GlobalDCE can also handle the given tescase, but we only run that at -O3.) Found while looking at PR11180.
llvm-svn: 142572
2011-10-20 05:23:42 +00:00
Nick Lewycky 462098824f "@string = constant i8 0" is a value i8* string of length zero. Analyze that
correctly in GetStringLength, fixing PR11181!

llvm-svn: 142558
2011-10-20 00:34:35 +00:00
Dan Gohman a7107f992e Teach the ARC optimizer about the !clang.arc.copy_on_escape metadata
tag on objc_retainBlock calls, which indicates that they may be
optimized away. rdar://10211286.

llvm-svn: 142298
2011-10-17 22:53:25 +00:00
Lang Hames e7594abd87 Fixed quoting on default data layout option.
llvm-svn: 142286
2011-10-17 21:54:43 +00:00
Bill Wendling c68c8cb8d4 Add support for the Objective-C personality function to the instruction
combining of the landingpad instruction. The ObjC personality function acts
almost identically to the C++ personality function. In particular, it uses
"null" as a "catch-all" value.

llvm-svn: 142256
2011-10-17 21:20:24 +00:00
Dan Gohman 1736c14b85 Suppress partial retain+release elimination when there's a
possibility that it will span multiple CFG diamonds/triangles which
could have different controlling predicates.  rdar://10282956

llvm-svn: 142222
2011-10-17 18:48:25 +00:00
Bill Wendling 63a4ea1859 Correct over-zealous removal of hack.
Some code want to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.

llvm-svn: 142221
2011-10-17 18:43:40 +00:00
Bill Wendling 42cf65fe51 Temporarily XFAIL waiting for a fix.
llvm-svn: 142215
2011-10-17 18:25:32 +00:00
Chandler Carruth 3e8aa65bc2 Add a routine to swap branch instruction operands, and update any
profile metadata at the same time. Use it to preserve metadata attached
to a branch when re-writing it in InstCombine.

Add metadata to the canonicalize_branch InstCombine test, and check that
it is tranformed correctly.

Reviewed by Nick Lewycky!

llvm-svn: 142168
2011-10-17 01:11:57 +00:00
Nick Lewycky 84baea77ea Oops! Fix testcase.
llvm-svn: 142151
2011-10-16 20:20:15 +00:00
Nick Lewycky 0a7e9ccf04 When looking for dependencies on the src pointer, scan the src pointer. Scanning
on the memcpy call will pull up other unrelated stuff. Fixes PR11142.

llvm-svn: 142150
2011-10-16 20:13:32 +00:00
Andrew Trick fd4ca0f4ac Fix SCEVExpander assert during LSR: "argument of incompatible type".
Just because we're dealing with a GEP doesn't mean we can assert the
SCEV has a pointer type. The fix is simply to ignore the SCEV pointer
type, which we really didn't need.
Fixes PR11138 webkit crash.

llvm-svn: 142058
2011-10-15 06:19:55 +00:00
Andrew Trick 870c1a3f15 Reapply r141870, SCEV expansion of post-inc.
Speculatively reapply to see if this test case still crashes on
linux. I may have fixed it in my last checkin.

llvm-svn: 141895
2011-10-13 21:55:29 +00:00
Andrew Trick 41c253c35c Revert r141870. The test case crashes on linux with data corruption. A deeper issue was exposed.
llvm-svn: 141873
2011-10-13 17:58:24 +00:00
Andrew Trick e15d6e14e3 LSR: Reuse the post-inc expansion of expressions.
This avoids unnecessary expansion of expressions and allows the SCEV
expander to work on expression DAGs, not just trees.
Fixes PR11090.

llvm-svn: 141870
2011-10-13 17:31:47 +00:00
Lang Hames 850f7b3cdc Removed colons from some target datalayout strings in test, since they don't match the required format.
llvm-svn: 141825
2011-10-12 22:24:17 +00:00
Cameron Zwarich 1a761dcfbd Fix PR11106 by correcting a typo that has been in the code for over a year. This
would have never worked, since the element type of a vector type is never a
vector type. Also fix the conditional to be more direct in checking whether
EltTy is a vector type.

llvm-svn: 141713
2011-10-11 21:26:40 +00:00
Cameron Zwarich ab3a9b3baf Add a test for PR10565.
llvm-svn: 141647
2011-10-11 06:10:37 +00:00
Cameron Zwarich d7515ccc47 Remove a lot of the fancy scalar replacement code for dealing with llvm-gcc's
lowering of NEON code. It provides little-to-no benefit now and only introduces
additional complexity.

llvm-svn: 141646
2011-10-11 06:10:30 +00:00
Andrew Trick f9201c572e Move replaceCongruentIVs into SCEVExapander and bias toward "expanded"
IVs.

Indvars previously chose randomly between congruent IVs. Now it will
bias the decision toward IVs that SCEVExpander likes to create. This
was not done to fix any problem, it's just a welcome side effect of
factoring code.

llvm-svn: 141633
2011-10-11 02:28:51 +00:00
Lang Hames 44c78f809b Added a testcase for r141599, rdar://problem/10063881.
llvm-svn: 141628
2011-10-11 01:32:10 +00:00
Andrew Trick ce0cb3a101 Unit test for LSR phi reuse in r141442.
llvm-svn: 141472
2011-10-08 02:34:51 +00:00
Duncan Sands c52af46484 Teach GVN to also propagate switch cases. For example, in this code
switch (n) {
    case 27:
      do_something(x);
    ...
  }
the call do_something(x) will be replaced with do_something(27).  In
gcc-as-one-big-file this results in the removal of about 500 lines of
bitcode (about 0.02%), so has about 1/10 of the effect of propagating
branch conditions.

llvm-svn: 141360
2011-10-07 08:29:06 +00:00
Eli Friedman 3e3aecbc2c PR11061: Make simplifylibcalls fold strcmp("", x) correctly.
While I'm here, fix the related issue with strncmp, add some actual tests for strcmp and strncmp, and start using StringRef::compare for constant folding instead of using strcmp/strncmp so that the optimized IR isn't dependent on the host's implementation of strcmp.

llvm-svn: 141227
2011-10-05 22:27:16 +00:00
Jim Grosbach 8f9acfac89 Revert 141203. InstCombine is looping on unit tests.
llvm-svn: 141209
2011-10-05 20:44:29 +00:00
Rafael Espindola 79d0c4f4b0 Check for the returns_twice attribute in callsFunctionThatReturnsTwice. This
fixes PR11038, but there are still some cleanups to be done.

llvm-svn: 141204
2011-10-05 20:05:13 +00:00
Jim Grosbach e37e030137 Update InstCombine worklist after instruction transform is complete.
When updating the worklist for InstCombine, the Add/AddUsersToWorklist
functions may access the instruction(s) being added, for debug output for
example. If the instructions aren't yet added to the basic block, this
can result in a crash. Finish the instruction transformation before
adjusting the worklist instead.

rdar://10238555

llvm-svn: 141203
2011-10-05 20:05:00 +00:00