Commit Graph

3328 Commits

Author SHA1 Message Date
Dan Gohman f68d29edd5 Fix EnforceKnownAlignment so that it doesn't ever reduce the alignment
of an alloca or global variable.

llvm-svn: 64693
2009-02-16 23:02:21 +00:00
Dan Gohman 136aa1fb96 Delete this long-commented-out code. The situation it seems to have
been written for is no longer relevant with the elimination of
signed and unsigned types.

llvm-svn: 64625
2009-02-16 02:57:42 +00:00
Dan Gohman 9cdfd44521 Change these tests to use regular loads instead of llvm.x86.sse2.loadu.dq.
Enhance instcombine to use the preferred field of
GetOrEnforceKnownAlignment in more cases, so that regular IR operations are
optimized in the same way that the intrinsics currently are.

llvm-svn: 64623
2009-02-16 00:44:23 +00:00
Nick Lewycky 8f4a097f15 Update the list of function annotations for nocapture. All of these came up
when I was looking at functions used by python.

Highlights include, better largefile support (64-bit file sizes on 32-bit
systems), fputs string is nocapture, popen/pclose added (popen being noalias
return), modf and frexp and friends. Also added some missing 'break' statements
and combined identical sections.

llvm-svn: 64615
2009-02-15 22:47:25 +00:00
Evan Cheng e79841adbb Fix pr3571: If stride is a value defined by an instruction, make sure it dominates the loop preheader. When IV users are strength reduced, the stride is inserted into the preheader. It could create a use before def situation.
llvm-svn: 64579
2009-02-15 06:06:15 +00:00
Evan Cheng fe151ba135 ifdef out unneeded if statement.
llvm-svn: 64575
2009-02-15 03:20:37 +00:00
Dan Gohman 671f2c085f Extend the IndVarSimplify support for promoting induction variables:
- Test for signed and unsigned wrapping conditions, instead of just
   testing for non-negative induction ranges. 
 - Handle loops with GT comparisons, in addition to LT comparisons.
 - Support more cases of induction variables that don't start at 0.

llvm-svn: 64532
2009-02-14 02:31:09 +00:00
Dan Gohman 47ff6aad23 Clarify debug output.
llvm-svn: 64531
2009-02-14 02:26:50 +00:00
Dan Gohman 4bfa1d4c63 Simplify some code. hasComputableLoopEvolution is overkill in this case.
No functionality change.

llvm-svn: 64530
2009-02-14 02:25:19 +00:00
Dan Gohman 55ea72179c In CodeGenPrepare's debug output, use WriteAsOperand instead of
printing getName(), so that unnamed values are printed correctly.

llvm-svn: 64468
2009-02-13 17:45:12 +00:00
Dan Gohman a2730abaaa Complete the sentance in this comment. I have reservations
about the code it describes, but at least now the comment
is right.

llvm-svn: 64465
2009-02-13 17:36:42 +00:00
Nick Lewycky d234a845f9 Mark strto* as readonly when the endptr is null.
llvm-svn: 64460
2009-02-13 17:08:33 +00:00
Nick Lewycky a0e83a0952 On strtod and friends, mark 'endptr' nocapture in the function prototype, and
mark the first argument nocapture if endptr=NULL for each particular call.

llvm-svn: 64453
2009-02-13 15:31:46 +00:00
Dan Gohman f71a473720 Fix the code that checked if a SCEVAddRecExpr Start contains an
addrec in a different loop to check the value being added to
the accumulated Start value, not the Start value before it has
the new value added to it. This prevents LSR from going crazy
on the included testcase. Dale, please review.

llvm-svn: 64440
2009-02-13 03:58:31 +00:00
Dan Gohman ba83228cdb Fix LSR's IV sorting function to explicitly sort by bitwidth
after sorting by stride value. This prevents it from missing
IV reuse opportunities in a host-sensitive manner.

llvm-svn: 64415
2009-02-13 00:26:43 +00:00
Dan Gohman eb6be650ce Teach IndVarSimplify to optimize code using the C "int" type for
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.

Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.

This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.

Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.

llvm-svn: 64407
2009-02-12 22:19:27 +00:00
Dan Gohman 656b097b8a Add a utility function to LoopInfo to return the exit block
when the loop has exactly one exit, and make use of it in
LoopIndexSplit.

llvm-svn: 64388
2009-02-12 18:08:24 +00:00
Dan Gohman e0d32c490a This code doesn't actually use the ExitingBlocks list.
llvm-svn: 64376
2009-02-12 16:36:26 +00:00
Chris Lattner 096f44de61 improve naming of values in GVN, patch by Jay Foad!
llvm-svn: 64363
2009-02-12 07:00:35 +00:00
Chris Lattner 5297c63565 fix PR3537: if resetting bbi back to the start of a block, we need to
forget about already inserted expressions.

llvm-svn: 64362
2009-02-12 06:56:08 +00:00
Nick Lewycky b92c4d72a7 Don't mark all args to strtod and friends as nocapture.
llvm-svn: 64352
2009-02-12 03:18:34 +00:00
Nate Begeman 318aea93bf the two non-mask arguments to a shufflevector must be the same width, but they do not have to be the same
width as the result value.

llvm-svn: 64335
2009-02-11 22:36:25 +00:00
Devang Patel da1a632a87 Use early exits. Reduce indentation.
llvm-svn: 64226
2009-02-10 19:28:07 +00:00
Devang Patel caf4485781 Enable scalar replacement of AllocaInst whose one of the user is dbg info.
llvm-svn: 64207
2009-02-10 07:00:59 +00:00
Dale Johannesen cd19967754 Fix PR 3471, and some cleanups.
llvm-svn: 64177
2009-02-09 22:14:15 +00:00
Bill Wendling 415515077b Mistakenly turned this on.
llvm-svn: 64065
2009-02-08 01:32:00 +00:00
Bill Wendling 5469ec1072 Revert r63999. It was breaking self-hosting builds.
llvm-svn: 64062
2009-02-08 00:58:05 +00:00
Mon P Wang 21eb52a74f Instrcombine should not change load(cast p) to cast(load p) if the cast
changes the address space of the pointer.

llvm-svn: 64035
2009-02-07 22:19:29 +00:00
Mike Stump f009a51794 Insert space to avoid warning and make code more readable.
llvm-svn: 64003
2009-02-07 03:36:02 +00:00
Devang Patel 7cb8df4ce7 Ignore DbgInfoIntrinsics.
llvm-svn: 63923
2009-02-06 06:19:06 +00:00
Chris Lattner bbbb74372b fix PR3489, use bits instead of bytes.
llvm-svn: 63916
2009-02-06 04:34:07 +00:00
Devang Patel 409b794cfe Ignore dbg intrinsics while propagating conditional expression info. Take 2.
llvm-svn: 63898
2009-02-05 23:32:52 +00:00
Devang Patel 02f58e1e8d Revert rev. 63876. It is causing llvm-gcc bootstrap failure.
llvm-svn: 63888
2009-02-05 21:46:41 +00:00
Devang Patel 58cb603d2a Remove dead blocks in the end.
llvm-svn: 63880
2009-02-05 19:59:42 +00:00
Devang Patel 5922e26d1a Ignore dbg intrinsics while propagating conditional expression info.
llvm-svn: 63876
2009-02-05 19:15:39 +00:00
Devang Patel 43a1161379 If "optimize for size" attribute is set then block non-trivial loop unswitches but allow trivial loop unswitches.
llvm-svn: 63670
2009-02-03 22:04:27 +00:00
Chris Lattner ef37dc8511 teach "convert from scalar" to handle loads of fca's.
llvm-svn: 63659
2009-02-03 21:08:45 +00:00
Chris Lattner f5df53cb46 refactor the interface to ConvertUsesOfLoadToScalar,
renaming it to ConvertScalar_ExtractValue

llvm-svn: 63658
2009-02-03 21:01:03 +00:00
Chris Lattner 576baa4adf convert ConvertUsesOfLoadToScalar to use IRBuilder,
no functionality change.

llvm-svn: 63652
2009-02-03 19:45:44 +00:00
Chris Lattner c1fb96d347 switch ConvertScalar_InsertValue to use an IRBuilder, no
functionality change.

llvm-svn: 63651
2009-02-03 19:41:50 +00:00
Chris Lattner 18f56c295c make scalar conversion handle stores of first class
aggregate values.  loads are not yet handled (coming
soon to an sroa near you).

llvm-svn: 63649
2009-02-03 19:30:11 +00:00
Chris Lattner 73eff2e6e8 Make SROA produce a vector only when the alloca is actually
accessed at least once as a vector.  This prevents it from
compiling the example in not-a-vector into:

define double @test(double %A, double %B) {
	%tmp4 = insertelement <7 x double> undef, double %A, i32 0
	%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
	%tmp2 = extractelement <7 x double> %tmp, i32 4
	ret double %tmp2
}

instead, producing the integer code.  Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.

llvm-svn: 63638
2009-02-03 18:15:05 +00:00
Evan Cheng 8542caa3f7 APInt'fy SimplifyDemandedVectorElts so it can analyze vectors with more than 64 elements.
llvm-svn: 63631
2009-02-03 10:05:09 +00:00
Chris Lattner 80810b4c2d add another case of undefined behavior without crashing, PR3466.
llvm-svn: 63620
2009-02-03 07:08:57 +00:00
Chris Lattner 6aa6b1f263 Teach ConvertUsesToScalar to handle memset, allowing it to handle
crazy cases like:

struct f {  int A, B, C, D, E, F; };
short test4() {
  struct f A;
  A.A = 1;
  memset(&A.B, 2, 12);
  return A.C;
}

llvm-svn: 63596
2009-02-03 02:01:43 +00:00
Chris Lattner 09b65ab288 rearrange how SRoA handles promotion of allocas to vectors.
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type 
itself.  This allows us to un-xfail 
test/CodeGen/X86/vec_ins_extract.ll.

llvm-svn: 63590
2009-02-03 01:30:09 +00:00
Chris Lattner 43cecd7c26 inline SROA::ConvertToScalar, no functionality change.
llvm-svn: 63544
2009-02-02 20:44:45 +00:00
Chris Lattner 18eba4f211 Fix a bug which caused us to miscompile a couple of Ada
tests.  Thanks for the beautiful reduced testcase Duncan!

llvm-svn: 63529
2009-02-02 18:02:59 +00:00
Duncan Sands 6f361ff345 Fix a comment (bytes -> bits), reformat a comment
and remove trailing whitespace.  No functionality
change.

llvm-svn: 63511
2009-02-02 10:06:20 +00:00
Duncan Sands 33d6e97e33 Fix an obvious thinko.
llvm-svn: 63510
2009-02-02 09:53:14 +00:00
Chris Lattner 1aafe4cece reduce indentation, (~XorCST->getValue()).isSignBit() -> isMaxSignedValue()
llvm-svn: 63500
2009-02-02 07:15:30 +00:00
Nick Lewycky f23908151a Reinstate this optimization to fold icmp of xor when possible. Don't try to
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.

llvm-svn: 63487
2009-01-31 21:30:05 +00:00
Chris Lattner 9e2b9f3234 Fix PR3452 (an infinite loop bootstrapping) by disabling the recent
improvements to the EvaluateInDifferentType code.  This code works 
by just inserted a bunch of new code and then seeing if it is 
useful.  Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point.  Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.

llvm-svn: 63483
2009-01-31 19:05:27 +00:00
Chris Lattner 76a63ed099 now that all the pieces are in place, teach instcombine's
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it.  This allows
it to simplify the code in multi-use-or.ll into a single 'add 
double'.

This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch.  When working on fixing those bugs,
this should be disabled.

llvm-svn: 63481
2009-01-31 08:40:03 +00:00
Chris Lattner 3e2cb66c56 simplify/clarify control flow and improve comments, no functionality change.
llvm-svn: 63480
2009-01-31 08:24:16 +00:00
Chris Lattner 83c6a141b8 make some fairly meaty internal changes to how SimplifyDemandedBits works.
Now, if it detects that "V" is the same as some other value, 
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
   uses and can be simplified in one context, but not all.

#2 isn't implemented yet, this patch should have no functionality change.

llvm-svn: 63479
2009-01-31 08:15:18 +00:00
Chris Lattner 585cfb2ce7 minor cleanups
llvm-svn: 63477
2009-01-31 07:26:06 +00:00
Chris Lattner 94cfb281c3 make sure to set Changed=true when instcombine hacks on the code,
not doing so prevents it from properly iterating and prevents it
from deleting the entire body of dce-iterate.ll

llvm-svn: 63476
2009-01-31 07:04:22 +00:00
Chris Lattner ec99c46d44 Simplify and generalize the SROA "convert to scalar" transformation to
be able to handle *ANY* alloca that is poked by loads and stores of 
bitcasts and GEPs with constant offsets.  Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc.  This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.

One case that is pretty great is that we compile 
2006-11-07-InvalidArrayPromote.ll into:

define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
	%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
	%tmp105 = bitcast <4 x i32> %tmp10 to i128
	%tmp1056 = zext i128 %tmp105 to i256	
	%tmp.upgrd.43 = lshr i256 %tmp1056, 96
	%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32	
	ret i32 %tmp.upgrd.44
}

which turns into:

_func:
	subl	$28, %esp
	cvttps2dq	%xmm1, %xmm0
	movaps	%xmm0, (%esp)
	movl	12(%esp), %eax
	addl	$28, %esp
	ret

Which is pretty good code all things considering :).

One effect of this is that SROA will start generating arbitrary bitwidth 
integers that are a multiple of 8 bits.  In the case above, we got a 
256 bit integer, but the codegen guys assure me that it can handle the 
simple and/or/shift/zext stuff that we're doing on these operations.

This addresses rdar://6532315

llvm-svn: 63469
2009-01-31 02:28:54 +00:00
Chris Lattner df17987c19 Fix some issues with volatility, move "CanConvertToScalar" check
after the others.

llvm-svn: 63227
2009-01-28 20:16:43 +00:00
Duncan Sands 5a913d61e3 Rename getAnalysisToUpdate to getAnalysisIfAvailable.
llvm-svn: 63198
2009-01-28 13:14:17 +00:00
Mon P Wang 3537a62704 Fixed optimization of combining two shuffles where the first shuffle inputs
has a different number of elements than the output.

llvm-svn: 62998
2009-01-26 04:39:00 +00:00
Chris Lattner 9449991c4f Handle single-entry phi nodes gracefully in condprop.
llvm-svn: 62985
2009-01-26 02:18:20 +00:00
Chris Lattner 7b6647c178 Fix PR3408 by making a non-obvious assumption very obvious, and
handling the flaw inherent in that assumption.  :)

llvm-svn: 62984
2009-01-26 02:11:30 +00:00
Chris Lattner 57cb472b56 More cleanups and simplifications, no functionality change.
llvm-svn: 62983
2009-01-26 01:57:01 +00:00
Chris Lattner d67aaa6560 tidy asserts
llvm-svn: 62982
2009-01-26 01:38:24 +00:00
Torok Edwin f4395ea97a testcase for PR3381.
Also it was an empty struct, not a void after all.

llvm-svn: 62920
2009-01-24 17:16:04 +00:00
Torok Edwin 73ff92272f void* is represented as pointer to empty struct {}.
Thus we need to check whether the struct is empty before trying to index into
it. This fixes PR3381.

llvm-svn: 62918
2009-01-24 11:30:49 +00:00
Chris Lattner 72cd68fe64 Make InstCombineStoreToCast handle aggregates more aggressively,
handling the case in Transforms/InstCombine/cast-store-gep.ll, which
is a heavily reduced testcase from Clang on x86-64.

llvm-svn: 62904
2009-01-24 01:00:13 +00:00
Gabor Greif eb61fcf2a1 Simplify the logic of getting hold of a PHI predecessor block.
There is now a direct way from value-use-iterator to incoming block in PHINode's API.
This way we avoid the iterator->index->iterator trip, and especially the costly
getOperandNo() invocation. Additionally there is now an assertion that the iterator
really refers to one of the PHI's Uses.

llvm-svn: 62869
2009-01-23 19:40:15 +00:00
Chris Lattner 77527f5812 Remove uses of uint32_t in favor of 'unsigned' for better
compatibility with cygwin.  Patch by Jay Foad!

llvm-svn: 62695
2009-01-21 18:09:24 +00:00
Dale Johannesen b5721632ee Make special cases (0 inf nan) work for frem.
Besides APFloat, this involved removing code
from two places that thought they knew the
result of frem(0., x) but were wrong.

llvm-svn: 62645
2009-01-21 00:35:19 +00:00
Chris Lattner 73d7fe5a34 improve compatibility with cygwin, patch by Jay Foad!
llvm-svn: 62535
2009-01-19 22:00:18 +00:00
Chris Lattner 6f34e317e9 Fix PR3353, infinitely jump threading an infinite loop make from switches.
llvm-svn: 62529
2009-01-19 21:20:34 +00:00
Chris Lattner 64b7bd7f9e Fix rdar://6505632, an llc crash on 483.xalancbmk
llvm-svn: 62470
2009-01-18 20:35:00 +00:00
Nick Lewycky 3ced0dfa69 Fix copy and pasted typos that prevented strtok_r, realloc, getenv, ungetc,
putc, puts, perror, vscanf and vsscanf from getting annotations.

Add annotations for eight printf functions, memalign, pread and pwrite.

On Linux, llvm-gcc sometimes renames strdup, getc, putc, strtok_r, scanf and
sscanf. Match the alternate function names.

Fix a crash annotating opendir.

Don't mark fsetpos's second parameter as nocapture. It's supposed to be
captured.

Do mark fopen's path and mode strings as nocapture. Mark ferror as readonly,
but not fileno which may set errno.

llvm-svn: 62456
2009-01-18 04:34:36 +00:00
Chris Lattner db2d9613d2 Fix PR3335 by not turning a store to one address space into a store to another.
llvm-svn: 62351
2009-01-16 20:12:52 +00:00
Chris Lattner 733256fe31 reduce indentation by using early exits, no functionality change.
llvm-svn: 62350
2009-01-16 20:08:59 +00:00
Evan Cheng beac6f8b0c Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type.
llvm-svn: 62297
2009-01-16 02:11:43 +00:00
Rafael Espindola 6de96a1b5d Add the private linkage.
llvm-svn: 62279
2009-01-15 20:18:42 +00:00
Evan Cheng ff716cb342 Eliminate a redundant check.
llvm-svn: 62264
2009-01-15 17:09:07 +00:00
Evan Cheng 60e19a46f2 - Teach CanEvaluateInDifferentType of this xform: sext (zext ty1), ty2 -> zext ty2
- Looking at the number of sign bits of the a sext instruction to determine  whether new trunc + sext pair should be added when its source is being evaluated in a different type.

llvm-svn: 62263
2009-01-15 17:01:23 +00:00
Chris Lattner 8fb9480ed2 Fix PR3325, a miscompilation of invokes by IPSCCP. Patch by Jay Foad!
llvm-svn: 62244
2009-01-14 21:01:16 +00:00
Dale Johannesen 1f0e0e7c9c Fix the time regression I introduced in 464.h264ref with
my earlier patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that.  We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.

More testcases are coming.

llvm-svn: 62212
2009-01-14 02:35:31 +00:00
Dan Gohman 59af77376c Make instcombine ensure that all allocas are explicitly aligned at at
least their preferred alignment.

llvm-svn: 62176
2009-01-13 20:18:38 +00:00
Duncan Sands dc020f9c3c Rename getABITypeSize to getTypePaddedSize, as
suggested by Chris.

llvm-svn: 62099
2009-01-12 20:38:59 +00:00
Chris Lattner bd3c7c8b52 Duncan is nervous about undefinedness of % with negatives. I'm
not thrilled about 64-bit % in general, so rewrite to use * instead.

llvm-svn: 62047
2009-01-11 20:41:36 +00:00
Chris Lattner b19151686f do not generated GEPs into vectors where they don't already exist.
We should treat vectors as atomic types, not like arrays.

llvm-svn: 62046
2009-01-11 20:23:52 +00:00
Chris Lattner 171d2d474f Make a couple of cleanups to the instcombine bitcast/gep
canonicalization transform based on duncan's comments:

1) improve the comment about %.
2) within our index loop make sure the offset stays 
   within the *type size*, instead of within the *abi size*.
   This allows us to reason explicitly about landing in tail
   padding and means that issues like non-zero offsets into
   [0 x foo] types don't occur anymore.

llvm-svn: 62045
2009-01-11 20:15:20 +00:00
Chris Lattner 5f54d50917 fix typo Duncan noticed.
llvm-svn: 61997
2009-01-09 18:31:39 +00:00
Chris Lattner ae0e857b98 Fix PR3304
llvm-svn: 61995
2009-01-09 18:18:43 +00:00
Misha Brukman 5cbf223916 Removed trailing whitespace from Makefiles.
llvm-svn: 61991
2009-01-09 16:44:42 +00:00
Chris Lattner f50aa6ae5c Implement rdar://6480391, extending of equality icmp's to avoid a truncation.
I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patters like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985
2009-01-09 07:47:06 +00:00
Chris Lattner 0f7cf1d7e1 Remove some old code that looks like a remanant from signed-types days.
llvm-svn: 61984
2009-01-09 07:10:58 +00:00
Chris Lattner 482eb70a10 Fix PR3298, a crash in Jump Threading. Apparently even
jump threading can have bugs, who knew? ;-)

llvm-svn: 61983
2009-01-09 06:08:12 +00:00
Chris Lattner fef138b140 Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible.
llvm-svn: 61980
2009-01-09 05:44:56 +00:00
Chris Lattner a784a2ce01 move some code, check to see if the input to the GEP is a bitcast
(which is constant time and cheap) before checking hasAllZeroIndices.

llvm-svn: 61976
2009-01-09 04:53:57 +00:00
Chris Lattner c518dfd11b This implements the second half of the fix for PR3290, handling
loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915
2009-01-08 05:42:05 +00:00
Chris Lattner f2b8c82ad1 Implement the first half of PR3290: if there is a store of an
integer to a (transitive) bitcast the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853
2009-01-07 08:11:13 +00:00
Chris Lattner 9a2de65fd6 Factor a bunch of code out into a helper method.
llvm-svn: 61852
2009-01-07 07:18:45 +00:00
Chris Lattner db561146aa use continue to simplify code and reduce nesting, no functionality
change.

llvm-svn: 61851
2009-01-07 06:39:58 +00:00
Chris Lattner 938b54f383 Get TargetData once up front and cache as an ivar instead of
requerying it all over the place.

llvm-svn: 61850
2009-01-07 06:34:28 +00:00
Chris Lattner a63dba9e6c Use the hasAllZeroIndices predicate to simplify some
code, no functionality change.

llvm-svn: 61849
2009-01-07 06:25:07 +00:00
Chris Lattner 2fdcc59bb6 Change m_ConstantInt and m_SelectCst to take their constant integers
as template arguments instead of as instance variables, exposing more
optimization opportunities to the compiler earlier.

llvm-svn: 61776
2009-01-05 23:53:12 +00:00
Evan Cheng 8804293fe9 Find loop back edges only after empty blocks are eliminated.
llvm-svn: 61752
2009-01-05 21:17:27 +00:00
Nick Lewycky e4e5532e05 Move the libcall annotating part from doFinalization to doInitialization.
Finalization occurs after all the FunctionPasses in the group have run, which
is clearly not what we want.

This also means that we have to make sure that we apply the right param 
attributes when creating a new function.

Also, add a missed optimization: strdup and strndup. NoCapture and 
NoAlias return!

llvm-svn: 61658
2009-01-05 00:07:50 +00:00
Nick Lewycky 959af7ba30 Run a post-pass that marks known function declarations by name.
llvm-svn: 61632
2009-01-04 20:27:34 +00:00
Bill Wendling 0c04f9fdc3 Revert this transform. It was causing some dramatic slowdowns in a few tests. See PR3266.
llvm-svn: 61623
2009-01-04 06:19:11 +00:00
Bill Wendling 0fcff2c203 Fix comment.
llvm-svn: 61538
2009-01-01 01:19:59 +00:00
Bill Wendling aedb54a947 Add transformation:
xor (or (icmp, icmp), true) -> and(icmp, icmp)

This is possible because of De Morgan's law.

llvm-svn: 61537
2009-01-01 01:18:23 +00:00
Dale Johannesen 656237beca Revert 61362 and 61402 until SPEC breakage is fixed.
llvm-svn: 61403
2008-12-23 23:21:35 +00:00
Dale Johannesen f8b161bcd1 This fixes the bug in 175.vpr. It doesn't fix the
other SPEC breakage.  I'll be reverting all recent
changes shortly, this checking is mostly so this
change doesn't get lost.

llvm-svn: 61402
2008-12-23 23:05:26 +00:00
Dale Johannesen 93b9aa8799 Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

I owe some testcases for this, want to get it in for nightly runs.

llvm-svn: 61362
2008-12-23 02:12:52 +00:00
Owen Anderson 164274eeb1 Don't forget to remove phi nodes from the value numbering table after we collapse them.
llvm-svn: 61358
2008-12-23 00:49:51 +00:00
Bill Wendling 456e885382 Comment clean-ups. No functionality change.
llvm-svn: 61354
2008-12-22 22:32:22 +00:00
Bill Wendling e7f08e7250 Check that the instruction isn't in the value numbering scope.
llvm-svn: 61353
2008-12-22 22:28:56 +00:00
Bill Wendling 86f01cb9f6 Simplification: Negate the operator== method instead of implementing a full operator!= method.
llvm-svn: 61352
2008-12-22 22:16:31 +00:00
Bill Wendling 3c793441cb Add verification that deleted instruction isn't hiding in the PHI map.
llvm-svn: 61350
2008-12-22 22:14:07 +00:00
Bill Wendling ebb6a543fa Verify removed in a few more places.
llvm-svn: 61349
2008-12-22 21:57:30 +00:00
Bill Wendling 6b18a3994b Add verification functions to GVN which check to see that an instruction was
truely deleted. These will be expanded with further checks of all of the data
structures.

llvm-svn: 61347
2008-12-22 21:36:08 +00:00
Nick Lewycky 10eb8e533f Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
llvm-svn: 61297
2008-12-21 00:19:21 +00:00
Nick Lewycky 4bc10c9e77 Remove redundant test for vector-nature. Scan the vector first to see whether
our optz'n will apply to it, then build the replacement vector only if needed.

llvm-svn: 61279
2008-12-20 16:48:00 +00:00
Evan Cheng 3b3de7c228 - CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges.
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.

llvm-svn: 61248
2008-12-19 18:03:11 +00:00
Bill Wendling 070de29fcf Didn't mean to commit this.
llvm-svn: 61222
2008-12-18 22:19:50 +00:00
Bill Wendling 4c13e77d49 Re-XFAIL this test until debug stuff settles down.
llvm-svn: 61219
2008-12-18 22:13:31 +00:00
Nick Lewycky c3a70ade66 Oops! Left out a line.
Simplifying the sdiv might allow further simplifications for our users.

llvm-svn: 61196
2008-12-18 06:42:28 +00:00
Nick Lewycky 0f0e63fe73 Make all the vector elements positive in an srem of constant vector.
llvm-svn: 61195
2008-12-18 06:31:11 +00:00
Dale Johannesen 3e5843b992 Revert previous patch, appears to break bootstrap.
llvm-svn: 61181
2008-12-18 01:23:41 +00:00
Dale Johannesen 12d031b716 Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  (This patch does not handle 
all the cases where this can happen.)  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Everything above is exercised in
CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
the same IR).

llvm-svn: 61178
2008-12-18 00:57:22 +00:00
Chris Lattner b6372933b5 reapply this hunk from Bill's reversion in r61169, it is conservative
and safe and orthogonal from turning off load pre.

llvm-svn: 61177
2008-12-18 00:51:32 +00:00
Bill Wendling be4fb8a25f Temporarily revert r61027. It was causing a bootstrap failure in "release" mode
with everyone's favorite error messages:

Comparing stages 2 and 3
warning: ./cc1-checksum.o differs
warning: ./cc1plus-checksum.o differs
Bootstrap comparison failure!
./c-decl.o differs
./cp/decl.o differs
./df-core.o differs
./gcc.o differs
./i386.o differs
./stor-layout.o differs
./tree-pretty-print.o differs
./tree.o differs
make[2]: *** [compare] Error 1
make[1]: *** [stage3-bubble] Error 2

See PR3227.

llvm-svn: 61169
2008-12-17 23:31:20 +00:00
Dale Johannesen 904ce8120d Clarify that the scale factor from CheckForIVReuse
can be negative.  Keep track of whether all uses of
an IV are outside the loop.  Some cosmetics; no
functional change.

llvm-svn: 61109
2008-12-16 22:16:28 +00:00
Chris Lattner 0c68ae0603 Enable Load PRE. This teaches GVN to push partially redundant loads up the
CFG when there is exactly one predecessor where the load is not available.
This is designed to not increase code size but still eliminate partially
redundant loads.  This fires 1765 times on 403.gcc even though it doesn't
do critical edge splitting yet (the most common reason for it to fail).

llvm-svn: 61027
2008-12-15 05:28:29 +00:00
Owen Anderson 03aacbae90 Ifdef out some code that I didn't mean to enable by default yet.
llvm-svn: 61024
2008-12-15 03:52:17 +00:00
Chris Lattner 69131fd872 make GVN try to rename inputs to the resultant replaced values, which
cleans up the generated code a bit.  This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)

llvm-svn: 61023
2008-12-15 03:46:38 +00:00
Owen Anderson bfe133e4ac Add support for slow-path GVN with full phi construction for scalars. This is disabled for now, as it actually pessimizes code in the abscence
of phi translation for load elimination.  This slow down GVN a bit, by about 2% on 403.gcc.

llvm-svn: 61021
2008-12-15 02:03:00 +00:00
Chris Lattner f5eef9f6db eliminate warning when asserts disabled.
llvm-svn: 61012
2008-12-14 21:36:23 +00:00
Owen Anderson e34c2399de Generalize GVN's phi construciton routine to work for things other than loads.
llvm-svn: 61009
2008-12-14 19:10:35 +00:00
Bill Wendling 293b9181e5 Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM:
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
  "llvm::APFloat::IEEEsingle", referenced from:
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
  "llvm::APFloat::IEEEdouble", referenced from:
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found

This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.

llvm-svn: 60977
2008-12-13 09:28:44 +00:00
Chris Lattner 1e29f7c97d make RLE preserve the name of the load that it replaces. This is just
a pretification of the IR.

llvm-svn: 60973
2008-12-13 07:22:47 +00:00
Chris Lattner fa9f99aa12 Teach GVN to invalidate some memdep information when it does an RAUW
of a pointer.  This allows is to catch more equivalencies.  For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them.  This change allows
memdep/gvn to catch all 18 when run just once on the function (as is 
typical :) instead of just 3.

On all of 403.gcc, this bumps up the # reundandancies found from:

     63 gvn    - Number of instructions PRE'd
 153991 gvn    - Number of instructions deleted
  50069 gvn    - Number of loads deleted
to:
     63 gvn    - Number of instructions PRE'd
 154137 gvn    - Number of instructions deleted
  50185 gvn    - Number of loads deleted

+120 loads deleted isn't bad.

llvm-svn: 60799
2008-12-09 22:06:23 +00:00
Chris Lattner 254314e6bc rename getNonLocalDependency -> getNonLocalCallDependency, and remove
pointer stuff from it, simplifying the code a bit.

llvm-svn: 60783
2008-12-09 19:38:05 +00:00
Chris Lattner b6fc4b8d92 Switch GVN::processNonLocalLoad to using the new
MemDep::getNonLocalPointerDependency method.  There are
some open issues with this (missed optimizations) and
plenty of future work, but this does allow GVN to eliminate
*slightly* more loads (49246 vs 49033).

Switching over now allows simplification of the other code
path in memdep.

llvm-svn: 60780
2008-12-09 19:25:07 +00:00
Chris Lattner 0a5a8d54a9 random cleanups, no functionality change.
llvm-svn: 60779
2008-12-09 19:21:47 +00:00
Chris Lattner 56b20ffc5f Fix a really subtle off-by-one bug that Duncan noticed with valgrind
on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad.

llvm-svn: 60739
2008-12-09 04:47:21 +00:00
Chris Lattner e598370ae9 remove DebugIterations option. Despite the accusations,
jump threading has been shown to only expose problems not
have bugs itself.  I'm sure it's completely bug free! ;-)

llvm-svn: 60725
2008-12-08 22:44:07 +00:00
Devang Patel 2bb8a2f80f Fix spelling.
Thanks Duncan!

llvm-svn: 60702
2008-12-08 17:07:24 +00:00
Devang Patel 1c469d36b0 Undo previous patch.
llvm-svn: 60701
2008-12-08 17:02:37 +00:00
Chris Lattner 5df5b4cc2e don't bother touching volatile stores, they will just return clobber on
everything interesting anyway.

llvm-svn: 60640
2008-12-07 00:25:15 +00:00
Chris Lattner 57e91eaf61 Reimplement the inner loop of DSE. It now uniformly uses getDependence(),
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase).  This eliminates the last external client 
of MemDep::getDependenceFrom().

llvm-svn: 60619
2008-12-06 00:53:22 +00:00
Dale Johannesen 9efd2ce55b Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Chris Lattner 0e3d6337c6 Make a few major changes to memdep and its clients:
1. Merge the 'None' result into 'Normal', making loads
   and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
   distinguish between the cases when memdep knows the value is
   produced from when we just know if may be changed.
3. Move some of the logic for determining whether readonly calls
   are CSEs into memdep instead of it being in GVN.  This still
   leaves verification that the arguments are hte same to GVN to
   let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use 
   getModRefInfo(CallSite,CallSite) instead of doing something 
   very weak.  This only really matters for things like DSA, but
   someday maybe we'll have some other decent context sensitive
   analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
   a) readonly call CSE is slightly simpler
   b) I eliminated the "getDependencyFrom" chaining for load 
      elimination and load CSE doesn't have to worry about 
      volatile (they are always clobbers) anymore.
   c) GVN no longer does any 'lastLoad' caching, leaving it to 
      memdep.
7. The logic in DSE is simplified a bit and sped up.  A potentially
   unsafe case was eliminated.

llvm-svn: 60607
2008-12-05 21:04:20 +00:00
Anton Korobeynikov 24600bf05a Revert invalid r60393. It causes llvm-gcc bootstrap fails in release builds.
See PR3160 for details

llvm-svn: 60604
2008-12-05 19:38:49 +00:00
Chris Lattner c100828026 Fix test/Transforms/GVN/pre-load.ll
llvm-svn: 60594
2008-12-05 17:04:12 +00:00
Chris Lattner d2a653af0c Make IsValueFullyAvailableInBlock safe.
llvm-svn: 60588
2008-12-05 07:49:08 +00:00
Devang Patel c56423b500 Rewrite code that 1) filters loops and 2) calculates new loop bounds.
This fixes many bugs. I will add more test cases in a separate check-in.

Some day, the code that manipulates CFG and updates dom. info could use refactoring help.

llvm-svn: 60554
2008-12-04 21:38:42 +00:00
Chris Lattner 8f723670ce Start simplifying a switch that has a successor that is a switch.
llvm-svn: 60534
2008-12-04 06:31:07 +00:00
Chris Lattner 75c2661d24 add a debugging option to help track down j-t problems.
llvm-svn: 60514
2008-12-04 00:07:59 +00:00
Dale Johannesen 4e9e6ea604 Remove an unused field.
llvm-svn: 60508
2008-12-03 22:43:56 +00:00
Dale Johannesen f7a588b909 Fix a misspelled function name.
llvm-svn: 60506
2008-12-03 20:56:12 +00:00
Chris Lattner dc3f6f2c12 Factor some code into a new FoldSingleEntryPHINodes method.
llvm-svn: 60501
2008-12-03 19:44:02 +00:00
Dale Johannesen d49ceff6ba Fix a really wrong comment.
llvm-svn: 60494
2008-12-03 19:25:46 +00:00
Chris Lattner 595c7279bd Teach jump threading some more simple tricks:
1) have it fold "br undef", which does occur with
   surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks.  This removes the successor
   edges, reducing the in-edges of other blocks, allowing 
   recursive simplification.
3) Fold things like:
     br COND, BBX, BBY
  BBX:
     br COND, BBZ, BBW

   which also happens because jump threading iterates.

llvm-svn: 60470
2008-12-03 07:48:08 +00:00
Dale Johannesen 4d2ecb8f68 Minor rewrite per review feedback.
llvm-svn: 60442
2008-12-02 21:17:11 +00:00
Dale Johannesen 70060013d2 Make the code do what the comment says it does.
llvm-svn: 60431
2008-12-02 18:40:09 +00:00
Chris Lattner 1db9bbe802 Implement PRE of loads in the GVN pass with a pretty cheap and
straight-forward implementation.  This does not require any extra
alias analysis queries beyond what we already do for non-local loads.

Some programs really really like load PRE.  For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.

The biggest limitation to the implementation is that it does not split
critical edges.  This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.

The implementation of this should incidentally speed up rejection of 
non-local loads because it avoids creating the repl densemap in cases 
when it won't be used for fully redundant loads.

This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact.  This is pretty close to ready though.

llvm-svn: 60408
2008-12-02 08:16:11 +00:00
Bill Wendling 87beb9b909 Remove some errors that crept in. No functionality change.
llvm-svn: 60403
2008-12-02 06:24:20 +00:00
Bill Wendling 790b4bf9a9 Merge two if-statements into one.
llvm-svn: 60402
2008-12-02 06:22:04 +00:00
Bill Wendling 5635295266 More styalistic changes. No functionality change.
llvm-svn: 60401
2008-12-02 06:18:11 +00:00
Bill Wendling 85de4b35ca - Remove the buggy -X/C -> X/-C transform. This isn't valid when X isn't a
constant. If X is a constant, then this is folded elsewhere.

- Added a note to Target/README.txt to indicate that we'd like to implement
  this when we're able.

llvm-svn: 60399
2008-12-02 05:12:47 +00:00
Bill Wendling 5369db5917 Improve comment.
llvm-svn: 60398
2008-12-02 05:09:00 +00:00
Bill Wendling 21716dff5e - Reduce nesting.
- No need to do a swap on a canonicalized pattern.

No functionality change.

llvm-svn: 60397
2008-12-02 05:06:43 +00:00
Chris Lattner ead1a61b47 some random comment improvements.
llvm-svn: 60395
2008-12-02 04:52:26 +00:00
Owen Anderson d930420ccf Fix an issue that Chris noticed, where local PRE was not properly instantiating
a new value numbering set after splitting a critical edge.  This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.

llvm-svn: 60393
2008-12-02 04:09:22 +00:00
Dale Johannesen 069a4eee55 Consider only references to an IV within the loop when
figuring out the base of the IV.  This produces better
code in the example.  (Addresses use (IV) instead of 
(BASE,IV) - a significant improvement on low-register
machines like x86).

llvm-svn: 60374
2008-12-01 22:00:01 +00:00
Bill Wendling 6f71bce4cf Don't rebuild RHSNeg. Just use the one that's already there.
llvm-svn: 60370
2008-12-01 21:06:30 +00:00
Bill Wendling 84f6f2539f Document what this check is doing. Also, no need to cast to ConstantInt.
llvm-svn: 60369
2008-12-01 21:03:43 +00:00
Bill Wendling e6c87a4952 Use a simple comparison. Overflow on integer negation can only occur when the
integer is "minint".

llvm-svn: 60366
2008-12-01 19:46:27 +00:00
Bill Wendling 47f733e4ea Generalize the FoldOrWithConstant method to fold for any two constants which
don't have overlapping bits.

llvm-svn: 60344
2008-12-01 08:32:40 +00:00
Bill Wendling 22e761b302 Reduce copy-and-paste code by splitting out the code into its own function.
llvm-svn: 60343
2008-12-01 08:23:25 +00:00
Bill Wendling 582fe6b0ca Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
Bill Wendling 4eecfb655b Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to.
llvm-svn: 60340
2008-12-01 07:47:02 +00:00
Chris Lattner 6f5bf6a718 Rename some variables, only increment BI once at the start of the loop instead of throughout it.
llvm-svn: 60339
2008-12-01 07:35:54 +00:00
Chris Lattner f00aae4968 pull the predMap densemap out of the inner loop of performPRE, so
that it isn't reallocated all the time.  This is a tiny speedup for
GVN: 3.90->3.88s

llvm-svn: 60338
2008-12-01 07:29:03 +00:00
Chris Lattner 2b07d3ccde switch a couple more calls to use array_pod_sort.
llvm-svn: 60337
2008-12-01 06:52:57 +00:00
Chris Lattner 2c2dd15a85 Introduce a new array_pod_sort function and switch LSR to use it
instead of std::sort.  This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.

We should start using array_pod_sort where possible.

llvm-svn: 60335
2008-12-01 06:49:59 +00:00
Chris Lattner 2aebea5735 Eliminate use of setvector for the DeadInsts set, just use a smallvector.
This is a lot cheaper and conceptually simpler.

llvm-svn: 60332
2008-12-01 06:27:41 +00:00
Chris Lattner 4da78e3774 DeleteTriviallyDeadInstructions is always passed the
DeadInsts ivar, just use it directly.

llvm-svn: 60330
2008-12-01 06:14:28 +00:00
Chris Lattner a68a5a4784 simplify DeleteTriviallyDeadInstructions again, unlike my previous
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then 
notifying.

llvm-svn: 60329
2008-12-01 06:11:32 +00:00
Chris Lattner 9e6b243428 simplify these patterns using m_Specific. No need to grep for
xor in testcase (or is a substring).

llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner 88a1f0213d Teach jump threading to clean up after itself, DCE and constfolding the
new instructions it simplifies.  Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block.  Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.

llvm-svn: 60327
2008-12-01 04:48:07 +00:00
Chris Lattner 084b3a47d3 Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs
instead of using FoldPHIArgBinOpIntoPHI.  In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices.  This prevented instcombine
from factoring big chunks of code in 403.gcc.  For example:

 insn_cuid.exit:                
-       %tmp336 = load i32** @uid_cuid, align 4      
-       %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3    
-       %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*               
-       %tmp339 = load i32* %tmp338, align 4           
-       %tmp340 = getelementptr i32* %tmp336, i32 %tmp339     
        br label %bb62
 
 bb61:       
-       %tmp341 = load i32** @uid_cuid, align 4     
-       %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3        
-       %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*           
-       %tmp344 = load i32* %tmp343, align 4        
-       %tmp345 = getelementptr i32* %tmp341, i32 %tmp344          
        br label %bb62
 
 bb62:      
-       %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]         
+       %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]         
+       %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3     
+       %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*  
+       %tmp341.pn = load i32** @uid_cuid     
+       %tmp344.pn = load i32* %tmp344.pn.in 
+       %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn   
        %iftmp.62.0 = load i32* %iftmp.62.0.in     

llvm-svn: 60325
2008-12-01 03:42:51 +00:00
Chris Lattner 9d02a70a7d Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Chris Lattner 9ce8995d24 Make GVN be more intelligent about redundant load
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal.  This makes load elimination
more aggressive.  For example, on 403.gcc, we had:

<     68 gvn    - Number of instructions PRE'd
< 152718 gvn    - Number of instructions deleted
<  49699 gvn    - Number of loads deleted
<   6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses

now we have:

>     64 gvn    - Number of instructions PRE'd
> 153623 gvn    - Number of instructions deleted
>  49856 gvn    - Number of loads deleted
>   5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses

That's an extra 157 loads deleted and extra 905 other instructions nuked.

This slows down GVN very slightly, from 3.91 to 3.96s.

llvm-svn: 60314
2008-12-01 01:31:36 +00:00
Chris Lattner 7e61dafc95 Reimplement the non-local dependency data structure in terms of a sorted
vector instead of a densemap.  This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster.  This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc

This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case.  Here's the stats
for 403.gcc:

  6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses

yay for caching :)

llvm-svn: 60313
2008-12-01 01:15:42 +00:00
Bill Wendling 5b902c5b1e Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.

llvm-svn: 60312
2008-12-01 01:07:11 +00:00
Chris Lattner 8541edec44 Cache analyses in ivars and add some useful DEBUG output.
This speeds up GVN from 4.0386s to 3.9376s.

llvm-svn: 60310
2008-12-01 00:40:32 +00:00
Chris Lattner 80c7d81e81 improve indentation, do cheap checks before expensive ones,
remove some fixme's.  This speeds up GVN very slightly on 403.gcc 
(4.06->4.03s)

llvm-svn: 60309
2008-11-30 23:39:23 +00:00
Eli Friedman 11c15a5de7 Minor cleanup: use getTrue and getFalse where appropriate. No
functional change.

llvm-svn: 60307
2008-11-30 22:48:49 +00:00
Eli Friedman 55e4becba9 Some minor cleanups to instcombine; no functionality change.
Note that the FoldOpIntoPhi call is dead because it's impossible for the 
first operand of a subtraction to be both a ConstantInt and a PHINode.

llvm-svn: 60306
2008-11-30 21:09:11 +00:00
Bill Wendling de89bc275c Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling 9eef421e12 Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.

llvm-svn: 60290
2008-11-30 13:08:13 +00:00
Bill Wendling 2fe3229824 Forgot one remaining call to getSExtValue().
llvm-svn: 60289
2008-11-30 12:41:09 +00:00
Bill Wendling 2d2e7861b5 getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
2008-11-30 12:38:24 +00:00
Eli Friedman 09bc610945 Optimize memmove and memset into the LLVM builtins. Note that these
only show up in code from front-ends besides llvm-gcc, like clang.

llvm-svn: 60287
2008-11-30 08:32:11 +00:00
Bill Wendling 7abf352f44 Don't make TwoToExp signed by default.
llvm-svn: 60279
2008-11-30 05:29:33 +00:00
Bill Wendling af200e9237 From Hacker's Delight:
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."

In this case, x == -1.

llvm-svn: 60278
2008-11-30 05:01:05 +00:00
Bill Wendling 70635adea3 Instcombine was illegally transforming -X/C into X/-C when either X or C
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
2008-11-30 03:42:12 +00:00
Chris Lattner 3ff6d01586 Fix a fixme by making memdep's handling of allocations more logical.
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' depedency instead of "Normal".
This tweaks GVN to do its optimization with the new result.

llvm-svn: 60267
2008-11-30 01:39:32 +00:00
Chris Lattner 63bd586d35 Eliminate the dropInstruction method, which is not needed any more.
Fix a subtle iterator invalidation bug I introduced in the last commit.

llvm-svn: 60258
2008-11-29 23:30:39 +00:00
Chris Lattner 1c6b62eb4d Change MemDep::getNonLocalDependency to return its results as
a smallvector instead of a DenseMap.  This speeds up GVN by 5%
on 403.gcc.

llvm-svn: 60255
2008-11-29 21:33:22 +00:00
Chris Lattner f280b0c729 reimplement getNonLocalDependency with a simpler worklist
formulation that is faster and doesn't require nonLazyHelper.
Much less code.

llvm-svn: 60253
2008-11-29 21:22:42 +00:00
Chris Lattner 8c5ff516c6 Fix a thinko that manifested as a crash on clamav last night.
llvm-svn: 60251
2008-11-29 20:29:04 +00:00
Chris Lattner 51ba8d0630 Split getDependency into getDependency and getDependencyFrom, the
former does caching, the later doesn't.  This dramatically simplifies
the logic in getDependency and getDependencyFrom.

llvm-svn: 60234
2008-11-29 03:47:00 +00:00
Bill Wendling 469e3aa696 Temporarily revert r60195. It's causing an optimized bootstrap of llvm-gcc to fail.
llvm-svn: 60233
2008-11-29 03:43:04 +00:00
Chris Lattner 7f9c8a0f05 Introduce and use a new MemDepResult class to hold the results of a memdep
query.  This makes it crystal clear what cases can escape from MemDep that
the clients have to handle.  This also gives the clients a nice simplified
interface to it that is easy to poke at.

This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.

llvm-svn: 60231
2008-11-29 02:29:27 +00:00
Chris Lattner de04e1173a Reimplement the internal abstraction used by MemDep in terms
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the 
appropriate pieces and will be useful for internal 
implementation improvements later.

I'm not particularly happy with this.  After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all.  I'll fix this in a subsequent commit.

This has no functionality change.

llvm-svn: 60230
2008-11-29 01:43:36 +00:00
Chris Lattner f3f6a801cc don't revisit instructions off the beginning of the block.
llvm-svn: 60221
2008-11-28 22:50:08 +00:00
Chris Lattner f2a8ba4cf0 simplify some code, remove escaped newline.
llvm-svn: 60213
2008-11-28 21:29:52 +00:00
Chris Lattner 8a172daa55 don't call MergeBasicBlockIntoOnlyPred on a block whose only
predecessor is itself.  This doesn't make sense, and this is
a dead infinite loop anyway.

llvm-svn: 60210
2008-11-28 19:54:49 +00:00
Chris Lattner 1adb6759ef rewrite a big chunk of how DSE does recursive dead operand
elimination to use more modern infrastructure.  Also do a bunch
of small cleanups.

llvm-svn: 60201
2008-11-28 00:27:14 +00:00
Chris Lattner c077a2a535 Simplify LoopStrengthReduce::DeleteTriviallyDeadInstructions by
making it use RecursivelyDeleteTriviallyDeadInstructions to do
the heavy lifting.

llvm-svn: 60195
2008-11-27 23:23:35 +00:00
Chris Lattner 96e2dbe008 use continue to reduce indentation
llvm-svn: 60192
2008-11-27 23:00:20 +00:00
Chris Lattner c6c481cdfc remove doConstantPropagation and dceInstruction, they are just
wrappers around the interesting code and use an obscure iterator
abstraction that dates back many many years.

Move EraseDeadInstructions to Transforms/Utils and name it
RecursivelyDeleteTriviallyDeadInstructions.

llvm-svn: 60191
2008-11-27 22:57:53 +00:00
Chris Lattner 5ef9ebf787 simplify code.
llvm-svn: 60190
2008-11-27 22:56:14 +00:00
Chris Lattner c92fa42ddd simplify this logic.
llvm-svn: 60189
2008-11-27 22:46:09 +00:00
Nick Lewycky 4ab50b93c8 Chris prefers icmp/select over udiv!
llvm-svn: 60187
2008-11-27 22:41:10 +00:00
Nick Lewycky 69941fd0a0 Add a couple of missed optimizations on integer vectors. Multiply and divide
by 1, as well as multiply by -1.

llvm-svn: 60182
2008-11-27 20:21:08 +00:00
Chris Lattner 4059f43b74 defensive patch: if CGP is merging a block with the entry block, make sure
it ends up being the entry block.

llvm-svn: 60180
2008-11-27 19:29:14 +00:00
Chris Lattner 5dfbfcd80d Fix PR3138: if we merge the entry block into another block, make sure to
move the other block back up into the entry position!

llvm-svn: 60179
2008-11-27 19:25:19 +00:00
Chris Lattner e0d019def6 switch InstCombine::visitLoadInst to use
FindAvailableLoadedValue

llvm-svn: 60169
2008-11-27 08:56:30 +00:00
Chris Lattner 72f16e70f0 move FindAvailableLoadedValue from JumpThreading to Transforms/Utils.
llvm-svn: 60166
2008-11-27 08:10:05 +00:00
Chris Lattner 206250284d Use the new MergeBasicBlockIntoOnlyPred function.
llvm-svn: 60163
2008-11-27 07:54:12 +00:00
Chris Lattner 99d6809ac1 move MergeBasicBlockIntoOnlyPred to Transforms/Utils.
llvm-svn: 60162
2008-11-27 07:43:12 +00:00
Chris Lattner 240051aace rename ThreadBlock to ProcessBlock, since it does other things than
just simple threading.

llvm-svn: 60157
2008-11-27 07:20:04 +00:00
Chris Lattner 98d89d1b1b Make jump threading substantially more powerful, in the following ways:
1. Make it fold blocks separated by an unconditional branch.  This enables
   jump threading to see a broader scope.
2. Make jump threading able to eliminate locally redundant loads when they
   feed the branch condition of a block.  This frequently occurs due to
   reg2mem running.
3. Make jump threading able to eliminate *partially redundant* loads when
   they feed the branch condition of a block.  This is common in code with
   lots of loads and stores like C++ code and 255.vortex.

This implements thread-loads.ll and rdar://6402033.

Per the fixme's, several pieces of this should be moved into Transforms/Utils.

llvm-svn: 60148
2008-11-27 05:07:53 +00:00
Chris Lattner 397a11ccd8 Turn on my codegen prepare heuristic by default. It doesn't affect
performance in most cases on the Grawp tester, but does speed some 
things up (like shootout/hash by 15%).  This also doesn't impact 
compile time in a noticable way on the Grawp tester.

It also, of course, gets the testcase it was designed for right :)

llvm-svn: 60120
2008-11-26 22:16:44 +00:00
Chris Lattner fef04acc50 teach the new heuristic how to handle inline asm.
llvm-svn: 60088
2008-11-26 04:59:11 +00:00
Chris Lattner 6d71b7fb95 Improve ValueAlreadyLiveAtInst with a cheap and dirty, but effective
heuristic: the value is already live at the new memory operation if
it is used by some other instruction in the memop's block.  This is
cheap and simple to compute (moreso than full liveness).

This improves the new heuristic even more.  For example, it cuts two
out of three new instructions out of 255.vortex:DbmFileInGrpHdr, 
which is one of the functions that the heuristic regressed.  This
overall eliminates another 40 instructions from 403.gcc and visibly
reduces register pressure in 255.vortex (though this only actually
ends up saving the 2 instructions from the whole program).

llvm-svn: 60084
2008-11-26 03:20:37 +00:00
Chris Lattner e34fe2c52d Start rewroking a subpiece of the profitability heuristic to be
phrased in terms of liveness instead of as a horrible hack.  :)

In pratice, this doesn't change the generated code for either 
255.vortex or 403.gcc, but it could cause minor code changes in 
theory.  This is framework for coming changes.

llvm-svn: 60082
2008-11-26 03:02:41 +00:00
Chris Lattner 383a797f42 add a comment, make save/restore logic more obvious.
llvm-svn: 60076
2008-11-26 02:11:11 +00:00
Chris Lattner eb3e4fb6fb This adds in some code (currently disabled unless you pass
-enable-smarter-addr-folding to llc) that gives CGP a better
cost model for when to sink computations into addressing modes.
The basic observation is that sinking increases register 
pressure when part of the addr computation has to be available
for other reasons, such as having a use that is a non-memory
operation.  In cases where it works, it can substantially reduce
register pressure.

This code is currently an overall win on 403.gcc and 255.vortex
(the two things I've been looking at), but there are several 
things I want to do before enabling it by default:

1. This isn't doing any caching of results, so it is much slower 
   than it could be.  It currently slows down release-asserts llc 
   by 1.7% on 176.gcc: 27.12s -> 27.60s.
2. This doesn't think about inline asm memory operands yet.
3. The cost model botches the case when the needed value is live
   across the computation for other reasons.

I'll continue poking at this, and eventually turn it on as llcbeta.

llvm-svn: 60074
2008-11-26 02:00:14 +00:00
Evan Cheng 496b042e20 Revert r60042. IndVarSimplify should check if APFloat is PPCDoubleDouble first before trying to convert it to an integer.
llvm-svn: 60072
2008-11-26 01:11:57 +00:00
Chris Lattner a9ab165b08 Teach CodeGenPrepare to look through Bitcast instructions when attempting to
optimize addressing modes.  This allows us to optimize things like isel-sink2.ll
into:

	movl	4(%esp), %eax
	cmpb	$0, 4(%eax)
	jne	LBB1_2	## F
LBB1_1:	## TB
	movl	$4, %eax
	ret
LBB1_2:	## F
	movzbl	7(%eax), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpb	$0, 4(%eax)
	leal	4(%eax), %eax
	jne	LBB1_2	## F
LBB1_1:	## TB
	movl	$4, %eax
	ret
LBB1_2:	## F
	movzbl	3(%eax), %eax
	ret

This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s.

Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt
it is really testing what it thinks it is.

llvm-svn: 60068
2008-11-26 00:26:16 +00:00
Chris Lattner f3e95505c5 Teach MatchScaledValue to handle Scales by 1 with MatchAddr (which
can recursively match things) and scales by 0 by ignoring them.
This triggers once in 403.gcc, saving 1 (!!!!) instruction in the 
whole huge app.

llvm-svn: 60013
2008-11-25 07:25:26 +00:00
Chris Lattner 728f90220a significantly refactor all the addressing mode matching logic
into a new AddressingModeMatcher class.  This makes it easier
to reason about and reduces passing around of stuff, but has
no functionality change.

llvm-svn: 60012
2008-11-25 07:09:13 +00:00
Chris Lattner 58f49d2916 refactor all the constantexpr/instruction handling code out into a
new FindMaximalLegalAddressingModeForOperation helper method.

llvm-svn: 60011
2008-11-25 05:15:49 +00:00
Chris Lattner a3fbff15b9 another minor tweak
llvm-svn: 60010
2008-11-25 04:47:41 +00:00
Chris Lattner d616ef5683 minor cleanups no functionality change.
llvm-svn: 60009
2008-11-25 04:42:10 +00:00
Chris Lattner 6416a6b7a0 rearrange and tidy some code, no functionality change.
llvm-svn: 59990
2008-11-24 22:44:16 +00:00
Chris Lattner d917c8c8fe minor cleanups to debug code, no functionality change.
llvm-svn: 59989
2008-11-24 22:40:05 +00:00
Chris Lattner d78894197a reenable the right part of the code.
llvm-svn: 59985
2008-11-24 21:26:21 +00:00
Chris Lattner 992a541002 revert an accidental commit, this fixes the regression on test/CodeGen/X86/isel-sink.ll
llvm-svn: 59976
2008-11-24 19:40:34 +00:00
Chris Lattner 53d6a07869 Fix 3113: If we have a dead cyclic PHI, replace the whole thing
with an undef.

llvm-svn: 59972
2008-11-24 19:25:36 +00:00
Devang Patel 702f45df58 Fix build failure.
llvm-svn: 59844
2008-11-21 21:00:20 +00:00
Devang Patel cb181bb203 Silence unused variable warnings.
llvm-svn: 59841
2008-11-21 20:00:59 +00:00
Chris Lattner dd7083452f reapply Sanjiv's patch to genericize memcpy/memset/memmove to take an
arbitrary integer width for the count.

llvm-svn: 59823
2008-11-21 16:42:48 +00:00
Bill Wendling 4bce2bff88 Revert r59802. It was breaking the build of llvm-gcc:
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include  -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include  -D_DEBUG  -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS   -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include  -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2

llvm-svn: 59809
2008-11-21 09:09:41 +00:00
Sanjiv Gupta 09a203765a Make mem[cpy,move,set] intrinsics overloaded.
llvm-svn: 59802
2008-11-21 07:49:09 +00:00
Nick Lewycky 07d726ec4d Optimize (x/y)*y into x-(x%y) in general. Div and rem are about the same, and
a subtract is cheaper than a multiply. This generalizes an existing transform.

llvm-svn: 59800
2008-11-21 07:33:58 +00:00
Devang Patel 45f1ae028e Fix unused variable warnings.
llvm-svn: 59778
2008-11-21 01:52:59 +00:00
Devang Patel 827bced2b1 Let instcombiner remove redundant dbg intrinsics.
llvm-svn: 59658
2008-11-19 18:59:41 +00:00
Devang Patel 7ed6c5317c If there are two consecutive llvm.dbg.stoppoint calls then
it is likely that the optimizer deleted code in between these
two intrinsics. Keep only the last llvm.dbg.stoppoint in this case.

llvm-svn: 59657
2008-11-19 18:56:50 +00:00
Bill Wendling cf194e9a27 Cast to remove warning about comparing signed and unsigned.
llvm-svn: 59518
2008-11-18 10:57:27 +00:00
Devang Patel f1e9329209 Give SIToFPInst preference over UIToFPInst because it is faster on platforms that are widely used.
llvm-svn: 59476
2008-11-18 00:40:02 +00:00
Devang Patel 180afd2c55 While handling floating point IVs lift restrictions on initial value and increment value.
llvm-svn: 59471
2008-11-17 23:27:13 +00:00
Devang Patel aa3d68d301 Handle floating point ivs during doInitialization().
llvm-svn: 59466
2008-11-17 21:32:02 +00:00
Chris Lattner 7917b43a28 eliminate some std::set's.
llvm-svn: 59409
2008-11-16 07:17:51 +00:00
Chris Lattner 44152742a0 simplify a bunch more instcombines to use m_Specific etc.
llvm-svn: 59403
2008-11-16 05:38:51 +00:00
Chris Lattner d397fef50d factor the code for simplifying (icmp)|(icmp) into its own function.
llvm-svn: 59402
2008-11-16 05:20:07 +00:00
Chris Lattner 909b969b18 do some computation with apints instead of ConstantInts.
llvm-svn: 59401
2008-11-16 05:14:43 +00:00
Chris Lattner feaea9bdf7 merge a check into a place where it is simpler.
llvm-svn: 59400
2008-11-16 05:10:52 +00:00
Chris Lattner 269cbd5770 factor a whole bunch of code out into a helper function.
llvm-svn: 59398
2008-11-16 05:06:21 +00:00
Chris Lattner b37b6e7e96 simplify the conditions on two gigantic if's, decreasing indentation
a bit.  Next step is to factor out into their own helper functions.

llvm-svn: 59397
2008-11-16 04:55:20 +00:00
Chris Lattner f1be285134 simplify some instcombine matches by using m_Specific
llvm-svn: 59395
2008-11-16 04:46:19 +00:00
Chris Lattner fae5e33111 Use new m_SelectCst template to eliminate macros.
llvm-svn: 59392
2008-11-16 04:33:38 +00:00
Chris Lattner 569d78cbb5 simplify code.
llvm-svn: 59390
2008-11-16 04:26:55 +00:00
Chris Lattner c3f3b059d0 Handle the case where there is no "not". It is possible it got
folded into the select.

llvm-svn: 59389
2008-11-16 04:25:26 +00:00
Chris Lattner 5f6d9a313b factor a bunch of copy/paste code out into a helper function.
Eliminate the cases checking for cond?0:-1, since that is already
handled by commutative checking.

llvm-svn: 59388
2008-11-16 04:24:12 +00:00
Chris Lattner 68d2da2a19 rearrange some code, no functionality change.
llvm-svn: 59381
2008-11-16 03:56:24 +00:00
Chris Lattner e02c7c7ad2 if we're going to use a macro, use it maximally. no functionality change.
llvm-svn: 59380
2008-11-16 03:54:57 +00:00
Devang Patel 53b39b5467 Cleanup debug info. assocated with deleted instructions.
llvm-svn: 59012
2008-11-11 00:54:10 +00:00
Devang Patel d0ce981372 If the sign of exit condition and split condition does not match
then do not split loop index.

llvm-svn: 58995
2008-11-10 19:48:34 +00:00
Bill Wendling 7ef7314d1a Third time's a charm.
The previous patches didn't match correctly. Also, we need to make sure that
the conditional is the same before doing the transformation.

llvm-svn: 58978
2008-11-10 06:59:06 +00:00
Mon P Wang 25f0106fd9 Added support for the following definition of shufflevector
<result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> 

llvm-svn: 58964
2008-11-10 04:46:22 +00:00
Bill Wendling 4fb13c051d Correction for the last patch. Should match the conditional in the first part
of the select match, not the select instruction itself.

llvm-svn: 58947
2008-11-09 23:37:53 +00:00
Bill Wendling 1579287550 The method of doing the matching with a 'select' instruction was wrong. The
original code was matching like this:

	if (match(A, m_Not(m_Value(B))))

B was already matched as a 'select' instruction. However, this isn't matching
what we think it's matching. It would match B as a 'Value', so basically
anything would match to it. In this case, a Constant matched. B was replaced
with a constant representation. And then the wrong value would be used in the
SelectInst::Create statement, causing a crash.

After thinking on this for a moment, and after Nick L. told me how the pattern
matching stuff was supposed to work, the solution was to match NOT an m_Value,
but an m_Select.

llvm-svn: 58946
2008-11-09 23:17:42 +00:00
Nuno Lopes 2e42927e7c fix leakage of ValueNumbering
llvm-svn: 58933
2008-11-09 12:45:23 +00:00
Bill Wendling 3f547be28f If the LHS of the FCMP is coming from a UIToFP instruction, then we don't want
to generate signed ICMP instructions to replace the FCMP. This would violate
the following:

define i1 @test1(i32 %val) {
  %1 = uitofp i32 %val to double
  %2 = fcmp ole double %1, 0.000000e+00
  ret i1 %2
}

would be transformed into:

define i1 @test1(i32 %val) {
  %1 = icmp slt i33 %val, 1
  ret i1 %1
}

which is obviously wrong. This patch modifes InstCombiner::FoldFCmp_IntToFP_Cst
to handle when the LHS comes from UIToFP.

llvm-svn: 58929
2008-11-09 04:26:50 +00:00
Mon P Wang 5ca2ec65bd Fixed scalarizing an extract subvector and prevent an infinite loop
when simplify a vector. 

llvm-svn: 58820
2008-11-06 22:52:21 +00:00
Oscar Fuentes 076e048cf7 CMake: updated list of source files.
llvm-svn: 58736
2008-11-05 00:11:22 +00:00
Dan Gohman 8cdea717a3 Add a new pass to simplify specific half_powr function calls. This is
a specialized pass that it not likely to be generally useful.

llvm-svn: 58732
2008-11-04 23:41:45 +00:00
Dale Johannesen 0a7b4f5800 Allow SROA of vectors. Removing this caused a
huge performance regression in something we care
about.  This may not be final fix.

llvm-svn: 58718
2008-11-04 20:54:03 +00:00
Devang Patel fe57d109b6 Ignore conditions that are outside the loop.
llvm-svn: 58631
2008-11-03 19:38:07 +00:00
Devang Patel c1631db93b Turn floating point IVs into integer IVs where possible.
This allows SCEV users to effectively calculate trip count.
LSR later on transforms back integer IVs to floating point IVs
later on to avoid int-to-float casts inside the loop.

llvm-svn: 58625
2008-11-03 18:32:19 +00:00
Nick Lewycky d73806a9cc Replace explicit loop with utility function.
llvm-svn: 58593
2008-11-03 03:49:14 +00:00
Nick Lewycky 8d8acf327b Fix demanded bits analysis with srem by negative number. Based on a patch
by Richard Osborne.

llvm-svn: 58555
2008-11-02 02:41:50 +00:00
Dan Gohman 83eea0b17f Fix this recently moved code to use the correct type. CI is now a
ConstantInt, and SI is the original cast instruction. This fixes
PR2996.

llvm-svn: 58549
2008-11-02 00:17:33 +00:00
Dan Gohman 13cbcf1c18 Canonicalize sext(i1) to i1?-1:0, and update various instcombine
optimizations accordingly.

llvm-svn: 58457
2008-10-30 20:40:10 +00:00
Dan Gohman 2c34c130bf (A & sext(C)) | (B & ~sext(C) -> C ? A : B
llvm-svn: 58351
2008-10-28 22:38:57 +00:00