Commit Graph

3063 Commits

Author SHA1 Message Date
Dan Gohman eb6be650ce Teach IndVarSimplify to optimize code using the C "int" type for
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.

Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.

This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.

Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.

llvm-svn: 64407
2009-02-12 22:19:27 +00:00
Dan Gohman 656b097b8a Add a utility function to LoopInfo to return the exit block
when the loop has exactly one exit, and make use of it in
LoopIndexSplit.

llvm-svn: 64388
2009-02-12 18:08:24 +00:00
Dan Gohman e0d32c490a This code doesn't actually use the ExitingBlocks list.
llvm-svn: 64376
2009-02-12 16:36:26 +00:00
Chris Lattner 096f44de61 improve naming of values in GVN, patch by Jay Foad!
llvm-svn: 64363
2009-02-12 07:00:35 +00:00
Chris Lattner 5297c63565 fix PR3537: if resetting bbi back to the start of a block, we need to
forget about already inserted expressions.

llvm-svn: 64362
2009-02-12 06:56:08 +00:00
Nick Lewycky b92c4d72a7 Don't mark all args to strtod and friends as nocapture.
llvm-svn: 64352
2009-02-12 03:18:34 +00:00
Nate Begeman 318aea93bf the two non-mask arguments to a shufflevector must be the same width, but they do not have to be the same
width as the result value.

llvm-svn: 64335
2009-02-11 22:36:25 +00:00
Devang Patel da1a632a87 Use early exits. Reduce indentation.
llvm-svn: 64226
2009-02-10 19:28:07 +00:00
Devang Patel caf4485781 Enable scalar replacement of AllocaInst whose one of the user is dbg info.
llvm-svn: 64207
2009-02-10 07:00:59 +00:00
Dale Johannesen cd19967754 Fix PR 3471, and some cleanups.
llvm-svn: 64177
2009-02-09 22:14:15 +00:00
Bill Wendling 415515077b Mistakenly turned this on.
llvm-svn: 64065
2009-02-08 01:32:00 +00:00
Bill Wendling 5469ec1072 Revert r63999. It was breaking self-hosting builds.
llvm-svn: 64062
2009-02-08 00:58:05 +00:00
Mon P Wang 21eb52a74f Instrcombine should not change load(cast p) to cast(load p) if the cast
changes the address space of the pointer.

llvm-svn: 64035
2009-02-07 22:19:29 +00:00
Mike Stump f009a51794 Insert space to avoid warning and make code more readable.
llvm-svn: 64003
2009-02-07 03:36:02 +00:00
Devang Patel 7cb8df4ce7 Ignore DbgInfoIntrinsics.
llvm-svn: 63923
2009-02-06 06:19:06 +00:00
Chris Lattner bbbb74372b fix PR3489, use bits instead of bytes.
llvm-svn: 63916
2009-02-06 04:34:07 +00:00
Devang Patel 409b794cfe Ignore dbg intrinsics while propagating conditional expression info. Take 2.
llvm-svn: 63898
2009-02-05 23:32:52 +00:00
Devang Patel 02f58e1e8d Revert rev. 63876. It is causing llvm-gcc bootstrap failure.
llvm-svn: 63888
2009-02-05 21:46:41 +00:00
Devang Patel 58cb603d2a Remove dead blocks in the end.
llvm-svn: 63880
2009-02-05 19:59:42 +00:00
Devang Patel 5922e26d1a Ignore dbg intrinsics while propagating conditional expression info.
llvm-svn: 63876
2009-02-05 19:15:39 +00:00
Devang Patel 43a1161379 If "optimize for size" attribute is set then block non-trivial loop unswitches but allow trivial loop unswitches.
llvm-svn: 63670
2009-02-03 22:04:27 +00:00
Chris Lattner ef37dc8511 teach "convert from scalar" to handle loads of fca's.
llvm-svn: 63659
2009-02-03 21:08:45 +00:00
Chris Lattner f5df53cb46 refactor the interface to ConvertUsesOfLoadToScalar,
renaming it to ConvertScalar_ExtractValue

llvm-svn: 63658
2009-02-03 21:01:03 +00:00
Chris Lattner 576baa4adf convert ConvertUsesOfLoadToScalar to use IRBuilder,
no functionality change.

llvm-svn: 63652
2009-02-03 19:45:44 +00:00
Chris Lattner c1fb96d347 switch ConvertScalar_InsertValue to use an IRBuilder, no
functionality change.

llvm-svn: 63651
2009-02-03 19:41:50 +00:00
Chris Lattner 18f56c295c make scalar conversion handle stores of first class
aggregate values.  loads are not yet handled (coming
soon to an sroa near you).

llvm-svn: 63649
2009-02-03 19:30:11 +00:00
Chris Lattner 73eff2e6e8 Make SROA produce a vector only when the alloca is actually
accessed at least once as a vector.  This prevents it from
compiling the example in not-a-vector into:

define double @test(double %A, double %B) {
	%tmp4 = insertelement <7 x double> undef, double %A, i32 0
	%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
	%tmp2 = extractelement <7 x double> %tmp, i32 4
	ret double %tmp2
}

instead, producing the integer code.  Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.

llvm-svn: 63638
2009-02-03 18:15:05 +00:00
Evan Cheng 8542caa3f7 APInt'fy SimplifyDemandedVectorElts so it can analyze vectors with more than 64 elements.
llvm-svn: 63631
2009-02-03 10:05:09 +00:00
Chris Lattner 80810b4c2d add another case of undefined behavior without crashing, PR3466.
llvm-svn: 63620
2009-02-03 07:08:57 +00:00
Chris Lattner 6aa6b1f263 Teach ConvertUsesToScalar to handle memset, allowing it to handle
crazy cases like:

struct f {  int A, B, C, D, E, F; };
short test4() {
  struct f A;
  A.A = 1;
  memset(&A.B, 2, 12);
  return A.C;
}

llvm-svn: 63596
2009-02-03 02:01:43 +00:00
Chris Lattner 09b65ab288 rearrange how SRoA handles promotion of allocas to vectors.
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type 
itself.  This allows us to un-xfail 
test/CodeGen/X86/vec_ins_extract.ll.

llvm-svn: 63590
2009-02-03 01:30:09 +00:00
Chris Lattner 43cecd7c26 inline SROA::ConvertToScalar, no functionality change.
llvm-svn: 63544
2009-02-02 20:44:45 +00:00
Chris Lattner 18eba4f211 Fix a bug which caused us to miscompile a couple of Ada
tests.  Thanks for the beautiful reduced testcase Duncan!

llvm-svn: 63529
2009-02-02 18:02:59 +00:00
Duncan Sands 6f361ff345 Fix a comment (bytes -> bits), reformat a comment
and remove trailing whitespace.  No functionality
change.

llvm-svn: 63511
2009-02-02 10:06:20 +00:00
Duncan Sands 33d6e97e33 Fix an obvious thinko.
llvm-svn: 63510
2009-02-02 09:53:14 +00:00
Chris Lattner 1aafe4cece reduce indentation, (~XorCST->getValue()).isSignBit() -> isMaxSignedValue()
llvm-svn: 63500
2009-02-02 07:15:30 +00:00
Nick Lewycky f23908151a Reinstate this optimization to fold icmp of xor when possible. Don't try to
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.

llvm-svn: 63487
2009-01-31 21:30:05 +00:00
Chris Lattner 9e2b9f3234 Fix PR3452 (an infinite loop bootstrapping) by disabling the recent
improvements to the EvaluateInDifferentType code.  This code works 
by just inserted a bunch of new code and then seeing if it is 
useful.  Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point.  Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.

llvm-svn: 63483
2009-01-31 19:05:27 +00:00
Chris Lattner 76a63ed099 now that all the pieces are in place, teach instcombine's
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it.  This allows
it to simplify the code in multi-use-or.ll into a single 'add 
double'.

This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch.  When working on fixing those bugs,
this should be disabled.

llvm-svn: 63481
2009-01-31 08:40:03 +00:00
Chris Lattner 3e2cb66c56 simplify/clarify control flow and improve comments, no functionality change.
llvm-svn: 63480
2009-01-31 08:24:16 +00:00
Chris Lattner 83c6a141b8 make some fairly meaty internal changes to how SimplifyDemandedBits works.
Now, if it detects that "V" is the same as some other value, 
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
   uses and can be simplified in one context, but not all.

#2 isn't implemented yet, this patch should have no functionality change.

llvm-svn: 63479
2009-01-31 08:15:18 +00:00
Chris Lattner 585cfb2ce7 minor cleanups
llvm-svn: 63477
2009-01-31 07:26:06 +00:00
Chris Lattner 94cfb281c3 make sure to set Changed=true when instcombine hacks on the code,
not doing so prevents it from properly iterating and prevents it
from deleting the entire body of dce-iterate.ll

llvm-svn: 63476
2009-01-31 07:04:22 +00:00
Chris Lattner ec99c46d44 Simplify and generalize the SROA "convert to scalar" transformation to
be able to handle *ANY* alloca that is poked by loads and stores of 
bitcasts and GEPs with constant offsets.  Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc.  This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.

One case that is pretty great is that we compile 
2006-11-07-InvalidArrayPromote.ll into:

define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
	%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
	%tmp105 = bitcast <4 x i32> %tmp10 to i128
	%tmp1056 = zext i128 %tmp105 to i256	
	%tmp.upgrd.43 = lshr i256 %tmp1056, 96
	%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32	
	ret i32 %tmp.upgrd.44
}

which turns into:

_func:
	subl	$28, %esp
	cvttps2dq	%xmm1, %xmm0
	movaps	%xmm0, (%esp)
	movl	12(%esp), %eax
	addl	$28, %esp
	ret

Which is pretty good code all things considering :).

One effect of this is that SROA will start generating arbitrary bitwidth 
integers that are a multiple of 8 bits.  In the case above, we got a 
256 bit integer, but the codegen guys assure me that it can handle the 
simple and/or/shift/zext stuff that we're doing on these operations.

This addresses rdar://6532315

llvm-svn: 63469
2009-01-31 02:28:54 +00:00
Chris Lattner df17987c19 Fix some issues with volatility, move "CanConvertToScalar" check
after the others.

llvm-svn: 63227
2009-01-28 20:16:43 +00:00
Duncan Sands 5a913d61e3 Rename getAnalysisToUpdate to getAnalysisIfAvailable.
llvm-svn: 63198
2009-01-28 13:14:17 +00:00
Mon P Wang 3537a62704 Fixed optimization of combining two shuffles where the first shuffle inputs
has a different number of elements than the output.

llvm-svn: 62998
2009-01-26 04:39:00 +00:00
Chris Lattner 9449991c4f Handle single-entry phi nodes gracefully in condprop.
llvm-svn: 62985
2009-01-26 02:18:20 +00:00
Chris Lattner 7b6647c178 Fix PR3408 by making a non-obvious assumption very obvious, and
handling the flaw inherent in that assumption.  :)

llvm-svn: 62984
2009-01-26 02:11:30 +00:00
Chris Lattner 57cb472b56 More cleanups and simplifications, no functionality change.
llvm-svn: 62983
2009-01-26 01:57:01 +00:00