llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	0a2e11260e	Don't unswitch really large loops even if they are mostly filled with empty blocks. llvm-svn: 28959	2006-06-28 16:38:55 +00:00
Andrew Lenharth	ebfa24ee9a	Catch more function pointer casting problems Remove the Function pointer cast in these calls, converting it to a cast of argument. %tmp60 = tail call int cast (int (ulong)* %str to int (int))( int 10 ) %tmp60 = tail call int cast (int (ulong) %str to int (int)*)( uint %tmp51 ) llvm-svn: 28953	2006-06-28 01:01:52 +00:00
Owen Anderson	bb3ae5eb8f	Fix for 2006-06-27-DeadSwitchCase.ll Be more careful when updating Phi nodes after eliminating dead switch cases. Fix proposed by Chris. llvm-svn: 28947	2006-06-27 22:26:09 +00:00
Owen Anderson	b659bb4196	De-pessimize the handling of LCSSA Phi nodes in IndVarSimplify. Hopefully this will make Shootout-C/nestedloop faster. llvm-svn: 28924	2006-06-27 02:17:08 +00:00
Chris Lattner	49771a0462	random code cleanups, no functionality change llvm-svn: 28914	2006-06-26 19:10:05 +00:00
Owen Anderson	f52351e50f	Make LoopUnswitch able to unswitch loops with live-out values by taking advantage of LCSSA. This results several times the number of unswitchings occurring on tests such and timberwolfmc, unix-tbl, and ldecod. llvm-svn: 28912	2006-06-26 07:44:36 +00:00
Chris Lattner	053fb9319d	Fix IndVarsSimplify/2006-06-16-Indvar-LCSSA-Crash.ll, a case where a "LCSSA" phi node causes indvars to break dominance properties. This fixes causes indvars to avoid inserting aggressive code in this case, instead indvars should be fixed to be more aggressive in the face of lcssa phi's. llvm-svn: 28850	2006-06-17 01:02:31 +00:00
Chris Lattner	c482a9e31a	Implement Transforms/InstCombine/bswap.ll, turning common shift/and/or bswap idioms into bswap intrinsics. llvm-svn: 28803	2006-06-15 19:07:26 +00:00
Chris Lattner	0c4f5a655a	Fix Transforms/LoopUnswitch/2006-06-13-SingleEntryPHI.ll, a loop unswitch bug exposed by the recent lcssa work. llvm-svn: 28779	2006-06-14 04:46:17 +00:00
Owen Anderson	fd0a3d6e5c	Reapply my 6/9 changes. The bug Evan saw no longer occurs. llvm-svn: 28759	2006-06-12 21:49:21 +00:00
Evan Cheng	1b6e310e6f	Back out Owen's 6/9 changes. They broke MultiSource/Benchmarks/Prolangs-C/bison (and perhaps others). llvm-svn: 28747	2006-06-11 09:32:57 +00:00
Owen Anderson	b1dc1d44f8	Add LCSSA as a requirement for LoopUnswitch, and assert that LoopUnswitch preserves LCSSA. llvm-svn: 28739	2006-06-09 18:40:32 +00:00
Evan Cheng	398f70292c	RewriteExpr, either the new PHI node of induction variable or the post-increment value, should be first cast to the appropriated type (to the type of the common expr). Otherwise, the rewrite of a use based on (common + iv) may end up with an incorrect type. llvm-svn: 28735	2006-06-09 00:12:42 +00:00
Reid Spencer	d4b795902c	Fix a spello in a comment. llvm-svn: 28714	2006-06-07 21:24:10 +00:00
Chris Lattner	95cebb082f	Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on PPC/altivec llvm-svn: 28698	2006-06-06 22:26:02 +00:00
Chris Lattner	540886f0ae	Remove unneeded hook. Patch by Anton K. Thanks! llvm-svn: 28664	2006-06-02 19:11:46 +00:00
Chris Lattner	f905a7b994	Silence a -pedantic warning. llvm-svn: 28632	2006-06-01 17:16:21 +00:00
Chris Lattner	1df0e98ac2	Swap the order of operands created here. For +&\|^, the order doesn't matter, but for sub, it really does! Fix fixes a miscompilation of fibheap_cut in llvmgcc4. llvm-svn: 28600	2006-05-31 21:14:00 +00:00
Chris Lattner	dab43b2b0e	Implement Transforms/InstCombine/store.ll:test2. llvm-svn: 28503	2006-05-26 19:19:20 +00:00
Chris Lattner	0e47716e69	Transform things like (splat(splat)) -> splat llvm-svn: 28490	2006-05-26 00:29:06 +00:00
Chris Lattner	12249be286	Introduce a helper function that simplifies interpretation of shuffle masks. No functionality change. llvm-svn: 28489	2006-05-25 23:48:38 +00:00
Chris Lattner	99155be33f	Turn (cast (shuffle (cast)) -> shuffle (cast) if it reduces the # casts in the program. This exposes more opportunities for the instcombiner, and implements vec_shuffle.ll:test6 llvm-svn: 28487	2006-05-25 23:24:33 +00:00
Chris Lattner	83f6578b0c	extract element from a shuffle vector can be trivially turned into an extractelement from the SV's source. This implement vec_shuffle.ll:test[45] llvm-svn: 28485	2006-05-25 22:53:38 +00:00
Chris Lattner	d0622b6894	Silence a bogus gcc warning llvm-svn: 28422	2006-05-20 23:14:03 +00:00
Chris Lattner	e4cb4768fa	Declare that lowerinvoke doesn't interact with other lowering passes. Patch written by Domagoj Babic! llvm-svn: 28367	2006-05-17 21:05:27 +00:00
Evan Cheng	18d0438148	Backing out last check-in for now. It's causing an infinite loop gccas lencode. llvm-svn: 28284	2006-05-14 06:46:03 +00:00
Chris Lattner	3987a8532d	Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit bitfield now gives this code: _plus: lwz r2, 0(r3) rlwimi r2, r2, 0, 1, 31 xoris r2, r2, 32768 stw r2, 0(r3) blr instead of this: _plus: lwz r2, 0(r3) srwi r4, r2, 31 slwi r4, r4, 31 addis r4, r4, -32768 rlwimi r2, r4, 0, 0, 0 stw r2, 0(r3) blr this can obviously still be improved. llvm-svn: 28275	2006-05-13 02:16:08 +00:00
Chris Lattner	1ebbe6a22e	Implement simple promotion for cast elimination in instcombine. This is currently very limited, but can be extended in the future. For example, we now compile: uint %test30(uint %c1) { %c2 = cast uint %c1 to ubyte %c3 = xor ubyte %c2, 1 %c4 = cast ubyte %c3 to uint ret uint %c4 } to: _xor: movzbl 4(%esp), %eax xorl $1, %eax ret instead of: _xor: movb $1, %al xorb 4(%esp), %al movzbl %al, %eax ret More impressively, we now compile: struct B { unsigned bit : 1; }; void xor(struct B *b) { b->bit = b->bit ^ 1; } To (X86/PPC): _xor: movl 4(%esp), %eax xorl $-2147483648, (%eax) ret _xor: lwz r2, 0(r3) xoris r2, r2, 32768 stw r2, 0(r3) blr instead of (X86/PPC): _xor: movl 4(%esp), %eax movl (%eax), %ecx movl %ecx, %edx shrl $31, %edx # TRUNCATE movb %dl, %dl xorb $1, %dl movzbl %dl, %edx andl $2147483647, %ecx shll $31, %edx orl %ecx, %edx movl %edx, (%eax) ret _xor: lwz r2, 0(r3) srwi r4, r2, 31 xori r4, r4, 1 rlwimi r2, r4, 31, 0, 0 stw r2, 0(r3) blr This implements InstCombine/cast.ll:test30. llvm-svn: 28273	2006-05-13 02:06:03 +00:00
Chris Lattner	1443bc52be	Refactor some code, making it simpler. When doing the initial pass of constant folding, if we get a constantexpr, simplify the constant expr like we would do if the constant is folded in the normal loop. This fixes the missed-optimization regression in Transforms/InstCombine/getelementptr.ll last night. llvm-svn: 28224	2006-05-11 17:11:52 +00:00
Chris Lattner	a36ee4ea34	Two changes: 1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable blocks (due to constants in conditional branches/switches) to the worklist. This causes them to be deleted before instcombine starts up, leading to better optimization. 2. In the prepass over instructions, do trivial constprop/dce as we go. This has the effect of improving the effectiveness of #1. In addition, it significantly speeds up instcombine on test cases with large amounts of constant folding code (for example, that produced by code specialization or partial evaluation). In one example, it speeds up instcombine from 0.0589s to 0.0224s with a release build (a 2.6x speedup). llvm-svn: 28215	2006-05-10 19:00:36 +00:00
Chris Lattner	4fe87d67c4	Patch to make some xforms preserve each other. Patch contributed by Domagoj Babic! llvm-svn: 28181	2006-05-09 04:13:41 +00:00
Chris Lattner	1d441adfbf	Move some code around. Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation only apply when both casts really will cause code to be generated. If one or both doesn't, then this xform doesn't remove a cast. This fixes Transforms/InstCombine/2006-05-06-Infloop.ll llvm-svn: 28141	2006-05-06 09:00:16 +00:00
Chris Lattner	e745c7de0e	Fix an infinite loop compiling oggenc last night. llvm-svn: 28128	2006-05-05 20:51:30 +00:00
Chris Lattner	3af1053488	Implement InstCombine/cast.ll:test29 llvm-svn: 28126	2006-05-05 06:39:07 +00:00
Chris Lattner	fb29692055	Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll llvm-svn: 28101	2006-05-04 17:33:35 +00:00
Chris Lattner	2d3a02725d	Add pass ID's for various passes, so they can be AddRequiredID. Patch by Domagoj Babic! llvm-svn: 28048	2006-05-02 04:24:36 +00:00
Chris Lattner	655d08fda8	Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll llvm-svn: 28019	2006-04-28 22:21:41 +00:00
Chris Lattner	e63d808b6e	Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll llvm-svn: 28007	2006-04-28 04:14:49 +00:00
Chris Lattner	b6cb64b7e6	Add support for inserting undef into a vector. This implements Transforms/InstCombine/vec_insert_to_shuffle.ll llvm-svn: 27997	2006-04-27 21:14:21 +00:00
Chris Lattner	dae49df407	Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll llvm-svn: 27912	2006-04-20 20:48:50 +00:00
Andrew Lenharth	f89e630b2f	Make code match cvs commit message :) llvm-svn: 27881	2006-04-20 15:41:37 +00:00
Andrew Lenharth	61eae29ad6	If we can convert the return pointer type into an integer that IntPtrType can be converted to losslessly, we can continue the conversion to a direct call. llvm-svn: 27880	2006-04-20 14:56:47 +00:00
Chris Lattner	36dd7c98d1	Turn x86 unaligned load/store intrinsics into aligned load/store instructions if the pointer is known aligned. llvm-svn: 27781	2006-04-17 22:26:56 +00:00
Chris Lattner	9095186deb	Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform Make the insert/extract elt -> shuffle code more aggressive. This fixes CodeGen/PowerPC/vec_shuffle.ll llvm-svn: 27728	2006-04-16 00:51:47 +00:00
Chris Lattner	34cebe785d	Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask'). llvm-svn: 27727	2006-04-16 00:03:56 +00:00
Chris Lattner	39fac448d6	significant cleanups to code that uses insert/extractelt heavily. This builds maximal shuffles out of them where possible. llvm-svn: 27717	2006-04-15 01:39:45 +00:00
Chris Lattner	3323ce165d	Teach scalarrepl to promote unions of vectors and floats, producing insert/extractelement operations. This implements Transforms/ScalarRepl/vector_promote.ll llvm-svn: 27710	2006-04-14 21:42:41 +00:00
Reid Spencer	13a1a7a4a6	Get rid of a signed/unsigned compare warning. llvm-svn: 27625	2006-04-12 19:28:15 +00:00
Chris Lattner	b19a5c661b	Turn casts into getelementptr's when possible. This enables SROA to be more aggressive in some cases where LLVMGCC 4 is inserting casts for no reason. This implements InstCombine/cast.ll:test27/28. llvm-svn: 27620	2006-04-12 18:09:35 +00:00
Chris Lattner	2d37f920ad	Implement vec_shuffle.ll:test3 llvm-svn: 27573	2006-04-10 23:06:36 +00:00
Chris Lattner	fbb77a408b	Implement InstCombine/vec_shuffle.ll:test[12] llvm-svn: 27571	2006-04-10 22:45:52 +00:00
Chris Lattner	17bd60588c	Add supprot for shufflevector llvm-svn: 27513	2006-04-08 01:19:12 +00:00
Chris Lattner	e79d249c29	Lower vperm(x,y, mask) -> shuffle(x,y,mask) if mask is constant. This allows us to compile oh-so-realistic stuff like this: vec_vperm(A, B, (vector unsigned char){14}); to: vspltb v0, v0, 14 instead of: vspltisb v0, 14 vperm v0, v2, v1, v0 llvm-svn: 27452	2006-04-06 19:19:17 +00:00
Chris Lattner	caba72b6ff	vector casts of casts are eliminable. Transform this: %tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1] %tmp = cast <4 x int> %tmp to <4 x float> ; <<4 x float>> [#uses=1] into: %tmp = cast <4 x uint> %tmp to <4 x float> ; <<4 x float>> [#uses=1] llvm-svn: 27355	2006-04-02 05:43:13 +00:00
Chris Lattner	ebca476b27	Allow transforming this: %tmp = cast <4 x uint>* %testData to <4 x int>* ; <<4 x int>> [#uses=1] %tmp = load <4 x int> %tmp ; <<4 x int>> [#uses=1] to this: %tmp = load <4 x uint>* %testData ; <<4 x uint>> [#uses=1] %tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1] llvm-svn: 27353	2006-04-02 05:37:12 +00:00
Chris Lattner	f42d0aeda1	Turn altivec lvx/stvx intrinsics into loads and stores. This allows the elimination of one load from this: int AreSecondAndThirdElementsBothNegative( vector float in ) { #define QNaN 0x7FC00000 const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN ); vector float test = vec_ld( 0, (float) &testData ); return ! vec_any_ge( test, *in ); } Now generating: _AreSecondAndThirdElementsBothNegative: mfspr r2, 256 oris r4, r2, 49152 mtspr 256, r4 li r4, lo16(LCPI1_0) lis r5, ha16(LCPI1_0) addi r6, r1, -16 lvx v0, r5, r4 stvx v0, 0, r6 lvx v1, 0, r3 vcmpgefp. v0, v0, v1 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 xori r3, r3, 1 cntlzw r3, r3 srwi r3, r3, 5 mtspr 256, r2 blr llvm-svn: 27352	2006-04-02 05:30:25 +00:00
Chris Lattner	6cf4914fd4	Fix InstCombine/2006-04-01-InfLoop.ll llvm-svn: 27330	2006-04-01 22:05:01 +00:00
Chris Lattner	dcd0792622	Fold A^(B&A) -> (B&A)^A Fold (B&A)^A == ~B & A This implements InstCombine/xor.ll:test2[56] llvm-svn: 27328	2006-04-01 08:03:55 +00:00
Chris Lattner	8d1d8d364c	If we can look through vector operations to find the scalar version of an extract_element'd value, do so. llvm-svn: 27323	2006-03-31 23:01:56 +00:00
Chris Lattner	92346c315e	extractelement(undef,x) -> undef llvm-svn: 27300	2006-03-31 18:25:14 +00:00
Chris Lattner	612fa8e6f3	Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll llvm-svn: 27261	2006-03-30 22:02:40 +00:00
Chris Lattner	d70d9f5b24	Don't crash on packed logical ops llvm-svn: 27125	2006-03-25 21:58:26 +00:00
Chris Lattner	f365f5f0c1	Fix spello llvm-svn: 27052	2006-03-24 07:14:34 +00:00
Chris Lattner	5821a6a17a	add the actual cost to the debug info llvm-svn: 27051	2006-03-24 07:14:00 +00:00
Jim Laskey	83f99115db	Can't combine anymore - we don't have a chain through llvm.dbg intrinsics. llvm-svn: 26992	2006-03-23 18:10:42 +00:00
Chris Lattner	7d80b4f366	silence a bogus gcc warning llvm-svn: 26953	2006-03-22 17:27:24 +00:00
Chris Lattner	d783c76c18	Teach cee to propagate through switch statements. This implements Transforms/CorrelatedExprs/switch.ll Patch contributed by Eric Kidd! llvm-svn: 26872	2006-03-19 19:37:24 +00:00
Evan Cheng	c28282bd87	- Fixed a bogus if condition. - Added more debugging info. - Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride. llvm-svn: 26841	2006-03-18 08:03:12 +00:00
Evan Cheng	f09f0ebd48	Sort StrideOrder so we can process the smallest strides first. This allows for more IV reuses. llvm-svn: 26837	2006-03-18 00:44:49 +00:00
Evan Cheng	4520698820	Allow users of iv / stride to be rewritten with expression that is a multiply of a smaller stride even if they have a common loop invariant expression part. llvm-svn: 26828	2006-03-17 19:52:23 +00:00
Evan Cheng	3df447d354	For each loop, keep track of all the IV expressions inserted indexed by stride. For a set of uses of the IV of a stride which is a multiple of another stride, do not insert a new IV expression. Rather, reuse the previous IV and rewrite the uses as uses of IV expression multiplied by the factor. e.g. x = 0 ...; x ++ y = 0 ...; y += 4 then use of y can be rewritten as use of 4*x for x86. llvm-svn: 26803	2006-03-16 21:53:05 +00:00
Chris Lattner	c5f866bb4a	Implement a FIXME, recusively reassociating AAB + AAC --> A(AB+AC) --> A(A*(B+C)) This implements Reassociate/mul-factor3.ll llvm-svn: 26757	2006-03-14 16:04:29 +00:00
Chris Lattner	2fc319d444	extract some code into a method, no functionality change llvm-svn: 26755	2006-03-14 07:11:11 +00:00
Chris Lattner	d6bde46d85	Promote shifts by a constant to multiplies so that we can reassociate (x<<1)+(y<<1) -> (X+Y)<<1. This implements Transforms/Reassociate/shift-factor.ll llvm-svn: 26753	2006-03-14 06:55:18 +00:00
Evan Cheng	c567c4efbb	Added target lowering hooks which LSR consults to make more intelligent transformation decisions. llvm-svn: 26738	2006-03-13 23:14:23 +00:00
Chris Lattner	fc34f8bb48	Fix a miscompilation of 188.ammp with the new CFE. 188.ammp is accessing arrays out of range in a horrible way, but we shouldn't break it anyway. Details in the comments. llvm-svn: 26606	2006-03-08 01:05:29 +00:00
Chris Lattner	53ef5a032c	Teach the alignment handling code to look through constant expr casts and GEPs llvm-svn: 26580	2006-03-07 01:28:57 +00:00
Chris Lattner	82f2ef20b6	Teach instcombine to increase the alignment of memset/memcpy/memmove when the pointer is known to come from either a global variable, alloca or malloc. This allows us to compile this: P = malloc(28); memset(P, 0, 28); into explicit stores on PPC instead of a memset call. llvm-svn: 26577	2006-03-06 20:18:44 +00:00
Chris Lattner	6bc98653c2	Make vector narrowing more effective, implementing Transforms/InstCombine/vec_narrow.ll. This add support for narrowing extract_element(insertelement) also. llvm-svn: 26538	2006-03-05 00:22:33 +00:00
Chris Lattner	4c065091d8	Add factoring of multiplications, e.g. turning AA+AB into A*(A+B). Testcase here: Transforms/Reassociate/mulfactor.ll llvm-svn: 26524	2006-03-04 09:31:13 +00:00
Chris Lattner	32c01df299	Canonicalize (X+C1)C2 -> XC2+C1*C2 This implements Transforms/InstCombine/add.ll:test31 llvm-svn: 26519	2006-03-04 06:04:02 +00:00
Chris Lattner	681ef2f083	Change this to work with renamed intrinsics. llvm-svn: 26484	2006-03-03 01:34:17 +00:00
Chris Lattner	85dda9a2bd	Generalize the REM folding code to handle another case Nick Lewycky pointed out: realize the AND can provide factors and look through Casts. llvm-svn: 26469	2006-03-02 06:50:58 +00:00
Chris Lattner	c5b6c9a12a	Fix a regression in a patch from a couple of days ago. This fixes Transforms/InstCombine/2006-02-28-Crash.ll llvm-svn: 26427	2006-02-28 19:47:20 +00:00
Chris Lattner	b70f141893	Implement rem.ll:test[7-9] and PR712 llvm-svn: 26415	2006-02-28 05:49:21 +00:00
Chris Lattner	2a7c7b8bab	Simplify some code now that the RHS of a rem can't be 0 llvm-svn: 26413	2006-02-28 05:40:55 +00:00
Chris Lattner	0de4a8d7b7	Rearrange some code, fold "rem X, 0", implementing rem.ll:test6 llvm-svn: 26411	2006-02-28 05:30:45 +00:00
Chris Lattner	c7bfed0f7b	Merge two almost-identical pieces of code. Make this code more powerful by using ComputeMaskedBits instead of looking for an AND operand. This lets us fold this: int %test23(int %a) { %tmp.1 = and int %a, 1 %tmp.2 = seteq int %tmp.1, 0 %tmp.3 = cast bool %tmp.2 to int ;; xor tmp1, 1 ret int %tmp.3 } into: xor (and a, 1), 1 llvm-svn: 26396	2006-02-27 02:38:23 +00:00
Chris Lattner	f5c8a0b83f	Fold (A^B) == A -> B == 0 and (A-B) == A -> B == 0 llvm-svn: 26394	2006-02-27 01:44:11 +00:00
Chris Lattner	f78df7c14d	Fold (X\|C1)^C2 -> X^(C1\|C2) when possible. This implements InstCombine/or.ll:test23. llvm-svn: 26385	2006-02-26 19:57:54 +00:00
Chris Lattner	b580d26e7d	Fix a problem that Nate noticed that boils down to an over conservative check in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))". We now compile this loop: LBB1_1: ; no_exit add r6, r2, r3 subf r3, r2, r3 cmpwi cr0, r2, 0 addi r7, r5, 4 lwz r2, 0(r5) addi r4, r4, 1 blt cr0, LBB1_4 ; no_exit LBB1_3: ; no_exit mr r3, r6 LBB1_4: ; no_exit cmpwi cr0, r4, 16 mr r5, r7 bne cr0, LBB1_1 ; no_exit into this instead: LBB1_1: ; no_exit srawi r6, r2, 31 add r2, r2, r6 xor r6, r2, r6 addi r7, r5, 4 lwz r2, 0(r5) addi r4, r4, 1 add r3, r3, r6 cmpwi cr0, r4, 16 mr r5, r7 bne cr0, LBB1_1 ; no_exit llvm-svn: 26356	2006-02-24 18:05:58 +00:00
Chris Lattner	e5521db5bc	Fix Regression/Transforms/LoopUnswitch/2006-02-22-UnswitchCrash.ll, which caused SPASS to fail building last night. We can't trivially unswitch a loop if the exit block has phi nodes in it, because we don't know which predecessor to use. llvm-svn: 26320	2006-02-22 23:55:00 +00:00
Chris Lattner	8a5a324dac	Add some comments, simplify some code, and fix a bug that caused rewriting to rewrite with the wrong value. llvm-svn: 26311	2006-02-22 06:37:14 +00:00
Chris Lattner	c2e3a7a4ce	improved support for branch folding, still not enabled. llvm-svn: 26289	2006-02-18 07:57:38 +00:00
Jeff Cohen	0add83e969	Fix bugs identified by VC++. llvm-svn: 26287	2006-02-18 03:20:33 +00:00
Chris Lattner	19fa8ac938	Implement deletion of dead blocks, currently disabled. llvm-svn: 26285	2006-02-18 02:42:34 +00:00
Chris Lattner	cb853de534	a previous patch completely disabled trivial unswitching, this fixees it. Thanks to nate for pointing this out :) llvm-svn: 26280	2006-02-18 01:32:04 +00:00
Chris Lattner	29f771ba21	initial trivial support for folding branches that have now-constant destinations. llvm-svn: 26279	2006-02-18 01:27:45 +00:00
Chris Lattner	8e44ff50b0	When unswitching a loop, make sure to update loop info with exit blocks in the right loop. llvm-svn: 26277	2006-02-18 00:55:32 +00:00
Chris Lattner	baddba41c7	Fix loops where the header has an exit, fixing a loop-unswitch crash on crafty llvm-svn: 26258	2006-02-17 06:39:56 +00:00
Chris Lattner	6fd136239b	start of some new simplification code, not thoroughly tested, use at your own risk :) llvm-svn: 26248	2006-02-17 00:31:07 +00:00
Nate Begeman	8a77efe4f7	Rework the SelectionDAG-based implementations of SimplifyDemandedBits and ComputeMaskedBits to match the new improved versions in instcombine. Tested against all of multisource/benchmarks on ppc. llvm-svn: 26238	2006-02-16 21:11:51 +00:00
Chris Lattner	fa335f6083	Change SplitBlock to increment a BasicBlock::iterator, not an Instruction*. Apparently they do different things :) This fixes a testcase that nate reduced from spass. Also included are a couple minor code changes that don't affect the generated code at all. llvm-svn: 26235	2006-02-16 19:36:22 +00:00
Jeff Cohen	55f63f1b53	Fix VC++ warning. llvm-svn: 26228	2006-02-16 04:07:37 +00:00
Chris Lattner	ff42e81028	fix a bug where we unswitched the wrong way llvm-svn: 26225	2006-02-16 01:24:41 +00:00
Chris Lattner	fdff0bb43e	Implement trivial unswitching for switch stmts. This allows us to trivial unswitch this loop on 2 before sweating to unswitch on 1/3. void test4(int N, int i, int C, intP, intQ) { int j; for (j = 0; j < N; ++j) { switch (C) { // general unswitching. default: P[i+j] = 0; break; case 1: Q[i+j] = 0; break; case 3: P[i+j] = Q[i+j]; break; case 2: break; // TRIVIAL UNSWITCH on C==2 } } } llvm-svn: 26223	2006-02-15 22:52:05 +00:00
Chris Lattner	e5cb76d744	make "trivial" unswitching significantly more general. It can now handle this for example: for (j = 0; j < N; ++j) { // trivial unswitch if (C) P[i+j] = 0; } turning it into the obvious code without bothering to duplicate an empty loop. llvm-svn: 26220	2006-02-15 22:03:36 +00:00
Chris Lattner	65152d80ec	Checking the wrong value. This caused us to emit silly code like Y = seteq bool X, true instead of just using X :) llvm-svn: 26215	2006-02-15 19:05:52 +00:00
Chris Lattner	01db04efb0	more refactoring, no functionality change. llvm-svn: 26194	2006-02-15 01:44:42 +00:00
Chris Lattner	b0cbe7106e	pull some code out into a function llvm-svn: 26191	2006-02-15 00:07:43 +00:00
Chris Lattner	0b8ec1a132	Use statistics to keep track of what flavors of loops we are unswitching llvm-svn: 26157	2006-02-14 01:01:41 +00:00
Chris Lattner	8b10ab3002	Implement Instcombine/and.ll:test34 llvm-svn: 26155	2006-02-13 23:07:23 +00:00
Chris Lattner	7d8522884b	If any of the sign extended bits are demanded, the input sign bit is demanded for a sign extension. This fixes InstCombine/2006-02-13-DemandedMiscompile.ll and Ptrdist/bc. llvm-svn: 26152	2006-02-13 22:41:07 +00:00
Chris Lattner	68e7475777	Be careful not to request or look at bits shifted in from outside the size of the input. This fixes the mediabench/gsm/toast failure last night. llvm-svn: 26138	2006-02-13 06:09:08 +00:00
Chris Lattner	f5b4ef7f58	remove some more dead special case code llvm-svn: 26135	2006-02-12 08:07:37 +00:00
Chris Lattner	5b2edb1fca	Eliminate special case hacks that are superceded by general purpose hacks llvm-svn: 26134	2006-02-12 08:02:11 +00:00
Chris Lattner	ee0f280743	Three changes: 1. Teach GetConstantInType to handle boolean constants. 2. Teach instcombine to fold (compare X, CST) when X has known 0/1 bits. Testcase here: set.ll:test22 3. Improve the "(X >> c1) & C2 == 0" folding code to allow a noop cast between the shift and and. More aggressive bitfolding for other reasons was turning signed shr's into unsigned shr's, leaving the noop cast in the way. llvm-svn: 26131	2006-02-12 02:07:56 +00:00
Chris Lattner	0157e7f55b	Port the recent innovations in ComputeMaskedBits to SimplifyDemandedBits. This allows us to simplify on conditions where bits are not known, but they are not demanded either! This also fixes a couple of bugs in ComputeMaskedBits that were exposed during this work. In the future, swaths of instcombine should be removed, as this code subsumes a bunch of ad-hockery. llvm-svn: 26122	2006-02-11 09:31:47 +00:00
Chris Lattner	fbadd7e1ee	implement unswitching of loops with switch stmts and selects in them llvm-svn: 26114	2006-02-11 00:43:37 +00:00
Chris Lattner	f1b151684d	Update PHI nodes in successors of exit blocks. llvm-svn: 26113	2006-02-10 23:26:14 +00:00
Chris Lattner	fe4151efe7	Reform the unswitching code in terms of edge splitting, not block splitting. llvm-svn: 26112	2006-02-10 23:16:39 +00:00
Chris Lattner	ec6b40a093	Fix a case where UnswitchTrivialCondition broke critical edges with phi's in the successors llvm-svn: 26108	2006-02-10 19:08:15 +00:00
Chris Lattner	6e263155a6	add some notes, move some code around. Implement unswitching of loops with branches on partially invariant computations. llvm-svn: 26104	2006-02-10 02:30:37 +00:00
Chris Lattner	4935417a84	Move code around to be more logical, no functionality change. llvm-svn: 26103	2006-02-10 02:01:22 +00:00
Chris Lattner	3fc3148b85	When unswitching a trivial loop, do admit we are doing it! :) llvm-svn: 26102	2006-02-10 01:36:35 +00:00
Chris Lattner	ed7a67b0de	Implement unconditional unswitching of 'trivial' loops, those loops that contain branches in their entry block that control whether or not the loop is a noop or not. llvm-svn: 26101	2006-02-10 01:24:09 +00:00
Chris Lattner	4f0e66df6a	Simplify control flow a bit, note that unswitch preserves canonical loop form llvm-svn: 26098	2006-02-09 22:15:42 +00:00
Chris Lattner	8976219850	Make the threshold a parameter llvm-svn: 26093	2006-02-09 20:15:48 +00:00
Chris Lattner	2826e0511b	Simplify the loop-unswitch pass, by not even trying to unswitch loops with uses of loop values outside the loop. We need loop-closed SSA form to do this right, or to use SSA rewriting if we really care. llvm-svn: 26089	2006-02-09 19:14:52 +00:00
Chris Lattner	24cd2fa269	Fix 80-column violations llvm-svn: 26088	2006-02-09 07:41:14 +00:00
Chris Lattner	4534dd59a3	Enhance MVIZ in three ways: 1. Teach it new tricks: in particular how to propagate through signed shr and sexts. 2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero. 3. Teach instcombine (AND X, C) to fold when we know all C bits of X. This implements Regression/Transforms/InstCombine/bittest.ll, and allows future things to be simplified. llvm-svn: 26087	2006-02-09 07:38:58 +00:00
Chris Lattner	ab2dc4d70d	Simplify some code, reducing calls to MaskedValueIsZero. Implement a minor optimization where we reduce the number of bits in AND masks when possible. llvm-svn: 26056	2006-02-08 07:34:50 +00:00
Chris Lattner	5997cf9381	Use EraseInstFromFunction in a few cases to put the uses of the removed instruction onto the worklist (in case they are now dead). Add a really trivial local DSE implementation to help out bitfield code. We now fold this: struct S { unsigned char a : 1, b : 1, c : 1, d : 2, e : 3; S(); }; S::S() : a(0), b(0), c(1), d(0), e(6) {} to this: void %_ZN1SC1Ev(%struct.S* %this) { entry: %tmp.1 = getelementptr %struct.S* %this, int 0, uint 0 store ubyte 38, ubyte* %tmp.1 ret void } much earlier (in gccas instead of only in gccld after DSE runs). llvm-svn: 26050	2006-02-08 03:25:32 +00:00
Chris Lattner	06a0ed1ee0	Implement some more interesting select sccp cases. This implements: test/Regression/Transforms/SCCP/select.ll llvm-svn: 26049	2006-02-08 02:38:11 +00:00
Chris Lattner	ddba3289b5	Fix a problem in my patch yesterday, causing a miscompilation of 176.gcc llvm-svn: 26045	2006-02-08 01:20:23 +00:00
Chris Lattner	44314827d6	Fix Transforms/InstCombine/2006-02-07-SextZextCrash.ll llvm-svn: 26040	2006-02-07 19:07:40 +00:00
Chris Lattner	92a6865321	Generalize MaskedValueIsZero into a ComputeMaskedNonZeroBits function, which is just as efficient as MVIZ and is also more general. Fix a few minor bugs introduced in recent patches llvm-svn: 26036	2006-02-07 08:05:22 +00:00
Chris Lattner	c3ebf40031	Make MaskedValueIsZero take a uint64_t instead of a ConstantIntegral as a mask. This allows the code to be simpler and more efficient. Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive. llvm-svn: 26035	2006-02-07 07:27:52 +00:00
Chris Lattner	77defbae0a	Use Type::getIntegralTypeMask() to simplify some code llvm-svn: 26034	2006-02-07 07:00:41 +00:00
Chris Lattner	2590e511d8	Implement the beginnings of a facility for simplifying expressions based on 'demanded bits', inspired by Nate's work in the dag combiner. This isn't complete, but needs to unrelated instcombiner changes to continue. llvm-svn: 26033	2006-02-07 06:56:34 +00:00
Chris Lattner	2e90b732fa	Turn A % (C << N), where C is 2^k, into A & ((C << N)-1) [urem only]. Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only]. Tested with: rem.ll:test5, div.ll:test10 llvm-svn: 26003	2006-02-05 07:54:04 +00:00
Chris Lattner	d30c4991a1	Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces #LLVM LOC, and auto-cse's cast instructions. llvm-svn: 25974	2006-02-04 09:52:43 +00:00
Chris Lattner	2959f0003e	Fix two significant bugs in LSR: 1. When rewriting code in outer loops, sometimes we would insert code into inner loops that is invariant in that loop. 2. Notice that 4(2+x) is 8+4x and use that to simplify expressions. This is a performance neutral change. llvm-svn: 25964	2006-02-04 07:36:50 +00:00
Jeff Cohen	15a8c15a1f	Improve compatibility with VC2005, patch by Morten Ofstad! llvm-svn: 25661	2006-01-26 20:41:32 +00:00
Chris Lattner	c0f633a598	Fix Regression/Transforms/ScalarRepl/2006-01-24-IllegalUnionPromoteCrash.ll llvm-svn: 25587	2006-01-24 19:36:27 +00:00
Chris Lattner	c597b8a55e	Make iostream #inclusion explicit llvm-svn: 25514	2006-01-22 23:32:06 +00:00
Chris Lattner	e154abf9b3	Implement casts.ll:test26: a cast from float -> double -> integer, doesn't need the float->double part. llvm-svn: 25452	2006-01-19 07:40:22 +00:00
Robert Bocchino	6dce25019d	Lowerpacked and SCCP support for the insertelement operation. llvm-svn: 25406	2006-01-17 20:06:55 +00:00
Chris Lattner	307b7ea15f	fix a crash due to missing parens llvm-svn: 25363	2006-01-16 19:47:21 +00:00
Chris Lattner	0de2c7d3d8	This pass has never worked correctly. Remove. llvm-svn: 25349	2006-01-16 01:06:00 +00:00
Chris Lattner	ef530c24c1	FunctionPass's cannot do IPO things. llvm-svn: 25315	2006-01-14 19:30:35 +00:00
Robert Bocchino	a83529678e	Added instcombine support for extractelement. llvm-svn: 25299	2006-01-13 22:48:06 +00:00
Chris Lattner	503221f5c5	Do a simple instcombine xforms to delete llvm.stackrestore cases. llvm-svn: 25294	2006-01-13 21:28:09 +00:00
Chris Lattner	c66b223b28	Simplify this a tiny bit by using the new IntrinsicInst functionality. llvm-svn: 25292	2006-01-13 20:11:04 +00:00
Chris Lattner	cb36710ff9	Switch these to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25202	2006-01-11 05:10:20 +00:00
Chris Lattner	48e4a2ebd8	Switch this to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25201	2006-01-11 05:09:40 +00:00
Robert Bocchino	bd518d153b	Added lower packed support for the extractelement operation. llvm-svn: 25180	2006-01-10 19:05:05 +00:00
Chris Lattner	9cbfbc21bb	fix some 176.gcc miscompilation from my previous patch. llvm-svn: 25137	2006-01-07 01:32:28 +00:00
Chris Lattner	330628a6d8	silence some bogus gcc warnings on fenris llvm-svn: 25130	2006-01-06 17:59:59 +00:00
Chris Lattner	eb372a0276	Enhance the shift-shift folding code to allow a no-op cast to occur in between the shifts. This allows us to fold this (which is the 'integer add a constant' sequence from cozmic's scheme compmiler): int %x(uint %anf-temporary776) { %anf-temporary777 = shr uint %anf-temporary776, ubyte 1 %anf-temporary800 = cast uint %anf-temporary777 to int %anf-temporary804 = shl int %anf-temporary800, ubyte 1 %anf-temporary805 = add int %anf-temporary804, -2 %anf-temporary806 = or int %anf-temporary805, 1 ret int %anf-temporary806 } into this: int %x(uint %anf-temporary776) { %anf-temporary776 = cast uint %anf-temporary776 to int %anf-temporary776.mask1 = add int %anf-temporary776, -2 %anf-temporary805 = or int %anf-temporary776.mask1, 1 ret int %anf-temporary805 } note that instcombine already knew how to eliminate the AND that the two shifts fold into. This is tested by InstCombine/shift.ll:test26 -Chris llvm-svn: 25128	2006-01-06 07:52:12 +00:00
Chris Lattner	b330939d90	Simplify the code a bit more llvm-svn: 25126	2006-01-06 07:22:22 +00:00
Chris Lattner	145539343f	Extract a bunch of code out of visitShiftInst into FoldShiftByConstant. No functionality changes. llvm-svn: 25125	2006-01-06 07:12:35 +00:00
Duraid Madina	7a3ad6cae2	getting there... llvm-svn: 25021	2005-12-26 13:48:44 +00:00
Chris Lattner	8c9e14620f	Fix Transforms/ScalarRepl/2005-12-14-UnionPromoteCrash.ll, a crash on undefined behavior in 126.gcc on big-endian systems. llvm-svn: 24708	2005-12-14 17:23:59 +00:00
Chris Lattner	3b0a62d8a5	Implement a little hack for parity with GCC on crafty. This speeds up 186.crafty by about 16% (from 15.109s to 13.045s) on my system. This turns allocas with unions/casts into scalars. For example crafty has something like this: union doub { unsigned short i[4]; long long d; }; int f(long long a) { return ((union doub){.d=a}).i[1]; } Instead of generating loads and stores to an alloca, we now promote the whole thing to a scalar long value. This implements: Transforms/ScalarRepl/AggregatePromote.ll llvm-svn: 24667	2005-12-12 07:19:13 +00:00
Chris Lattner	077200737c	getRawValue zero extens for unsigned values, use getsextvalue so that we know that small negative values fit into the immediate field of addressing modes. llvm-svn: 24608	2005-12-05 18:23:57 +00:00
Chris Lattner	dc4ffef633	Fix a bug where we didn't realize that vaarg reads memory. This fixes Transforms/DeadStoreElimination/2005-11-30-vaarg.ll llvm-svn: 24545	2005-11-30 19:38:22 +00:00
Andrew Lenharth	5fc3794e71	since reg2mem requires it, might as well mention that it preserves it llvm-svn: 24491	2005-11-25 16:04:54 +00:00
Andrew Lenharth	061029dee2	Reg2Mem is something a pass may depend on, so allow that llvm-svn: 24488	2005-11-22 22:14:23 +00:00
Andrew Lenharth	71b09bbb07	turns out, demotion and invokes and critical edges don't mix llvm-svn: 24487	2005-11-22 21:45:19 +00:00
Chris Lattner	9c37f23645	Fix a crash building 176.gcc due to my recent patch, which only fixed half the problem. llvm-svn: 24414	2005-11-18 18:30:47 +00:00
Chris Lattner	bca0be812d	This was checking the wrong GEP expression. Fixing this fixes a gccas crash compiling mysql reported by Ted Kremenek. llvm-svn: 24402	2005-11-17 19:35:42 +00:00
Andrew Lenharth	d9c13b1336	the pain isn't gone unless the phinodes are spilled too llvm-svn: 24288	2005-11-10 19:39:09 +00:00
Andrew Lenharth	8e66c0c8a9	this works with backedges to the existing entry block alot better llvm-svn: 24270	2005-11-10 17:35:34 +00:00
Andrew Lenharth	4130a4f061	The pass everyone has been waiting for! Reg2Mem for fun you can opt -reg2mem -mem2reg llvm-svn: 24267	2005-11-10 01:58:38 +00:00
Nate Begeman	848622f87f	Add support alignment of allocation instructions. Add support for specifying alignment and size of setjmp jmpbufs. No targets currently do anything with this information, nor is it presrved in the bytecode representation. That's coming up next. llvm-svn: 24196	2005-11-05 09:21:28 +00:00
Chris Lattner	16b29e9562	Implement Transforms/TailCallElim/return-undef.ll, a trivial case that has been sitting in my inbox since May 18. :) llvm-svn: 24194	2005-11-05 08:21:11 +00:00
Chris Lattner	dd0c174082	Turn sdiv into udiv if both operands have a clear sign bit. This occurs a few times in crafty: OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1] NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0] OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1] NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0] OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1] NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0] OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1] NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0] Which all turn into shrs. llvm-svn: 24190	2005-11-05 07:40:31 +00:00
Chris Lattner	e9ff0eaf5b	Turn srem -> urem when neither input has their sign bit set. This triggers 8 times in vortex, allowing the srems to be turned into shrs: OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1] NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0] OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1] NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0] OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1] NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0] OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1] NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0] OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2] NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0] OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2] NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0] OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1] NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0] OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1] NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0] it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61) so the payoff isn't as great. llvm-svn: 24189	2005-11-05 07:28:37 +00:00
Andrew Lenharth	662295587d	make this 64 bit clean, fixed test30 of /Regression/Transforms/InstCombine/add.ll llvm-svn: 24158	2005-11-02 18:35:40 +00:00
Chris Lattner	09efd4e5b6	Limit the search depth of MaskedValueIsZero to 6 instructions, to avoid bad cases. This fixes Markus's second testcase in PR639, and should seal it for good. llvm-svn: 24123	2005-10-31 18:35:52 +00:00
Chris Lattner	27d351f159	This pass is now obsolete since all targets have moved to the SelectionDAG infrastructure and the simple isels have been removed. llvm-svn: 24090	2005-10-29 05:33:46 +00:00
Chris Lattner	8f663e8bbc	Pull some code out into a function, give it the ability to see through +. This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1) llvm-svn: 24081	2005-10-29 04:36:15 +00:00
Chris Lattner	8270c33606	Remove a special case, allowing the general case to handle it. No functionality change. llvm-svn: 24076	2005-10-29 03:19:53 +00:00
Chris Lattner	b9d3ca5c3c	Fix a bit of backwards logic that broke exptree and smg2000 llvm-svn: 24056	2005-10-28 16:27:35 +00:00
Chris Lattner	c4f67e67d2	Do not sink any instruction with side effects, including vaarg. This fixes PR640 llvm-svn: 24046	2005-10-27 17:13:11 +00:00
Chris Lattner	c6372cca78	Fix typo llvm-svn: 24033	2005-10-27 06:26:26 +00:00
Chris Lattner	0fe7551bc0	Teach instcombine to promote stuff like (cast (malloc sbyte, 8X) to int) into: malloc int, (2*X) llvm-svn: 24032	2005-10-27 06:24:46 +00:00
Chris Lattner	b3ecf96900	Promote cases like cast (malloc sbyte, 100) to int* into (malloc [25 x int]) directly without having to convert to (malloc [100 x sbyte]) first. llvm-svn: 24031	2005-10-27 06:12:00 +00:00
Chris Lattner	bb17180a23	Minor change to this file to support obscure cases with constant array amounts llvm-svn: 24030	2005-10-27 05:53:56 +00:00
Chris Lattner	38a1b00a0f	fold nested and's early to avoid inefficiencies in MaskedValueIsZero. This fixes a very slow compile in PR639. llvm-svn: 24011	2005-10-26 17:18:16 +00:00
Jeff Cohen	2b8cbf319c	Update Visual Studio projects to reflect moved file. llvm-svn: 23998	2005-10-26 05:36:51 +00:00
Chris Lattner	46705b2f2d	Handle allocations that, even after removing dead uses, still have more than one use (but one is a cast). This handles the very common case of: X = alloc [n x byte] Y = cast X to somethingbetter seteq X, null In order to avoid infinite looping when there are multiple casts, we only allow this if the xform is strictly increasing the alignment of the allocation. llvm-svn: 23961	2005-10-24 06:35:18 +00:00
Chris Lattner	355ecc09f8	Fix a bug where we would 'promote' an allocation from one type to another where the second has less alignment required. If we had explicit alignment support in the IR, we could handle this case, but we can't until we do. llvm-svn: 23960	2005-10-24 06:26:18 +00:00
Chris Lattner	ac87beb03a	Before promoting a malloc type, remove dead uses. This makes instcombine more effective at promoting these allocations, catching them earlier in the compile process. llvm-svn: 23959	2005-10-24 06:22:12 +00:00
Chris Lattner	216be91817	Pull some code out into a function, no functionality change llvm-svn: 23958	2005-10-24 06:03:58 +00:00
Chris Lattner	bde3845548	DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now llvm-svn: 23940	2005-10-24 02:26:13 +00:00
Chris Lattner	8c087e962c	Only build .a file versions of these libraries, instead of .a and .o versions. This should speed up build times. llvm-svn: 23933	2005-10-24 01:59:48 +00:00
Chris Lattner	bd77fac034	Make sure that anything using the ADCE pass pulls in the UnifyFunctionExitNodes code llvm-svn: 23931	2005-10-24 01:40:23 +00:00
Jeff Cohen	11e26b52b2	When a function takes a variable number of pointer arguments, with a zero pointer marking the end of the list, the zero must be cast to the pointer type. An un-cast zero is a 32-bit int, and at least on x86_64, gcc will not extend the zero to 64 bits, thus allowing the upper 32 bits to be random junk. The new END_WITH_NULL macro may be used to annotate a such a function so that GCC (version 4 or newer) will detect the use of un-casted zero at compile time. llvm-svn: 23888	2005-10-23 04:37:20 +00:00
Chris Lattner	5df0e36e98	My previous patch was too conservative. Reject FP and void types, but do allow pointer types. llvm-svn: 23859	2005-10-21 05:45:41 +00:00
Chris Lattner	0c0b38bb4c	Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an inner loop like this: LBB_RateConvertMono8AltiVec_2: ; no_exit lis r2, ha16(.CPI_RateConvertMono8AltiVec_0) lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2) fmr f3, f3 fadd f0, f2, f0 fadd f3, f0, f3 fcmpu cr0, f3, f1 bge cr0, LBB_RateConvertMono8AltiVec_2 ; no_exit to an inner loop like this: LBB_RateConvertMono8AltiVec_1: ; no_exit fsub f2, f2, f1 fcmpu cr0, f2, f1 fmr f0, f2 bge cr0, LBB_RateConvertMono8AltiVec_1 ; no_exit Doh! good catch! llvm-svn: 23838	2005-10-20 04:47:10 +00:00
Chris Lattner	da1b152c43	Make this work for FP constantexprs llvm-svn: 23773	2005-10-17 20:18:38 +00:00
Chris Lattner	7fde91e365	Oops, X+0.0 isn't foldable, but X+-0.0 is. llvm-svn: 23772	2005-10-17 17:56:38 +00:00
Chris Lattner	32979336a7	relax this a bit, as we only support the default rounding mode llvm-svn: 23771	2005-10-17 17:49:32 +00:00
Chris Lattner	192cd18f53	Fix (hopefully the last) issue where LSR is nondeterminstic. When pulling out CSE's of base expressions it could build a result whose order was nondet. llvm-svn: 23698	2005-10-11 18:41:04 +00:00
Chris Lattner	5c9d63da31	Fix another problem where LSR was being nondeterminstic. Also remove elements from the end of a vector instead of the beginning llvm-svn: 23697	2005-10-11 18:30:57 +00:00
Chris Lattner	b7a3894e7c	Fix another lsr-is-nondeterministic case llvm-svn: 23695	2005-10-11 18:17:57 +00:00
Chris Lattner	03b9eb506c	Make MaskedValueIsZero a bit more aggressive llvm-svn: 23677	2005-10-09 22:08:50 +00:00
Chris Lattner	62010c450f	Fix funky xcode indentation llvm-svn: 23674	2005-10-09 06:36:35 +00:00
Chris Lattner	eb4be8b942	Hrm, you didn't see this. llvm-svn: 23673	2005-10-09 06:24:02 +00:00
Chris Lattner	4ea0a3eaac	Fix a source of non-determinism in the backend: the order of processing IV strides dependend on the pointer order of the strides in memory. Non-determinism is bad. llvm-svn: 23672	2005-10-09 06:20:55 +00:00
Jeff Cohen	572910c9a2	Remove useless variable. llvm-svn: 23656	2005-10-07 05:28:29 +00:00
Chris Lattner	f07a587c79	Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In particular, it should realize that phi's use their values in the pred block not the phi block itself. This change turns our em3d loop from this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_6 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; endif.loopexit.loopexit_crit_edge addi r3, r2, 1 blr LBB_test_6: ; loopexit or r3, r2, r2 blr into: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r6, r6 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 or r2, r6, r6 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r2, r2 blr Unfortunately, this is actually worse code, because the register coallescer is getting confused somehow. If it were doing its job right, it could turn the code into this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r6, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r6, r6 blr ... which I'll work on next. :) llvm-svn: 23604	2005-10-03 02:50:05 +00:00
Chris Lattner	e4ed42a426	Refactor some code into a function llvm-svn: 23603	2005-10-03 01:04:44 +00:00
Chris Lattner	360928dbed	This break is bogus and I have no idea why it was there. Basically it prevents memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's): li r6, 0 or r7, r6, r6 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r7, r7 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r2, r7, 1 addi r7, r7, 1 addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit Now we get: li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit this was noticed in em3d. llvm-svn: 23602	2005-10-03 00:37:33 +00:00
Chris Lattner	8fcce170cf	when checking if we should move a split edge block outside of a loop, check the presplit pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop. llvm-svn: 23601	2005-10-03 00:31:52 +00:00
Jeff Cohen	f8a5e5ae6e	Fix VC++ warnings. llvm-svn: 23579	2005-10-01 03:57:14 +00:00
Chris Lattner	a554c9470b	Insert stores after phi nodes in the normal dest. This fixes LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 23525	2005-09-29 17:44:20 +00:00
Chris Lattner	3b63bb375c	add a note about a way to improve this code further, that I won't be getting to right now. llvm-svn: 23485	2005-09-27 22:44:59 +00:00
Chris Lattner	e285f5ed8f	Avoid spilling stack slots... to stack slots. llvm-svn: 23478	2005-09-27 21:33:12 +00:00
Chris Lattner	87eb249300	Completely rewrite 'correct' eh support. This changes how setjmp insertion is performed so it is only at most once per function that contains an invoke instead of once per invoke in the function. This patch has the following perks: 1. It fixes PR631, which complains about slowness. 2. If fixes PR240, which complains about non-volatile vars being live across setjmp/longjmps. 3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not forcing the jmpbufs to always be 8-bytes off the alignment of the structure. 4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us now about 4% faster than GCC. Further improvements are also possible. llvm-svn: 23477	2005-09-27 21:18:17 +00:00
Chris Lattner	92233d2175	Make the pass name simpler llvm-svn: 23476	2005-09-27 21:10:32 +00:00
Chris Lattner	02ae21e1e0	Eliminate GetGEPGlobalInitializer in favor of the more powerful ConstantFoldLoadThroughGEPConstantExpr function in the utils lib. llvm-svn: 23446	2005-09-26 05:28:52 +00:00
Chris Lattner	0b011ec8e2	Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils as ConstantFoldLoadThroughGEPConstantExpr. llvm-svn: 23445	2005-09-26 05:28:06 +00:00
Chris Lattner	0b3557f54a	Move MaskedValueIsZero up. Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll llvm-svn: 23428	2005-09-24 23:43:33 +00:00
Chris Lattner	b4b2530a1a	Refactor this code a bit and make it more general. This now compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } To: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) slwi r3, r3, 6 add r3, r4, r3 rlwimi r3, r4, 0, 26, 14 stw r3, 0(r2) blr instead of: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 26, 21, 31 add r3, r5, r3 rlwimi r4, r3, 6, 15, 25 stw r4, 0(r2) blr by eliminating an 'and'. I'm pretty sure this is as small as we can go :) llvm-svn: 23386	2005-09-18 07:22:02 +00:00
Chris Lattner	797dee7705	Compile struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } to: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX and %ECX, 131008 mov %EDX, DWORD PTR [%ESP + 4] shl %EDX, 6 add %EDX, %ECX and %EDX, 131008 and %EAX, -131009 or %EDX, %EAX mov DWORD PTR [b], %EDX ret instead of: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX shr %ECX, 6 and %ECX, 2047 add %ECX, DWORD PTR [%ESP + 4] shl %ECX, 6 and %ECX, 131008 and %EAX, -131009 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23385	2005-09-18 06:30:59 +00:00
Chris Lattner	01f56c68e9	Generalize this transform, using MaskedValueIsZero, allowing us to compile: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } To: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 add DWORD PTR [b], %EAX ret instead of: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 mov %ECX, DWORD PTR [b] add %EAX, %ECX and %EAX, -131072 and %ECX, 131071 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23384	2005-09-18 06:02:59 +00:00
Chris Lattner	4ebc8ab4e0	fix typeo llvm-svn: 23383	2005-09-18 05:25:20 +00:00
Chris Lattner	e5b23a6d67	Remove unintentionally committed code llvm-svn: 23382	2005-09-18 05:12:51 +00:00
Chris Lattner	27cb9dbd35	implement shift.ll:test25. This compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } to: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r3, 0(r2) rlwinm r4, r3, 0, 0, 14 add r4, r4, r3 rlwimi r4, r3, 0, 15, 31 stw r4, 0(r2) blr instead of: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) srwi r5, r4, 17 add r3, r5, r3 slwi r3, r3, 17 rlwimi r3, r4, 0, 15, 31 stw r3, 0(r2) blr llvm-svn: 23381	2005-09-18 05:12:10 +00:00
Chris Lattner	af517574ce	Implement add.ll:test29. Codegening: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus1 (unsigned int x) { b.i += x; } as: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) add r3, r4, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr instead of: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 0, 26, 31 add r3, r5, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr llvm-svn: 23379	2005-09-18 04:24:45 +00:00
Chris Lattner	027eaf01cf	remove debug output llvm-svn: 23377	2005-09-18 03:50:25 +00:00
Chris Lattner	1521298993	Implement or.ll:test21. This teaches instcombine to be able to turn this: struct { unsigned int bit0:1; unsigned int ubyte:31; } sdata; void foo() { sdata.ubyte++; } into this: foo: add DWORD PTR [sdata], 2 ret instead of this: foo: mov %EAX, DWORD PTR [sdata] mov %ECX, %EAX add %ECX, 2 and %ECX, -2 and %EAX, 1 or %EAX, %ECX mov DWORD PTR [sdata], %EAX ret llvm-svn: 23376	2005-09-18 03:42:07 +00:00
Chris Lattner	a393e4d4b3	Fix the regression last night compiling povray llvm-svn: 23348	2005-09-14 17:32:56 +00:00
Chris Lattner	2a8932960d	Add a simple xform to simplify array accesses with casts in the way. This is useful for 178.galgel where resolution of dope vectors (by the optimizer) causes the scales to become apparent. llvm-svn: 23328	2005-09-13 18:36:04 +00:00
Chris Lattner	fd018c8dfe	Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s. llvm-svn: 23327	2005-09-13 02:09:55 +00:00
Chris Lattner	567b81f0d2	Add a helper function, allowing us to simplify some code a bit, changing indentation, no functionality change llvm-svn: 23325	2005-09-13 00:40:14 +00:00
Chris Lattner	219175c84d	Implement a simple xform to turn code like this: if () { store A -> P; } else { store B -> P; } into a PHI node with one store, in the most trival case. This implements load.ll:test10. llvm-svn: 23324	2005-09-12 23:23:25 +00:00
Chris Lattner	e0bfdf1485	Another load-peephole optimization: do gcse when two loads are next to each other. This implements InstCombine/load.ll:test9 llvm-svn: 23322	2005-09-12 22:21:03 +00:00
Chris Lattner	b990f7d8ed	Implement a trivial form of store->load forwarding where the store and the load are exactly consequtive. This is picked up by other passes, but this triggers thousands of times in fortran programs that use static locals (and is thus a compile-time speedup). llvm-svn: 23320	2005-09-12 22:00:15 +00:00
Chris Lattner	8048b85e8f	Fix a regression from last night, which caused this pass to create invalid code for IV uses outside of loops that are not dominated by the latch block. We should only convert these uses to use the post-inc value if they ARE dominated by the latch block. Also use a new LoopInfo method to simplify some code. This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll llvm-svn: 23318	2005-09-12 17:11:27 +00:00
Chris Lattner	a67648396a	_test: li r2, 0 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r2, 1 stw r2, 0(r4) blr [zion ~/llvm]$ cat > ~/xx Uses of IV's outside of the loop should use hte post-incremented version of the IV, not the preincremented version. This helps many loops (e.g. in sixtrack) which used to generate code like this (this is the code from the dont-hoist-simple-loop-constants.ll testcase): _test: li r2, 0 ** IV starts at 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 Copy for loop exit li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 IV+2 cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 IV+2 stw r2, 0(r4) blr And now generated code like this: _test: li r2, 1 * IV starts at 1 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 * IV.postinc + 0 blt cr0, LBB_test_1 LBB_test_2: ; loopexit.2.loopexit stw r2, 0(r4) * IV.postinc + 0 blr llvm-svn: 23313	2005-09-12 06:04:47 +00:00
Chris Lattner	530fe6ab30	implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. We used to emit this code for it: _test: li r2, 1 ;; Value tying up a register for the whole loop li r5, 0 LBB_test_1: ; no_exit.2 or r6, r5, r5 li r5, 0 stw r5, 0(r3) addi r5, r6, 1 addi r3, r3, 4 add r7, r2, r5 ;; should be addi r7, r5, 1 cmpwi cr0, r7, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r6, 2 stw r2, 0(r4) blr now we emit this: _test: li r2, 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 ;; whoa, fold those adds! cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 stw r2, 0(r4) blr more improvement coming. llvm-svn: 23306	2005-09-10 01:18:45 +00:00
Chris Lattner	b5e381a8cf	Fix a problem that Dan Berlin noticed, where reassociation would not succeed in building maximal expressions before simplifying them. In particular, i cases like this: X-(A+B+X) the code would consider A+B+X to be a maximal expression (not understanding that the single use '-' would be turned into a + later), simplify it (a noop) then later get simplified again. Each of these simplify steps is where the cost of reassociation comes from, so this patch should speed up the already fast pass a bit. Thanks to Dan for noticing this! llvm-svn: 23214	2005-09-02 07:07:58 +00:00
Chris Lattner	9fe263aa75	Avoid creating garbage instructions, just move the old add instruction to where we need it when converting -(A+B+C) -> -A + -B + -C. llvm-svn: 23213	2005-09-02 06:38:04 +00:00
Chris Lattner	d1325da091	add some assertions and fix problems where reassociate could access the Ops vector out of range llvm-svn: 23211	2005-09-02 05:23:22 +00:00
Chris Lattner	8ca5b2a6d2	Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll llvm-svn: 23019	2005-08-24 17:55:32 +00:00
Chris Lattner	ea7dfd53d6	Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash on 177.mesa llvm-svn: 22843	2005-08-17 21:22:41 +00:00

... 3 4 5 6 7 ...

1467 Commits