llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Wendling	8f13b5c43e	Replace #include <iostream> with llvm_* streams. llvm-svn: 31924	2006-11-26 10:02:32 +00:00
Bill Wendling	5dbf43c983	Removed #include <iostream> and replaced with llvm_* streams. llvm-svn: 31923	2006-11-26 09:46:52 +00:00
Bill Wendling	a7459ca813	Removed #include <iostream> and used the llvm_cerr/DOUT streams instead. llvm-svn: 31922	2006-11-26 09:17:06 +00:00
Nick Lewycky	09b7e4d3ab	Update to new predicate simplifier VRP design. Fixes PR966 and PR967. Remove predicate simplifier from default gcc3 pipeline. New design is too slow to enable by default. Add new testcases for problems encountered in development. llvm-svn: 31895	2006-11-22 23:49:16 +00:00
Chris Lattner	ec45a4c88c	This xform is handled by FoldOpIntoPhi in visitCastInst in a more elegant way. llvm-svn: 31889	2006-11-21 17:05:13 +00:00
Chris Lattner	95adf8f1da	Do not convert massive blocks on phi nodes into select statements. Instead only do these transformations if there are a small number of phi's. This speeds up Ptrdist/ks from 2.35s to 2.19s on my mac pro. llvm-svn: 31853	2006-11-18 19:19:36 +00:00
Chris Lattner	21eba2da26	If an indvar with a variable stride is used by the exit condition, go ahead and handle it like constant stride vars. This fixes some bad codegen in variable stride cases. For example, it compiles this: void foo(int k, int i) { for (k=i+i; k <= 8192; k+=i) flags2[k] = 0; } to: LBB1_1: #bb.preheader movl %eax, %ecx addl %ecx, %ecx movl L_flags2$non_lazy_ptr, %edx LBB1_2: #bb movb $0, (%edx,%ecx) addl %eax, %ecx cmpl $8192, %ecx jle LBB1_2 #bb LBB1_5: #return ret or (if the array is local and we are in dynamic-nonpic or static mode): LBB3_2: #bb movb $0, _flags2(%ecx) addl %eax, %ecx cmpl $8192, %ecx jle LBB3_2 #bb and: lis r2, ha16(L_flags2$non_lazy_ptr) lwz r2, lo16(L_flags2$non_lazy_ptr)(r2) slwi r3, r4, 1 LBB1_2: ;bb li r5, 0 add r6, r4, r3 stbx r5, r2, r3 cmpwi cr0, r6, 8192 bgt cr0, LBB1_5 ;return instead of: leal (%eax,%eax,2), %ecx movl %eax, %edx addl %edx, %edx addl L_flags2$non_lazy_ptr, %edx xorl %esi, %esi LBB1_2: #bb movb $0, (%edx,%esi) movl %eax, %edi addl %esi, %edi addl %ecx, %esi cmpl $8192, %esi jg LBB1_5 #return and: lis r2, ha16(L_flags2$non_lazy_ptr) lwz r2, lo16(L_flags2$non_lazy_ptr)(r2) mulli r3, r4, 3 slwi r5, r4, 1 li r6, 0 add r2, r2, r5 LBB1_2: ;bb li r5, 0 add r7, r3, r6 stbx r5, r2, r6 add r6, r4, r6 cmpwi cr0, r7, 8192 ble cr0, LBB1_2 ;bb This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and implements LoopStrengthReduce/var_stride_used_by_compare.ll llvm-svn: 31809	2006-11-17 06:17:33 +00:00
Chris Lattner	e3a63d136d	Fix a gcc 4.2 warning. llvm-svn: 31751	2006-11-15 04:53:24 +00:00
Chris Lattner	f05d69ae72	implement InstCombine/shift-simplify.ll by transforming: (X >> Z) op (Y >> Z) -> (X op Y) >> Z for all shifts and all ops={and/or/xor}. llvm-svn: 31729	2006-11-14 07:46:50 +00:00
Chris Lattner	d12a4bf799	implement InstCombine/and-compare.ll:test1. This compiles: typedef struct { unsigned prefix : 4; unsigned code : 4; unsigned unsigned_p : 4; } tree_common; int foo(tree_common a, tree_common b) { return a->code == b->code; } into: _foo: movl 4(%esp), %eax movl 8(%esp), %ecx movl (%eax), %eax xorl (%ecx), %eax # TRUNCATE movb %al, %al shrb $4, %al testb %al, %al sete %al movzbl %al, %eax ret instead of: _foo: movl 8(%esp), %eax movb (%eax), %al shrb $4, %al movl 4(%esp), %ecx movb (%ecx), %cl shrb $4, %cl cmpb %al, %cl sete %al movzbl %al, %eax ret saving one cycle by eliminating a shift. llvm-svn: 31727	2006-11-14 06:06:06 +00:00
Chris Lattner	d4dee405cb	Fix InstCombine/2006-11-10-ashr-miscompile.ll a miscompilation introduced by the shr -> [al]shr patch. This was reduced from 176.gcc. llvm-svn: 31653	2006-11-10 23:38:52 +00:00
Chris Lattner	82928ca290	second patch to fix PR992/993. llvm-svn: 31610	2006-11-09 23:36:08 +00:00
Chris Lattner	924f4fee8b	Minimal patch to fix PR992/PR993 llvm-svn: 31608	2006-11-09 23:17:45 +00:00
Chris Lattner	6e2c15c158	Teach ShrinkDemandedConstant how to handle X+C. This implements: add.ll:test33, add.ll:test34, shift-sra.ll:test2 llvm-svn: 31586	2006-11-09 05:12:27 +00:00
Chris Lattner	4f218d56f5	reenable factoring of GEP expressions, being more precise about the case that is bad to do. llvm-svn: 31563	2006-11-08 19:42:28 +00:00
Chris Lattner	cd62f11227	make this code more efficient by not creating a phi node we are just going to delete in the first place. This also makes it simpler. llvm-svn: 31562	2006-11-08 19:29:23 +00:00
Jim Laskey	61feeb90f9	Remove redundant <cmath>. llvm-svn: 31561	2006-11-08 19:16:44 +00:00
Chris Lattner	a3acfca920	disable this factoring optzn for GEPs for now, this severely pessimizes some loops. llvm-svn: 31560	2006-11-08 18:49:31 +00:00
Reid Spencer	fdff938a7e	For PR950: This patch converts the old SHR instruction into two instructions, AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not dependent on the sign of their operands. llvm-svn: 31542	2006-11-08 06:47:33 +00:00
Chris Lattner	4967f6ddea	scalarrepl should not split the two elements of the vsiidx array: int func(vFloat v0, vFloat v1) { int ii; vSInt32 vsiidx[2]; vsiidx[0] = _mm_cvttps_epi32(v0); vsiidx[1] = _mm_cvttps_epi32(v1); ii = ((int *) vsiidx)[4]; return ii; } This fixes Transforms/ScalarRepl/2006-11-07-InvalidArrayPromote.ll llvm-svn: 31524	2006-11-07 22:42:47 +00:00
Jeff Cohen	7d6f3db3e2	Unbreak VC++ build. llvm-svn: 31464	2006-11-05 19:31:28 +00:00
Nick Lewycky	67bad5adbc	Remove commented line from earlier debugging. llvm-svn: 31460	2006-11-05 14:19:40 +00:00
Andrew Lenharth	0ebb0b03e6	The wrong parameter was being tested to determine i32 vs i64 llvm-svn: 31431	2006-11-03 22:45:50 +00:00
Chris Lattner	62e2cad6b8	remove dead code llvm-svn: 31398	2006-11-03 01:34:58 +00:00
Reid Spencer	de46e48420	For PR786: Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting fall out by removing unused variables. Remaining warnings have to do with unused functions (I didn't want to delete code without review) and unused variables in generated code. Maintainers should clean up the remaining issues when they see them. All changes pass DejaGnu tests and Olden. llvm-svn: 31380	2006-11-02 20:25:50 +00:00
Reid Spencer	7eb55b395f	For PR950: Replace the REM instruction with UREM, SREM and FREM. llvm-svn: 31369	2006-11-02 01:53:59 +00:00
Devang Patel	2cb4f83b38	There can be more than one PHINode at the start of the block. llvm-svn: 31362	2006-11-01 23:04:45 +00:00
Devang Patel	44519a8feb	Handle PHINode with only one incoming value. This fixes http://llvm.org/bugs/show_bug.cgi?id=979 llvm-svn: 31358	2006-11-01 22:26:43 +00:00
Chris Lattner	5a0bd61c64	Fix GlobalOpt/2006-11-01-ShrinkGlobalPhiCrash.ll and McGill/chomp llvm-svn: 31352	2006-11-01 18:03:33 +00:00
Chris Lattner	eebea43b48	Factor gep instructions through phi nodes. llvm-svn: 31346	2006-11-01 07:43:41 +00:00
Chris Lattner	14f82c7dcd	Turn a phi of many loads into a phi of the address and a single load of the result. This can significantly shrink code and exposes identities more aggressively. llvm-svn: 31344	2006-11-01 07:13:54 +00:00
Chris Lattner	dc826fc068	Fix a bug in the previous patch llvm-svn: 31342	2006-11-01 04:55:47 +00:00
Chris Lattner	cadac0c5c3	Fold things like "phi [add (a,b), add(c,d)]" into two phi's and one add. This triggers thousands of times on multisource. llvm-svn: 31341	2006-11-01 04:51:18 +00:00
Chris Lattner	984d6e1669	generalize the fix for PR977 to also fix Transforms/LCSSA/2006-10-31-UnreachableBlock-2.ll llvm-svn: 31317	2006-10-31 18:56:48 +00:00
Chris Lattner	eb68f080ef	Fix PR977 and Transforms/LCSSA/2006-10-31-UnreachableBlock.ll llvm-svn: 31315	2006-10-31 17:52:18 +00:00
Chris Lattner	fc519cd2d1	Fix SimplifyCFG/2006-10-29-InvokeCrash.ll, a crash compiling QT. llvm-svn: 31284	2006-10-29 21:21:20 +00:00
Chris Lattner	3e763f5708	add option to isCriticalEdge llvm-svn: 31258	2006-10-28 06:58:17 +00:00
Chris Lattner	a6eb7e0803	break edges more intelligently llvm-svn: 31257	2006-10-28 06:45:33 +00:00
Chris Lattner	80ea207bfa	Expose a smarter way to break critical edges. llvm-svn: 31256	2006-10-28 06:44:56 +00:00
Chris Lattner	400ac04e64	SplitCriticalEdge checks to see if an edge is critical, don't check twice llvm-svn: 31255	2006-10-28 06:38:14 +00:00
Chris Lattner	5191c65485	prepare for a change I'm about to make llvm-svn: 31248	2006-10-28 00:59:20 +00:00
Reid Spencer	00c482b7a2	Simplify code a bit by changing instances of: InsertNewInstBefore(new CastInst(Val, ValTy, Val->GetName()), I) into: InsertCastBefore(Val, ValTy, I) llvm-svn: 31204	2006-10-26 19:19:06 +00:00
Reid Spencer	7e80b0b31e	For PR950: Make necessary changes to support DIV -> [SUF]Div. This changes llvm to have three division instructions: signed, unsigned, floating point. The bytecode and assembler are backwards compatible, however. llvm-svn: 31195	2006-10-26 06:15:43 +00:00
Nick Lewycky	5b979ae531	Fix 2006-10-25-AddSetCC. A relational operator (like setlt) can never produce an EQ property. llvm-svn: 31193	2006-10-26 02:35:18 +00:00
Nick Lewycky	9d17c82a26	Resurrect r1.25. Fix and comment the "or", "and" and "xor" transformations. llvm-svn: 31189	2006-10-25 23:48:24 +00:00
Chris Lattner	53f53db919	hide symbols properly llvm-svn: 31184	2006-10-25 21:14:31 +00:00
Chris Lattner	ebb1ad4382	Fix Transforms/ScalarRepl/2006-10-23-PointerUnionCrash.ll llvm-svn: 31151	2006-10-24 06:26:32 +00:00
Chris Lattner	dc7b9beb20	Revert back to r1.21, which was the last revision of predsimplify that passes llvm-gcc bootstrap. llvm-svn: 31146	2006-10-24 00:36:21 +00:00
Chris Lattner	fe7b6ef346	Handle fallout from the recent branch-on-undef changes. This fixes Prolangs-C/agrep and SCCP/2006-10-23-IPSCCP-Crash.ll llvm-svn: 31132	2006-10-23 18:57:02 +00:00
Nick Lewycky	53b4158448	Remove the Backwards operation. Resolving now works at the time when a property is added by running through the list of uses of the value and adding resolved properties to the property set. llvm-svn: 31126	2006-10-23 01:56:02 +00:00
Nick Lewycky	6f5c30fcec	Fix similar missing optimization opportunity in XOR. llvm-svn: 31123	2006-10-22 22:22:58 +00:00
Nick Lewycky	af2b0571d0	Whoops! Add missing NULL check. llvm-svn: 31121	2006-10-22 21:38:24 +00:00
Nick Lewycky	2c734f3fc1	Handle "if ((x|y) != 0)" for ints like we do for bools. Fixes missed optimization opportunity pointed out by Chris Lattner. llvm-svn: 31118	2006-10-22 21:36:41 +00:00
Nick Lewycky	f345008339	AllocaInst can't return a null pointer. Fixes missed optimization opportunity pointed out by Andrew Lewycky. llvm-svn: 31115	2006-10-22 19:53:27 +00:00
Chris Lattner	250eff20da	Add a workaround for PR962, disabling the more aggressive form of this transformation. This speeds up a C++ app 2.25x. llvm-svn: 31113	2006-10-22 18:42:26 +00:00
Chris Lattner	af17096dcf	3 Changes: 1. Better document what is going on here. 2. Only hack on one branch per iteration, making the results less conservative. 3. Handle the problematic case by marking edges executable instead of by playing with value lattice states. This is far less pessimistic, and fixes SCCP/ipsccp-gvar.ll. llvm-svn: 31106	2006-10-22 05:59:17 +00:00
Chris Lattner	af1222c1a7	llvm-extract should remove module-level asm llvm-svn: 31086	2006-10-20 21:35:41 +00:00
Chris Lattner	319c86fd38	Fix an ugly problem in SCCP. This fixes Benchmarks/Misc-C++/mandel-text.cpp llvm-svn: 31073	2006-10-20 20:19:08 +00:00
Chris Lattner	5dee3b2526	Fix miscompilation of MallocBench/espresso which code review pointed out but apparently didn't make it into the final patch. llvm-svn: 31070	2006-10-20 18:20:21 +00:00
Reid Spencer	e0fc4dfc22	For PR950: This patch implements the first increment for the Signless Types feature. All changes pertain to removing the ConstantSInt and ConstantUInt classes in favor of just using ConstantInt. llvm-svn: 31063	2006-10-20 07:07:24 +00:00
Devang Patel	5d417e35bc	While creating mask, use 1ULL instead of 1. llvm-svn: 31062	2006-10-20 01:16:56 +00:00
Chris Lattner	b8b11599dd	Fix SimplifyCFG/2006-10-19-UncondDiv.ll by disabling a bad xform. llvm-svn: 31061	2006-10-20 00:42:07 +00:00
Devang Patel	5d6df959e3	It is OK to remove extra cast if operation is EQ/NE even though source and destination sign may not match but other conditions are met. llvm-svn: 31056	2006-10-19 20:59:13 +00:00
Devang Patel	88afd00d1d	Typo Typo. llvm-svn: 31055	2006-10-19 19:21:36 +00:00
Devang Patel	472530d9fc	Typo. llvm-svn: 31054	2006-10-19 19:05:38 +00:00
Devang Patel	b42aef4925	Fix bug in PR454 resolution. Added new test case. This fixes llvmAsmParser.cpp miscompile by llvm on PowerPC Darwin. llvm-svn: 31053	2006-10-19 18:54:08 +00:00
Reid Spencer	3c514959dd	Undo Chris' last patch, it caused a regression. llvm-svn: 30991	2006-10-16 23:08:08 +00:00
Chris Lattner	9a1c7dd27a	fix a buggy check that accidentally disabled this xform llvm-svn: 30967	2006-10-15 22:42:15 +00:00
Nick Lewycky	77e030bca9	Replace custom dispatch code with two uses of InstVisitor. Improves compile-time performance. llvm-svn: 30896	2006-10-12 02:02:44 +00:00
Chris Lattner	41b442242d	Implement SROA of unions with mixed pointers/integers in them. This implements PR892 and Transforms/ScalarRepl/union-pointer.ll:test2 llvm-svn: 30825	2006-10-08 23:53:04 +00:00
Chris Lattner	05f8272afa	Implement Transforms/ScalarRepl/union-pointer.ll:test llvm-svn: 30823	2006-10-08 23:28:04 +00:00
Chris Lattner	2deeaeaca7	add a new SimplifyDemandedVectorElts method, which works similarly to SimplifyDemandedBits. The idea is that some operations can be simplified if not all of the computed elements are needed. Some targets (like x86) have a large number of intrinsics that operate on a single element, but pass other elts through unmodified. If those other elements are not needed, the intrinsics can be simplified to scalar operations, and insertelement ops can be removed. This turns (f.e.): ushort %Convert_sse(float %f) { %tmp = insertelement <4 x float> undef, float %f, uint 0 ; <<4 x float>> [#uses=1] %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1 ; <<4 x float>> [#uses=1] %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2 ; <<4 x float>> [#uses=1] %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3 ; <<4 x float>> [#uses=1] %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } into: ushort %Convert_sse(float %f) { entry: %tmp28 = sub float %f, 1.000000e+00 ; <float> [#uses=1] %tmp37 = mul float %tmp28, 5.000000e-01 ; <float> [#uses=1] %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0 ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } which improves codegen from: _Convert_sse: movss LCPI1_0, %xmm0 movss 4(%esp), %xmm1 subss %xmm0, %xmm1 movss LCPI1_1, %xmm0 mulss %xmm0, %xmm1 movss LCPI1_2, %xmm0 minss %xmm0, %xmm1 xorps %xmm0, %xmm0 maxss %xmm0, %xmm1 cvttss2si %xmm1, %eax andl $65535, %eax ret to: _Convert_sse: movss 4(%esp), %xmm0 subss LCPI1_0, %xmm0 mulss LCPI1_1, %xmm0 movss LCPI1_2, %xmm1 minss %xmm1, %xmm0 xorps %xmm1, %xmm1 maxss %xmm1, %xmm0 cvttss2si %xmm0, %eax andl $65535, %eax ret This is just a first step, it can be extended in many ways. Testcase here: Transforms/InstCombine/vec_demanded_elts.ll llvm-svn: 30752	2006-10-05 06:55:50 +00:00
Chris Lattner	52886e72d7	This case isn't implemented yet. It seems unlikely to be needed, but if it ever is, we want to get an assert instead of silent bad codegen. llvm-svn: 30716	2006-10-04 04:58:58 +00:00
Nick Lewycky	58a910dff5	Simplify logic further. Ensure that we copy KnownProperties before calling visitBasicBlock, else we may leak properties into blocks where they don't belong. llvm-svn: 30705	2006-10-03 17:36:01 +00:00
Nick Lewycky	1d00f3e144	Simplify, now that predsimplify depends on break-crit-edges. Fix SwitchInst where dest-block is the same as one of the cases. llvm-svn: 30700	2006-10-03 15:19:11 +00:00
Nick Lewycky	755f801adc	Move break-crit-edges before the predicate simplifier. Allows us to optimize in more cases. llvm-svn: 30699	2006-10-03 14:52:23 +00:00
Evan Cheng	ff510a58c2	Revert previous patch. Still breaking things. llvm-svn: 30698	2006-10-03 07:26:07 +00:00
Chris Lattner	8aca0ee8c3	Fix PR932 and Analysis/Dominators/2006-10-02-BreakCritEdges.ll: The critical edge block dominates the dest block if the destblock dominates all edges other than the one incoming from the critical edge. llvm-svn: 30696	2006-10-03 07:02:02 +00:00
Chris Lattner	7d19067c42	Fix a bug from r1.391 of this file, where we checked the size instead of the alignment when promoting allocations. This implements InstCombine/cast.ll:test32 llvm-svn: 30682	2006-10-01 19:40:58 +00:00
Chris Lattner	4797c891c0	Fix debug output llvm-svn: 30680	2006-09-30 23:32:50 +00:00
Chris Lattner	24d3d4280a	Implement SRA of heap allocations. llvm-svn: 30679	2006-09-30 23:32:09 +00:00
Chris Lattner	80a01ef6f0	Add some ifdef'd out debug info llvm-svn: 30676	2006-09-30 19:40:30 +00:00
Chris Lattner	6ab03f6a08	Eliminate ConstantBool::True and ConstantBool::False. Instead, provide ConstantBool::getTrue() and ConstantBool::getFalse(). llvm-svn: 30665	2006-09-28 23:35:22 +00:00
Owen Anderson	7cb6809c25	Another attempt at making ArgPromotion smarter. This patch no longer breaks Burg. llvm-svn: 30657	2006-09-28 23:02:22 +00:00
Chris Lattner	525804f31e	simplify code llvm-svn: 30656	2006-09-28 22:58:25 +00:00
Chris Lattner	e03ca2ca4a	set DEBUG_TYPE right llvm-svn: 30623	2006-09-27 04:58:23 +00:00
Nick Lewycky	059c79264f	Style changes only. Remove dead code, fix a comment. llvm-svn: 30588	2006-09-23 15:13:08 +00:00
Chris Lattner	6bd6da4097	Be far more careful when splitting a loop header, either to form a preheader or when splitting loops with a common header into multiple loops. In particular the old code would always insert the preheader before the old loop header. This is disastrous in cases where the loop hasn't been rotated. For example, it can produce code like: .. outside the loop... jmp LBB1_2 #bb13.outer LBB1_1: #bb1 movsd 8(%esp,%esi,8), %xmm1 mulsd (%edi), %xmm1 addsd %xmm0, %xmm1 addl $24, %edi incl %esi jmp LBB1_3 #bb13 LBB1_2: #bb13.outer leal (%edx,%eax,8), %edi pxor %xmm1, %xmm1 xorl %esi, %esi LBB1_3: #bb13 movapd %xmm1, %xmm0 cmpl $4, %esi jl LBB1_1 #bb1 Note that the loop body is actually LBB1_1 + LBB1_3, which means that the loop now contains an uncond branch WITHIN it to jump around the inserted loop header (LBB1_2). Doh. This patch changes the preheader insertion code to insert it in the right spot, producing this code: ... outside the loop, fall into the header ... LBB1_1: #bb13.outer leal (%edx,%eax,8), %esi pxor %xmm0, %xmm0 xorl %edi, %edi jmp LBB1_3 #bb13 LBB1_2: #bb1 movsd 8(%esp,%edi,8), %xmm0 mulsd (%esi), %xmm0 addsd %xmm1, %xmm0 addl $24, %esi incl %edi LBB1_3: #bb13 movapd %xmm0, %xmm1 cmpl $4, %edi jl LBB1_2 #bb1 Totally crazy, no branch in the loop! :) llvm-svn: 30587	2006-09-23 08:19:21 +00:00
Chris Lattner	608cd05e3f	Teach UpdateDomInfoForRevectoredPreds to handle revectored preds that are not reachable, making it general purpose enough for use by InsertPreheaderForLoop. Eliminate custom dominfo updating code in InsertPreheaderForLoop, using UpdateDomInfoForRevectoredPreds instead. llvm-svn: 30586	2006-09-23 07:40:52 +00:00
Chris Lattner	51c95cdd82	Fix Transforms/IndVarsSimplify/2006-09-20-LFTR-Crash.ll llvm-svn: 30555	2006-09-21 05:12:20 +00:00
Nick Lewycky	fde9c308b2	Don't rewrite ConstantExpr::get. llvm-svn: 30552	2006-09-21 01:05:35 +00:00
Nick Lewycky	d74c55f483	Once we're down to "setcc type constant1, constant2", at least come up with the right answer. llvm-svn: 30550	2006-09-20 23:02:24 +00:00
Nick Lewycky	cfff1c3f86	Use a total ordering to compare instructions. Fixes infinite loop in resolve(). llvm-svn: 30540	2006-09-20 17:04:01 +00:00
Andrew Lenharth	44cb67af5c	simplify llvm-svn: 30535	2006-09-20 15:37:57 +00:00
Chris Lattner	380c7e9a59	We went through all that trouble to compute whether it was safe to transform this comparison, but never checked it. Whoops, no wonder we miscompiled 177.mesa! llvm-svn: 30511	2006-09-20 04:44:59 +00:00
Evan Cheng	cd3f6ff0e5	Back out Chris' last set of changes. This breaks 177.mesa and povray somehow. llvm-svn: 30505	2006-09-20 01:39:40 +00:00
Evan Cheng	453280b94d	80 col. llvm-svn: 30504	2006-09-20 01:10:02 +00:00
Andrew Lenharth	4f339bebb0	If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness llvm-svn: 30496	2006-09-19 18:24:51 +00:00
Chris Lattner	12f52faf93	implement select.ll:test19-22 llvm-svn: 30482	2006-09-19 06:18:21 +00:00
Nick Lewycky	b9c5483a93	Walk down the dominator tree instead of the control flow graph. That means that we can't modify the CFG any more, at least not until it's possible to update the dominator tree (PR217). llvm-svn: 30469	2006-09-18 21:09:35 +00:00
Chris Lattner	de07792595	Fix an infinite loop building the CFE llvm-svn: 30465	2006-09-18 18:27:05 +00:00
Chris Lattner	67a35bbce7	Implement a trivial optzn: if vastart is never called in a function that takes ... args, remove the '...'. This is Transforms/DeadArgElim/dead_vaargs.ll llvm-svn: 30459	2006-09-18 07:02:31 +00:00
Chris Lattner	4922a0e53f	Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%. llvm-svn: 30456	2006-09-18 05:27:43 +00:00
Chris Lattner	420c4bcc8d	Implement Transforms/InstCombine/shift-sra.ll:test0 llvm-svn: 30450	2006-09-18 04:31:40 +00:00
Chris Lattner	b3f24c91b0	Rewrite shift/and/compare sequences to promote better licm of the RHS. Use isLogicalShift/isArithmeticShift to simplify code. llvm-svn: 30448	2006-09-18 04:22:48 +00:00
Chris Lattner	850465d53f	Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913 llvm-svn: 30405	2006-09-16 03:14:10 +00:00
Chris Lattner	9482cc5b16	revert previous two patches. They cause miscompilation of MultiSource/Applications/Burg llvm-svn: 30397	2006-09-15 17:24:45 +00:00
Owen Anderson	edadd3faee	Revert my previous work on ArgumentPromotion. Further investigation has revealed these changes to be incorrect. They just weren't showing up in any of our current testcases. llvm-svn: 30385	2006-09-15 05:22:51 +00:00
Anton Korobeynikov	d61d39ec53	Adding dllimport, dllexport and external weak linkage types. DLL* linkages got full (I hope) codegeneration support in C & both x86 assembler backends. External weak linkage added for future use, we don't provide any codegeneration, etc. support for it. llvm-svn: 30374	2006-09-14 18:23:27 +00:00
Chris Lattner	237ccf2a51	Second half of the fix for Transforms/Inline/inline_cleanup.ll This folds unconditional branches that are often produced by code specialization. llvm-svn: 30307	2006-09-13 21:27:00 +00:00
Nick Lewycky	12efffc96b	Add some more consistency checks. llvm-svn: 30305	2006-09-13 19:32:53 +00:00
Nick Lewycky	51ce8d6b46	Fix unionSets so that it can merge correctly. llvm-svn: 30304	2006-09-13 19:24:01 +00:00
Chris Lattner	6ef6d06d21	Implement the first half of Transforms/Inline/inline_cleanup.ll llvm-svn: 30303	2006-09-13 19:23:57 +00:00
Nick Lewycky	3a4dc7b489	Erase dead instructions. llvm-svn: 30298	2006-09-13 18:55:37 +00:00
Devang Patel	fab4972a6e	Initialize DontInternalize. llvm-svn: 30281	2006-09-13 01:02:26 +00:00
Chris Lattner	1d7ec20a4d	A sinkable instruction may exist with uses, if those uses are in dead blocks. Handle this. This fixes PR908 and Transforms/LICM/2006-09-12-DeadUserOfSunkInstr.ll llvm-svn: 30275	2006-09-12 19:17:09 +00:00
Chris Lattner	d28627009a	Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll llvm-svn: 30266	2006-09-11 21:43:16 +00:00
Nick Lewycky	e94f42a740	Skip the linear search if the answer is already known. llvm-svn: 30251	2006-09-11 17:23:34 +00:00
Chris Lattner	d1f8e07808	Allow tail duplication in more cases, relaxing the previous restriction a bit. This fixes Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 30237	2006-09-10 18:17:58 +00:00
Nick Lewycky	9a22d7b60f	Replace EquivalenceClasses with a custom-built data structure. Many common operations (like findProperties) should be faster, at the expense of unionSets being slower in cases that are rare in practise. Don't erase a dead Instruction. This fixes a memory corruption issue. llvm-svn: 30235	2006-09-10 02:27:07 +00:00
Chris Lattner	0468987592	Implement Transforms/InstCombine/hoist_instr.ll llvm-svn: 30234	2006-09-09 22:02:56 +00:00
Chris Lattner	27ff96d87a	Make inlining costs more accurate. llvm-svn: 30231	2006-09-09 20:40:44 +00:00
Chris Lattner	d79dc79831	Turn div X, (Cond ? Y : 0) -> div X, Y This implements select.ll::test18. llvm-svn: 30230	2006-09-09 20:26:32 +00:00
Chris Lattner	c465046e65	Throttle back tail duplication to avoid creating really ugly sequences of code. For Transforms/TailDup/if-tail-dup.ll, f.e., it produces: _foo: movl 8(%esp), %eax movl 4(%esp), %ecx testl $1, %ecx je LBB1_2 #cond_next LBB1_1: #cond_true movl $1, (%eax) LBB1_2: #cond_next testl $2, %ecx je LBB1_4 #cond_next10 LBB1_3: #cond_true6 movl $1, 4(%eax) LBB1_4: #cond_next10 testl $4, %ecx je LBB1_6 #cond_next18 LBB1_5: #cond_true14 movl $1, 8(%eax) LBB1_6: #cond_next18 testl $8, %ecx je LBB1_8 #return LBB1_7: #cond_true22 movl $1, 12(%eax) ret LBB1_8: #return ret instead of: _foo: movl 4(%esp), %eax testl $2, %eax sete %cl movl 8(%esp), %edx testl $1, %eax je LBB1_2 #cond_next LBB1_1: #cond_true movl $1, (%edx) testb %cl, %cl jne LBB1_4 #cond_next10 jmp LBB1_3 #cond_true6 LBB1_2: #cond_next testb %cl, %cl jne LBB1_4 #cond_next10 LBB1_3: #cond_true6 movl $1, 4(%edx) testl $4, %eax je LBB1_6 #cond_next18 jmp LBB1_5 #cond_true14 LBB1_4: #cond_next10 testl $4, %eax je LBB1_6 #cond_next18 LBB1_5: #cond_true14 movl $1, 8(%edx) testl $8, %eax je LBB1_8 #return jmp LBB1_7 #cond_true22 LBB1_6: #cond_next18 testl $8, %eax je LBB1_8 #return LBB1_7: #cond_true22 movl $1, 12(%edx) ret LBB1_8: #return ret llvm-svn: 30158	2006-09-07 21:30:15 +00:00
Chris Lattner	845b223da4	Fix Duraid's changes to work when TLI is null. This fixes the failing lowerinvoke regtests. llvm-svn: 30115	2006-09-05 17:48:07 +00:00
Duraid Madina	cf6749e4c0	add setJumpBufSize() and setJumpBufAlignment() to target-lowering. Call these from your backend to enjoy setjmp/longjmp goodness, see lib/Target/IA64/IA64ISelLowering.cpp for an example llvm-svn: 30095	2006-09-04 06:21:35 +00:00
Owen Anderson	19b80e76df	Make ArgumentPromotion handle recursive functions that pass pointers in their recursive calls. llvm-svn: 30057	2006-09-02 21:19:44 +00:00
Nick Lewycky	8e5599354a	Improve handling of SelectInst. Reorder operations to remove duplicated work. Fix to leave floating-point types out of the optimization. Add tests to predsimplify.ll for SwitchInst and SelectInst handling. llvm-svn: 30055	2006-09-02 19:40:38 +00:00
Nick Lewycky	f6f529d008	Don't confuse canonicalize and lookup. Fixes predsimplify.reg4.ll. Also corrects missing optimization opportunity removing cases from a switch. llvm-svn: 30009	2006-09-01 03:26:35 +00:00
Nick Lewycky	08674ab707	Properties where both Values weren't in the union (as being equal to another Value) weren't being found by findProperties. This fixes predsimplify.ll test6, a missed optimization opportunity. llvm-svn: 29991	2006-08-31 00:39:16 +00:00
Nick Lewycky	5f8f9af65c	Move to using the EquivalenceClass ADT. Removes SynSets. If a branch's condition has become a ConstantBool, simplify it immediately. Removing the edge saves work and exposes up more optimization opportunities in the pass. Add support for SelectInst. llvm-svn: 29970	2006-08-30 02:46:48 +00:00
Devang Patel	f489d0f85c	Do not rely on std::sort and std::erase to get list of unique exit blocks. The output is dependent on addresses of basic block. Add and use Loop::getUniqueExitBlocks. llvm-svn: 29966	2006-08-29 22:29:16 +00:00
Owen Anderson	a8a2e5c666	Clean up a bit. llvm-svn: 29950	2006-08-29 06:10:56 +00:00
Nick Lewycky	b2e8ae1700	Add PredicateSimplifier pass. Collapses equal variables into one form and simplifies expressions. This implements the optimization described in PR807. llvm-svn: 29947	2006-08-28 22:44:55 +00:00
Owen Anderson	62c84fe371	Make LoopUnroll fold excessive BasicBlocks. This results in a significant speedup of gccas on 252.eon llvm-svn: 29936	2006-08-28 02:09:46 +00:00
Chris Lattner	97c9f20c52	simplify AnalysisGroup registration, eliminating one typeid call. llvm-svn: 29932	2006-08-28 00:42:29 +00:00
Chris Lattner	c2d3d3112e	eliminate RegisterOpt. It does the same thing as RegisterPass. llvm-svn: 29925	2006-08-27 22:42:52 +00:00
Chris Lattner	3d27be1333	s|llvm/Support/Visibility.h|llvm/Support/Compiler.h| llvm-svn: 29911	2006-08-27 12:54:02 +00:00
Owen Anderson	403b95af47	Fix a crash related to updating Phi nodes in the original header block. This was causing a crash in 175.vpr llvm-svn: 29887	2006-08-25 22:13:55 +00:00
Owen Anderson	8e4b029573	Add an assertion to check that we're really preserving LCSSA. llvm-svn: 29886	2006-08-25 22:12:36 +00:00
Owen Anderson	8cca95cf5d	Reapply the indvars patch, since nothing blew up last night. llvm-svn: 29874	2006-08-25 17:41:25 +00:00
Owen Anderson	94446a4267	Revert my previous patch. Since there are some major changes that went in today, I'm going to wait to put this in HEAD until tomorrow, so as not to clutter the nightly tester. llvm-svn: 29868	2006-08-25 03:45:57 +00:00
Owen Anderson	15a6423431	Specify that indvars actually preserve LCSSA. This has been done for a while, but I forgot to put in the analysis usage. llvm-svn: 29867	2006-08-25 03:32:13 +00:00
Owen Anderson	e001d811ba	Implement unrolling of multiblock loops. This significantly improves the utility of the LoopUnroll pass. Also, add a testcase for multiblock-loop unrolling. llvm-svn: 29859	2006-08-24 21:28:19 +00:00
Reid Spencer	5495fe8dd6	Fix a grammaro in a comment. llvm-svn: 29765	2006-08-18 09:01:07 +00:00
Chris Lattner	6441cf93c9	Handle single-entry PHI nodes correctly. This fixes PR877 and Transforms/CondProp/2006-08-14-SingleEntryPhiCrash.ll llvm-svn: 29673	2006-08-14 21:38:05 +00:00
Chris Lattner	f18b396cc2	Don't attempt to split subloops out of a loop with a huge number of backedges. Not only will this take huge amounts of compile time, the resultant loop nests won't be useful for optimization. This reduces loopsimplify time on Transforms/LoopSimplify/2006-08-11-LoopSimplifyLongTime.ll from ~32s to ~0.4s with a debug build of llvm on a 2.7Ghz G5. llvm-svn: 29647	2006-08-12 05:25:00 +00:00
Chris Lattner	85d9944f9a	Reimplement the loopsimplify code which deletes edges from unreachable blocks that target loop blocks. Before, the code was run once per loop, and depended on the number of predecessors each block in the loop had. Unfortunately, scanning preds can be really slow when huge numbers of phis exist or when phis with huge numbers of inputs exist. Now, the code is run once per function and scans successors instead of preds, which is far faster. In addition, the new code is simpler and is goto free, woo. This change speeds up a nasty testcase Duraid provided me from taking hours to taking ~72s with a debug build. The functionality this implements is already tested in the testsuite as Transforms/CodeExtractor/2004-03-13-LoopExtractorCrash.ll. llvm-svn: 29644	2006-08-12 04:51:20 +00:00
Reid Spencer	2b6d18a64f	Make this example pass use some things from lib/Support (EscapeString, SlowOperatingInfo, Statistics). Besides providing an example of how to use these facilities, it also serves to debug problems with runtime linking when dlopening a loadable module. These three support facilities exercise different combinations of Text/Weak Weak/Text and Text/Text linking between the executable and the module. llvm-svn: 29552	2006-08-07 23:17:24 +00:00
Reid Spencer	e6458c3fb2	For PR780: 1. Change the usage of LOADABLE_MODULE so that it implies all the things necessary to make a loadable module. This reduces the user's burden to get a loadable module correctly built. 2. Document the usage of LOADABLE_MODULE in the MakefileGuide 3. Adjust the makefile for lib/Transforms/Hello to use the new specification for building loadable modules 4. Adjust the sample project to not attempt to build a shared library for its little library. This was just wasteful and not instructive at all. llvm-svn: 29551	2006-08-07 23:12:15 +00:00

2736 Commits