llvm-project

Commit Graph

Author	SHA1	Message	Date
Devang Patel	d3ccc073a2	Mem2Reg does not need TargetData. llvm-svn: 36444	2007-04-25 18:32:35 +00:00
Devang Patel	073be55d8e	Remove unused function argument. llvm-svn: 36441	2007-04-25 17:15:20 +00:00
Anton Korobeynikov	a97b694c82	Implement aliases. This fixes PR1017 and its dependent bugs. CFE part will follow. llvm-svn: 36435	2007-04-25 14:27:10 +00:00
Chris Lattner	827cb98a0a	If an alloca only has two types of uses: 1) reads 2) a memcpy/memmove that copies from a constant global, then we can change the reads to read from the global instead of from the alloca. This eliminates the alloca and the memcpy, and promotes secondary optimizations (because the loads are now loads from a constant global). This is important for a common C idiom: void foo() { int A[] = {1,2,3,4,5,6,7,8,9...}; ... only reads of A ... } For some reason, people forget to mark the array static or const. This triggers on these multisource benchmarks: JM/ldecode: block_pos, [3 x [4 x [4 x i32]]] FreeBench/mason: m, [18 x i32], inlined 4 times MiBench/office-stringsearch: search_strings, [1332 x i8] MiBench/office-stringsearch: find_strings, [1333 x i8] Prolangs-C++/city: dirs, [9 x i8], inlined 4 places and these spec benchmarks: 177.mesa: message, [8 x [32 x i8]] 186.crafty: bias_rl45, [64 x i32] 186.crafty: diag_sq, [64 x i32] 186.crafty: empty, [9 x i8] 186.crafty: xlate, [15 x i8] 186.crafty: status, [13 x i8] 186.crafty: bdinfo, [25 x i8] 445.gobmk: routines, [16 x i8] 458.sjeng: piece_rep, [14 x i8*] 458.sjeng: t, [13 x i32], inlined 4 places. 464.h264ref: block8x8_idx, [3 x [4 x [4 x i32]]] 464.h264ref: block_pos, [3 x [4 x [4 x i32]]] 464.h264ref: j_off_tab, [12 x i32] This implements Transforms/ScalarRepl/memcpy-from-global.ll llvm-svn: 36429	2007-04-25 06:40:51 +00:00
Chris Lattner	31e5addb67	refactor the SROA code out into its own method, no functionality change. llvm-svn: 36426	2007-04-25 05:02:56 +00:00
Owen Anderson	510fefcd8a	Undo my previous changes. Since my approach to this problem is being revised, this approach is no longer appropriate. llvm-svn: 36421	2007-04-25 04:18:54 +00:00
Devang Patel	d3208523b2	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048376.html llvm-svn: 36417	2007-04-25 00:37:04 +00:00
Owen Anderson	c24701ed7f	Rollback some changes that adversely affected performance. I'm currently rethinking my approach to this, so hopefully I'll find a way to do this without making this slower. llvm-svn: 36392	2007-04-24 06:40:39 +00:00
Devang Patel	38bc86f057	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048333.html llvm-svn: 36380	2007-04-23 22:42:03 +00:00
Owen Anderson	64995e1b3f	Make PredicateSimplifier not use DominatorTree. llvm-svn: 36300	2007-04-21 07:38:12 +00:00
Owen Anderson	2965adb849	Fix a comment. llvm-svn: 36299	2007-04-21 07:12:44 +00:00
Jeff Cohen	5959f42498	Comment out usage of write() for now. llvm-svn: 36287	2007-04-20 22:40:10 +00:00
Devang Patel	83a3adcc3f	Avoid recursion. llvm-svn: 36272	2007-04-20 20:04:37 +00:00
Owen Anderson	2da606c757	Move more passes to using ETForest instead of DominatorTree. llvm-svn: 36271	2007-04-20 06:27:13 +00:00
Zhou Sheng	aafe4e216e	Make use of ConstantInt::isZero instead of ConstantInt::isNullValue. llvm-svn: 36261	2007-04-19 05:39:12 +00:00
Zhou Sheng	82fcf3cb5f	Make the operations of APInt variables more efficient. llvm-svn: 36260	2007-04-19 05:35:00 +00:00
Evan Cheng	db9b65d67a	Revert Owen's last check-in. This is breaking Mac OS X / PPC llvm-gcc bootstrap. llvm-svn: 36258	2007-04-18 22:39:00 +00:00
Owen Anderson	9421f03959	Revert changes that caused breakage. llvm-svn: 36255	2007-04-18 06:46:57 +00:00
Owen Anderson	9a6091dec1	Switch more uses of DominatorTree over to ETForest. llvm-svn: 36254	2007-04-18 05:43:13 +00:00
Owen Anderson	550e8db9c7	Use ETForest instead of DominatorTree. llvm-svn: 36252	2007-04-18 05:25:43 +00:00
Owen Anderson	fc40d446c9	Use ETForest instead of DominatorTree. llvm-svn: 36249	2007-04-18 04:55:33 +00:00
Owen Anderson	08293fd6d1	Use new ETForest accessor. llvm-svn: 36248	2007-04-18 04:46:35 +00:00
Owen Anderson	f38f2f2394	Use ETForest instead of DominatorTree. llvm-svn: 36247	2007-04-18 04:39:32 +00:00
Dan Gohman	2ce1116b33	Spell doFinalization right, so that it is a proper virtual override and gets called. llvm-svn: 36208	2007-04-17 18:21:36 +00:00
Chris Lattner	233f97ac6a	remove use of BasicBlock::getNext llvm-svn: 36205	2007-04-17 18:09:47 +00:00
Chris Lattner	24e2d9ca03	remove use of BasicBlock::getNext llvm-svn: 36202	2007-04-17 17:54:12 +00:00
Chris Lattner	cd9bda71a0	eliminate use of Instruction::getNext() llvm-svn: 36200	2007-04-17 17:51:03 +00:00
Chris Lattner	77a3edcb92	remove use of Instruction::getNext llvm-svn: 36199	2007-04-17 17:47:54 +00:00
Devang Patel	abdff3fecd	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070416/047888.html llvm-svn: 36182	2007-04-16 23:03:45 +00:00
Anton Korobeynikov	fb80151c42	Removed tabs everywhere except autogenerated & external files. Add make target for tabs checking. llvm-svn: 36146	2007-04-16 18:10:23 +00:00
Chris Lattner	343c88cdb9	Fix PR1335 and Transforms/Inline/2007-04-15-InlineEH.ll llvm-svn: 36090	2007-04-15 21:38:06 +00:00
Owen Anderson	f35a1dbc7a	Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for constructing ImmediateDominator is now folded into DomTree construction. This is part of the ongoing work for PR217. llvm-svn: 36063	2007-04-15 08:47:27 +00:00
Chris Lattner	f8a7bf317e	fix SimplifyLibCalls/IsDigit.ll llvm-svn: 36047	2007-04-15 05:38:40 +00:00
Chris Lattner	4a6e0cbd41	Extend store merging to support the 'if/then' version in addition to if/then/else. This sinks the two stores in this example into a single store in cond_next. In this case, it allows elimination of the load as well: store double 0.000000e+00, double* @s.3060 %tmp3 = fcmp ogt double %tmp1, 5.000000e-01 ; <i1> [#uses=1] br i1 %tmp3, label %cond_true, label %cond_next cond_true: ; preds = %entry store double 1.000000e+00, double* @s.3060 br label %cond_next cond_next: ; preds = %entry, %cond_true %tmp6 = load double* @s.3060 ; <double> [#uses=1] This implements Transforms/InstCombine/store-merge.ll:test2 llvm-svn: 36040	2007-04-15 01:02:18 +00:00
Chris Lattner	14a251b937	refactor some code, no functionality change. llvm-svn: 36037	2007-04-15 00:07:55 +00:00
Chris Lattner	28d921d04f	fix long lines llvm-svn: 36031	2007-04-14 23:32:02 +00:00
Chris Lattner	7bfdd0abe1	Implement Transforms/InstCombine/vec_extract_elt.ll, transforming: define i32 @test(float %f) { %tmp7 = insertelement <4 x float> undef, float %f, i32 0 %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32> %tmp19 = extractelement <4 x i32> %tmp17, i32 0 ret i32 %tmp19 } into: define i32 @test(float %f) { %tmp19 = bitcast float %f to i32 ; <i32> [#uses=1] ret i32 %tmp19 } On PPC, this is the difference between: _test: mfspr r2, 256 oris r3, r2, 8192 mtspr 256, r3 stfs f1, -16(r1) addi r3, r1, -16 addi r4, r1, -32 lvx v2, 0, r3 stvx v2, 0, r4 lwz r3, -32(r1) mtspr 256, r2 blr and: _test: stfs f1, -4(r1) nop nop nop lwz r3, -4(r1) blr llvm-svn: 36025	2007-04-14 23:02:14 +00:00
Chris Lattner	b37fb6a0da	Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn unsigned test(float f) { return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f )); } into: _test: movss 4(%esp), %xmm0 mulss %xmm0, %xmm0 movd %xmm0, %eax ret instead of: _test: movss 4(%esp), %xmm0 mulss %xmm0, %xmm0 xorps %xmm1, %xmm1 movss %xmm0, %xmm1 movd %xmm1, %eax ret GCC gets: _test: subl $28, %esp movss 32(%esp), %xmm0 mulss %xmm0, %xmm0 xorps %xmm1, %xmm1 movss %xmm0, %xmm1 movaps %xmm1, %xmm0 movd %xmm0, 12(%esp) movl 12(%esp), %eax addl $28, %esp ret llvm-svn: 36020	2007-04-14 22:29:23 +00:00
Chris Lattner	a6b5660209	avoid copying sets and vectors around. llvm-svn: 36017	2007-04-14 22:10:17 +00:00
Chris Lattner	6f58839b20	avoid iterator invalidation. llvm-svn: 36002	2007-04-14 18:06:52 +00:00
Jeff Cohen	4bd0fd367a	An even better fix. llvm-svn: 35998	2007-04-14 17:18:29 +00:00
Jeff Cohen	7233aa9369	Fix recent regression that broke several llvm-tests. llvm-svn: 35996	2007-04-14 16:55:19 +00:00
Chris Lattner	49fa8d2bff	Implement a few missing xforms: printf("foo\n") -> puts. printf("x") -> putchar printf("") -> noop. Still need to do the xforms for fprintf. This implements Transforms/SimplifyLibCalls/Printf.ll llvm-svn: 35984	2007-04-14 01:17:48 +00:00
Chris Lattner	02137eec8f	in addition to merging, constantmerge should also delete trivially dead globals, in order to clean up after simplifylibcalls. llvm-svn: 35982	2007-04-14 01:11:54 +00:00
Chris Lattner	efb33d28c6	Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll llvm-svn: 35981	2007-04-14 00:20:02 +00:00
Chris Lattner	164b76565b	use an accessor to simplify code. llvm-svn: 35979	2007-04-14 00:17:39 +00:00
Chris Lattner	efd3051d60	Now that codegen prepare isn't defeating me, I can finally fix what I set out to do! :) This fixes a problem where LSR would insert a bunch of code into each MBB that uses a particular subexpression (e.g. IV+base+C). The problem is that this code cannot be CSE'd back together if inserted into different blocks. This patch changes LSR to attempt to insert a single copy of this code and share it, allowing codegenprepare to duplicate the code if it can be sunk into various addressing modes. On CodeGen/ARM/lsr-code-insertion.ll, for example, this gives us code like: add r8, r0, r5 str r6, [r8, #+4] .. ble LBB1_4 @cond_next LBB1_3: @cond_true str r10, [r8, #+4] LBB1_4: @cond_next ... LBB1_5: @cond_true55 ldr r6, LCPI1_1 str r6, [r8, #+4] instead of: add r10, r0, r6 str r8, [r10, #+4] ... ble LBB1_4 @cond_next LBB1_3: @cond_true add r8, r0, r6 str r10, [r8, #+4] LBB1_4: @cond_next ... LBB1_5: @cond_true55 add r8, r0, r6 ldr r10, LCPI1_1 str r10, [r8, #+4] Besides being smaller and more efficient, this makes it immediately obvious that it is profitable to predicate LBB1_3 now :) llvm-svn: 35972	2007-04-13 20:42:26 +00:00
Chris Lattner	feee64e997	Completely rewrite addressing-mode related sinking of code. In particular, this fixes problems where codegenprepare would sink expressions into load/stores that are not valid, and fixes cases where it would miss important valid ones. This fixes several serious codesize and perf issues, particularly on targets with complex addressing modes like arm and x86. For example, now we compile CodeGen/X86/isel-sink.ll to: _test: movl 8(%esp), %eax movl 4(%esp), %ecx cmpl $1233, %eax ja LBB1_2 #F LBB1_1: #T movl $4, (%ecx,%eax,4) movl $141, %eax ret LBB1_2: #F movl (%ecx,%eax,4), %eax ret instead of: _test: movl 8(%esp), %eax leal (,%eax,4), %ecx addl 4(%esp), %ecx cmpl $1233, %eax ja LBB1_2 #F LBB1_1: #T movl $4, (%ecx) movl $141, %eax ret LBB1_2: #F movl (%ecx), %eax ret llvm-svn: 35970	2007-04-13 20:30:56 +00:00
Devang Patel	38705d5494	Remove use of SlowOperationInformer. llvm-svn: 35967	2007-04-13 18:58:18 +00:00
Devang Patel	b730fe57bf	Undo previous check-in. llvm-svn: 35966	2007-04-13 18:35:15 +00:00

3129 Commits