Commit Graph

7591 Commits

Author SHA1 Message Date
Owen Anderson 459e079912 Remove dead code that I apparently wrote a while back. We seem to be doing well enough
without whatever this was trying to do.  When/if someone has the time to do some empirical
evaluations, it might be worth figuring out what this code was trying to do and seeing if
it's worth resurrecting/fixing.

llvm-svn: 123684
2011-01-17 22:39:54 +00:00
Cameron Zwarich b410858a5f Roll r123609 back in with two changes that fix test failures with expensive
checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.

llvm-svn: 123662
2011-01-17 17:38:41 +00:00
Cameron Zwarich 67431d7943 Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot.
llvm-svn: 123618
2011-01-17 07:26:51 +00:00
Cameron Zwarich 814cd9233e Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup, around 25%, but that was before I moved the
dominator tree level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.

llvm-svn: 123609
2011-01-17 01:08:59 +00:00
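
For context: the iterated dominance frontier (IDF) of an alloca's defining blocks is
exactly the set of blocks where mem2reg must place phi nodes. A minimal hypothetical
example (not this commit's testcase):

define i32 @example(i1 %c, i32 %a, i32 %b) {
entry:
  %x = alloca i32
  br i1 %c, label %then, label %else

then:                                             ; defining block (stores %x)
  store i32 %a, i32* %x
  br label %join

else:                                             ; defining block (stores %x)
  store i32 %b, i32* %x
  br label %join

join:                                             ; IDF of {then, else}: phi goes here
  %v = load i32* %x
  ret i32 %v
}

After promotion the alloca vanishes and join starts with
%v = phi i32 [ %a, %then ], [ %b, %else ]. The new code discovers join blocks like
this by walking the dominator tree with a priority queue keyed on tree level, rather
than consulting precomputed dominance frontiers.
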
Anders Carlsson d3db83349e Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undef value instead.
llvm-svn: 123596
2011-01-16 21:25:33 +00:00
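
A sketch of the new behavior on a hypothetical function whose first argument is dead
(names are made up; where DAE is allowed to change the signature it already deletes
dead arguments outright):

define i32 @callee(i32 %unused, i32 %x) {
  ret i32 %x
}

define i32 @caller() {
  ; was: call i32 @callee(i32 17, i32 42)
  %r = call i32 @callee(i32 undef, i32 42)   ; undef passed for the unused arg
  ret i32 %r
}
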
Chris Lattner 7c9f4c9c2b tidy up a comment, as suggested by Duncan
llvm-svn: 123590
2011-01-16 17:46:19 +00:00
Rafael Espindola 751677a040 Don't merge two constants if we care about the address of both.
This fixes the original testcase in PR8927. It also causes a clang
binary built with a patched clang to increase in size by 0.21%.

We can probably get some of the size back by writing a pass that
detects that a global never has its pointer compared and adds
unnamed_addr to it (maybe extend global opt). It is also possible that
there are some other cases in which clang could add unnamed_addr.

I will investigate extending globalopt next.

llvm-svn: 123584
2011-01-16 17:05:09 +00:00
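
Illustrative IR (hypothetical globals): merging identical constants is only safe when
at least one of them is known not to have a significant address.

@a = internal constant [4 x i8] c"abc\00"              ; address may be compared
@b = internal constant [4 x i8] c"abc\00"              ; address may be compared
; @a and @b are NOT merged: code may compare &@a with &@b

@c = internal unnamed_addr constant [4 x i8] c"abc\00" ; address not significant
; @c CAN be merged: every use of @c may be rewritten to @a
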
Chris Lattner e5f8de8639 fix PR8932, a case where arg promotion could infinitely promote.
llvm-svn: 123574
2011-01-16 08:09:24 +00:00
Chris Lattner ed1fb92cfe simplify a little
llvm-svn: 123573
2011-01-16 07:11:21 +00:00
Chris Lattner 6fab2e9418 if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase is now optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

Before, we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.

llvm-svn: 123571
2011-01-16 06:18:28 +00:00
Chris Lattner 7cd8cf7d24 Use an irbuilder to get some trivial constant folding when doing a store
of a constant.

llvm-svn: 123570
2011-01-16 05:58:24 +00:00
Chris Lattner adb1a233b1 remove a dead check, this was needed before we had an explicit veto on uses of phis.
llvm-svn: 123569
2011-01-16 05:37:55 +00:00
Chris Lattner d55581ded8 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
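
A hypothetical sketch of the multi-use case: every use of the phi is the same
operation, so the operation can be evaluated on the incoming values instead.

; before: %p has two uses, but both are the identical add
join:
  %p = phi i32 [ 1, %a ], [ 2, %b ]
  %u1 = add i32 %p, 4
  %u2 = add i32 %p, 4

; after FoldOpIntoPhi: the add is folded into the incoming values,
; and %u1/%u2 are replaced by the phi itself
join:
  %p = phi i32 [ 5, %a ], [ 6, %b ]
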
Chris Lattner ea7131a062 remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the
first line of the function because it isn't a good idea, even for compares.

llvm-svn: 123566
2011-01-16 05:14:26 +00:00
Chris Lattner ff2e737714 more cleanups: use the IR builder.
llvm-svn: 123565
2011-01-16 05:08:00 +00:00
Chris Lattner 25ce280511 tidy up code.
llvm-svn: 123564
2011-01-16 04:37:29 +00:00
Owen Anderson 4e54efd625 Improve the safety of my globalopt enhancement by ensuring that the bitcast
of the stored value to the new store type is always.  Also, add a testcase.

llvm-svn: 123563
2011-01-16 04:33:33 +00:00
Chris Lattner 8b4952fcf7 simplify this code; it is still broken, but I will follow up on llvm-commits.
llvm-svn: 123558
2011-01-16 02:05:10 +00:00
Chris Lattner 1e209b87ad remove the partial specialization pass. It is unmaintained and has bugs.
llvm-svn: 123554
2011-01-16 00:27:10 +00:00
Nick Lewycky 4a1ff16b29 Add missing whitespace.
llvm-svn: 123543
2011-01-15 18:42:52 +00:00
Nick Lewycky 0296a481f9 Make constmerge a two-pass algorithm so that it won't miss merging
opportunities. Fixes PR8978.

llvm-svn: 123541
2011-01-15 18:14:21 +00:00
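
For reference, the kind of opportunity involved (hypothetical globals). The first
pass now identifies canonical constants across the whole module before the second
pass rewrites uses, so visitation order no longer matters.

@s1 = internal constant [6 x i8] c"hello\00"
@s2 = internal constant [6 x i8] c"hello\00"
; after constmerge: uses of @s2 are rewritten to @s1, and @s2 is deleted
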
Benjamin Kramer ed5f2e504e Try to unbreak selfhost.
llvm-svn: 123537
2011-01-15 11:25:34 +00:00
Nick Lewycky 540f9536c8 Add a cache that protects mergefunc's internals from more surprises in DenseSet.
Also, replace tabs with spaces. Yes, it's 2011.

llvm-svn: 123535
2011-01-15 10:16:23 +00:00
Chris Lattner af26390790 temporarily revert r123526. While working on a follow-on patch I
realized that ConstantFoldTerminator doesn't preserve dominfo.

llvm-svn: 123527
2011-01-15 07:51:19 +00:00
Chris Lattner 8df83c4a24 fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch of dead computation feeding
conditional branches isn't such a good idea.  Just fold branches on constants
into unconditional branches.

llvm-svn: 123526
2011-01-15 07:36:13 +00:00
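
The fold itself is straightforward; a hypothetical example of a block left behind by
earlier transforms:

; before: the constant condition keeps the dead %trap path (and whatever
; computation fed the condition) alive all the way to isel
entry:
  br i1 true, label %cont, label %trap

; after: an unconditional branch; %trap and the feeding computation die
entry:
  br label %cont
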
Chris Lattner ee588defc6 simplify code, no functionality change.
llvm-svn: 123525
2011-01-15 07:29:01 +00:00
Chris Lattner 1b93be501d Now that instruction optimizations can update the iterator as they go, we can
have objectsize folding recursively simplify away its result when it
folds.  It is important to catch this here, because otherwise we won't
eliminate the cross-block values at isel and other times.

llvm-svn: 123524
2011-01-15 07:25:29 +00:00
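
A sketch of the fold in question, assuming the 2011-era signature of the intrinsic:

declare i32 @llvm.objectsize.i32(i8*, i1)

define i32 @f() {
  %buf = alloca [64 x i8]
  %p = getelementptr inbounds [64 x i8]* %buf, i32 0, i32 0
  ; folds to i32 64; with the iterator updated in place, users of
  ; %size can be recursively simplified in the same walk
  %size = call i32 @llvm.objectsize.i32(i8* %p, i1 false)
  ret i32 %size
}
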
Chris Lattner 7a2771440f make the current instruction iterator an ivar, allowing xforms that
potentially invalidate it (like inline asm lowering) to be sunk into
their proper place, cleaning up a ton of code.

llvm-svn: 123523
2011-01-15 07:14:54 +00:00
Chris Lattner 9c10d587f6 implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586, which observed that we used to compile:

union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Chris Lattner e20dd530d0 one more instcombine variant that is needed to work with future changes;
no functionality change currently.

llvm-svn: 123517
2011-01-15 05:50:18 +00:00
Chris Lattner 497459d5fd fix typo
llvm-svn: 123516
2011-01-15 05:42:47 +00:00
Chris Lattner f3c4eefff8 Catch ~x < cst just like ~x < ~y; we currently handle this through
means that are about to disappear.

llvm-svn: 123515
2011-01-15 05:41:33 +00:00
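
The underlying identity: since ~x = -1 - x, the comparison ~x < C is equivalent to
x > ~C. A hypothetical instance:

; before
%not = xor i32 %x, -1
%cmp = icmp slt i32 %not, 42

; after: ~x <s 42 becomes x >s ~42, i.e. x >s -43
%cmp = icmp sgt i32 %x, -43
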
Chris Lattner 311aa63c87 reduce indentation
llvm-svn: 123514
2011-01-15 05:40:29 +00:00
Chris Lattner b68ec5c339 Generalize LoadAndStorePromoter a bit and switch LICM
to use it.

llvm-svn: 123501
2011-01-15 00:12:35 +00:00
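
LICM's scalar promotion is the natural client of the shared logic; a hypothetical
before/after:

; before: @g is loaded and stored on every iteration
loop:
  %v = load i32* @g
  %inc = add i32 %v, 1
  store i32 %inc, i32* @g
  br i1 %done, label %exit, label %loop

; after promotion: one load in the preheader, a phi in the loop,
; one store in the exit block
preheader:
  %g.promoted = load i32* @g
  br label %loop
loop:
  %v = phi i32 [ %g.promoted, %preheader ], [ %inc, %loop ]
  %inc = add i32 %v, 1
  br i1 %done, label %exit, label %loop
exit:
  store i32 %inc, i32* @g
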
Owen Anderson 3e2f6cf7ae Fix a false-positive warning.
llvm-svn: 123480
2011-01-14 22:31:13 +00:00
Owen Anderson 9eb7cb48e4 Enhance GlobalOpt to be able to evaluate initializers that involve stores through
bitcasts, at least in simple cases.  This fixes clang's CodeGenCXX/virtual-base-dtor.cpp

llvm-svn: 123477
2011-01-14 22:19:20 +00:00
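
Roughly the shape of what the static evaluator can now handle (hypothetical IR; the
motivating C++ case stores vtable pointers through bitcast pointers in base-class
destructors):

@x = global i32 0
@p = global i8* null
@llvm.global_ctors = appending global [1 x { i32, void ()* }]
  [{ i32, void ()* } { i32 65535, void ()* @init }]

define internal void @init() {
  ; a store through a bitcast of a global's address, now evaluated
  ; at compile time into @p's initializer
  store i32* @x, i32** bitcast (i8** @p to i32**)
  ret void
}

; after GlobalOpt: @p = global i8* bitcast (i32* @x to i8*), and @init is gone
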
Chris Lattner b498f9aff3 switch SROA to use LoadAndStorePromoter instead of its own copy of the code.
llvm-svn: 123457
2011-01-14 19:50:47 +00:00
Chris Lattner 95294b8796 Add a new LoadAndStorePromoter class, which implements the general
"promote a bunch of load and stores" logic, allowing the code to
be shared and reused.

llvm-svn: 123456
2011-01-14 19:36:13 +00:00
Chris Lattner 9987a6f49b split SROA into two passes: one that uses DomFrontiers (-scalarrepl)
and one that uses SSAUpdater (-scalarrepl-ssa)

llvm-svn: 123436
2011-01-14 08:13:00 +00:00
Chris Lattner 543384efb4 Implement full support for promoting allocas to registers using SSAUpdater
instead of DomTree/DomFrontier.  This may be interesting for reducing compile 
time.  This is currently disabled, but seems to work just fine.

When this is enabled, we eliminate two runs of dominator frontier, one in the
"early per-function" optimizations and one in the "interlaced with inliner"
function passes.

llvm-svn: 123434
2011-01-14 07:50:47 +00:00
Chris Lattner 90f3a9a1c7 indentation
llvm-svn: 123426
2011-01-14 04:23:53 +00:00
Duncan Sands 7f60dc1eb0 Move some shift transforms out of instcombine and into InstructionSimplify.
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example, if X is 2 then the top two bits of the result must be equal, so the
result cannot be an arbitrary value.  I fixed this in the constant folder as well.  Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero.  This is in accordance with the LangRef, but I must
admit that it is fairly aggressive.  Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.

llvm-svn: 123417
2011-01-14 00:37:45 +00:00
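
The folds described, written out as IR (hypothetical examples):

%a = shl i32 %x, undef   ; folds to undef, even though %x might be 0
%b = shl i32 %x, 32      ; shift by the bit width: folds to undef per the LangRef

; but "ashr i32 undef, %n" must not fold to undef: for %n = 2 the top two
; bits of any result are equal (copies of a sign bit), so the result is
; constrained and undef would claim too much
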
Bob Wilson 328e91bbe1 Fix whitespace.
llvm-svn: 123396
2011-01-13 20:59:44 +00:00
Bob Wilson c8056a952e Check for empty structs, and for consistency, zero-element arrays.
llvm-svn: 123383
2011-01-13 18:26:59 +00:00
Bob Wilson 08713d3c5f Extend SROA to handle arrays accessed as homogeneous structs and vice versa.
This is a minor extension of SROA to handle a special case that is
important for some ARM NEON operations.  Some of the NEON intrinsics
return multiple values, which are handled as struct types containing
multiple elements of the same vector type.  The corresponding return
types declared in the arm_neon.h header have equivalent arrays.  We
need SROA to recognize that it can split up those arrays and structs
into separate vectors, even though they are not always accessed with
the same type.  SROA already handles loads and stores of an entire
alloca by using insertvalue/extractvalue to access the individual
pieces, and that code works the same regardless of whether the type
is a struct or an array.  So, all that needs to be done is to check
for compatible arrays and homogeneous structs.

llvm-svn: 123381
2011-01-13 17:45:11 +00:00
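
A hypothetical sketch of the pattern (modeled on NEON-style code, not the actual
testcase):

; built as a struct of two vectors, but held in an array-typed alloca
%tmp = alloca [2 x <4 x i32>]
%p = bitcast [2 x <4 x i32>]* %tmp to { <4 x i32>, <4 x i32> }*
store { <4 x i32>, <4 x i32> } %val, { <4 x i32>, <4 x i32> }* %p
%ep = getelementptr [2 x <4 x i32>]* %tmp, i32 0, i32 0
%elt = load <4 x i32>* %ep
; SROA can now treat the [2 x <4 x i32>] array and the homogeneous
; struct as compatible, splitting %tmp into two <4 x i32> values
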
Bob Wilson 12eec40c83 Make SROA more aggressive with allocas containing padding.
SROA only splits up structs and arrays one level at a time, so padding can
only cause trouble if it is located between the struct or array elements.

llvm-svn: 123380
2011-01-13 17:45:08 +00:00
Devang Patel 30f3ebbc1f Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior.
llvm-svn: 123318
2011-01-12 19:12:45 +00:00
Chris Lattner dd5f60b7a7 revert 123144, reenabling the rest of memset formation.
llvm-svn: 123302
2011-01-12 03:25:15 +00:00
Chris Lattner 654098f411 revert r123146, which disabled code that wasn't the root cause
of the bootstrap miscompare issue.

llvm-svn: 123299
2011-01-12 01:52:23 +00:00
Chris Lattner fa7c29d255 revert r123149, reenabling an improvement to memcpyopt that wasn't
the source of the bootstrap problem.

llvm-svn: 123298
2011-01-12 01:43:46 +00:00