llvm-project

Commit Graph

Author	SHA1	Message	Date
Jay Foad	5ded9df82a	Remove unnecessary specialization OperandTraits<User>. llvm-svn: 123577	2011-01-16 08:23:16 +00:00
Jay Foad	59809c7a62	Move the implementation of the User class into a new source file, User.cpp. llvm-svn: 123575	2011-01-16 08:10:57 +00:00
Chris Lattner	e5f8de8639	fix PR8932, a case where arg promotion could infinitely promote. llvm-svn: 123574	2011-01-16 08:09:24 +00:00
Chris Lattner	ed1fb92cfe	simplify a little llvm-svn: 123573	2011-01-16 07:11:21 +00:00
Chris Lattner	c326ebd118	add some commentary llvm-svn: 123572	2011-01-16 06:39:44 +00:00
Chris Lattner	6fab2e9418	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	7cd8cf7d24	Use an irbuilder to get some trivial constant folding when doing a store of a constant. llvm-svn: 123570	2011-01-16 05:58:24 +00:00
Chris Lattner	adb1a233b1	remove a dead check, this was needed before we had an explicit veto on uses of phis. llvm-svn: 123569	2011-01-16 05:37:55 +00:00
Chris Lattner	d55581ded8	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Evan Cheng	572756ac11	Spill R4 if it's going to be used to restore SP from FP. llvm-svn: 123567	2011-01-16 05:14:33 +00:00
Chris Lattner	ea7131a062	remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the first line of the function because it isn't a good idea, even for compares. llvm-svn: 123566	2011-01-16 05:14:26 +00:00
Chris Lattner	ff2e737714	more cleanups: use the IR builder. llvm-svn: 123565	2011-01-16 05:08:00 +00:00
Chris Lattner	25ce280511	tidy up code. llvm-svn: 123564	2011-01-16 04:37:29 +00:00
Owen Anderson	4e54efd625	Improve the safety of my globalopt enhancement by ensuring that the bitcast of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563	2011-01-16 04:33:33 +00:00
Chris Lattner	08f43456c9	fix PR8983, a broken assertion. llvm-svn: 123562	2011-01-16 03:43:53 +00:00
Venkatraman Govindaraju	1b0e2cbf3f	Implement AnalyzeBranch in Sparc Backend. llvm-svn: 123561	2011-01-16 03:15:11 +00:00
Chris Lattner	218092e68e	fix PR8981, a crash trying to form a conditional inc with a floating point compare. llvm-svn: 123560	2011-01-16 02:56:53 +00:00
Chris Lattner	2d186574a6	reapply my fix for PR8961 with a tweak to properly handle multi-instruction sequences like calls. Many thanks to Jakob for finding a testcase. llvm-svn: 123559	2011-01-16 02:27:38 +00:00
Chris Lattner	8b4952fcf7	simplify this code, it is still broken but will follow up on llvm-commits. llvm-svn: 123558	2011-01-16 02:05:10 +00:00
Michael J. Spencer	2ff30b84f8	Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1." llvm-svn: 123557	2011-01-16 01:43:22 +00:00
Chandler Carruth	ef28abefd0	Simplify a README.txt entry significantly to expose the core issue. llvm-svn: 123556	2011-01-16 01:40:23 +00:00
Chris Lattner	c703334ff1	one of michael's recent patches broke this, temporarily disable it so the bots go green llvm-svn: 123555	2011-01-16 01:04:49 +00:00
Chris Lattner	1e209b87ad	remove the partial specialization pass. It is unmaintained and has bugs. llvm-svn: 123554	2011-01-16 00:27:10 +00:00
Michael J. Spencer	53dcdc7420	Archive: Fix spelling. llvm-svn: 123552	2011-01-15 21:43:45 +00:00
Michael J. Spencer	a0ce763290	Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1. llvm-svn: 123551	2011-01-15 21:43:37 +00:00
Michael J. Spencer	8685f387eb	Support/GraphWriter: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1. llvm-svn: 123550	2011-01-15 21:43:25 +00:00
Benjamin Kramer	bec03ea725	Add an assert so we don't silently miscompile ctpop for bit widths > 128. llvm-svn: 123549	2011-01-15 21:19:37 +00:00
Michael J. Spencer	94b2ab3556	Support/PathV2: Add identify_magic. llvm-svn: 123548	2011-01-15 20:39:36 +00:00
Benjamin Kramer	fff2517edc	Reimplement CTPOP legalization with the "best" algorithm from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter, especially when counting 64 bit population on a 32 bit target. I hope this is fast enough to replace Kernighan-style counting loops even when the input is rather sparse. llvm-svn: 123547	2011-01-15 20:30:30 +00:00
Michael J. Spencer	b587180fa7	Unittests/Support/Path: Tweak test. llvm-svn: 123546	2011-01-15 18:52:49 +00:00
Michael J. Spencer	7887466adc	Support/PathV2: Implement has_magic in terms of get_magic. llvm-svn: 123545	2011-01-15 18:52:41 +00:00
Michael J. Spencer	ee1699c362	Support/PathV2: Implement get_magic. llvm-svn: 123544	2011-01-15 18:52:33 +00:00
Nick Lewycky	4a1ff16b29	Add missing whitespace. llvm-svn: 123543	2011-01-15 18:42:52 +00:00
Nick Lewycky	0296a481f9	Make constmerge a two-pass algorithm so that it won't miss merging opporuntities. Fixes PR8978. llvm-svn: 123541	2011-01-15 18:14:21 +00:00
Oscar Fuentes	25ac830e72	Make config.h.cmake similar to config.h.in Patch by arrowdodger! llvm-svn: 123539	2011-01-15 13:35:37 +00:00
Benjamin Kramer	ed5f2e504e	Try to unbreak selfhost. llvm-svn: 123537	2011-01-15 11:25:34 +00:00
Nick Lewycky	540f9536c8	Add a cache that protects mergefunc's internals from more surprises in DenseSet. Also, replace tabs with spaces. Yes, it's 2011. llvm-svn: 123535	2011-01-15 10:16:23 +00:00
Nick Lewycky	367f98f000	Teach LazyValueInfo that allocas aren't NULL. Over all of llvm-test, this saves half a million non-local queries, each of which would otherwise have triggered a linear scan over a basic block. Also fix a fixme for memory intrinsics which dereference pointers. With this, we prove that a pointer is non-null because it was dereferenced by an intrinsic 112 times in llvm-test. llvm-svn: 123533	2011-01-15 09:16:12 +00:00
Rafael Espindola	f1ed781aea	Add a clarification about merging constants with and without unnamed_addr. llvm-svn: 123530	2011-01-15 08:20:57 +00:00
Rafael Espindola	489e505adf	Allow unnamed_addr on declarations. llvm-svn: 123529	2011-01-15 08:15:00 +00:00
Chris Lattner	af26390790	temporarily revert r123526. While working on a follow-on patch I realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527	2011-01-15 07:51:19 +00:00
Chris Lattner	8df83c4a24	fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526	2011-01-15 07:36:13 +00:00
Chris Lattner	ee588defc6	simplify code, no functionality change. llvm-svn: 123525	2011-01-15 07:29:01 +00:00
Chris Lattner	1b93be501d	Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524	2011-01-15 07:25:29 +00:00
Chris Lattner	7a2771440f	make the current instruction iterator an ivar, allowing xforms that potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523	2011-01-15 07:14:54 +00:00
Chris Lattner	9c10d587f6	implement an instcombine xform that canonicalizes casts outside of and-with-constant operations. This fixes rdar://8808586 which observed that we used to compile: union xy { struct x { _Bool b[15]; } x; __attribute__((packed)) struct y { __attribute__((packed)) unsigned long b0to7; __attribute__((packed)) unsigned int b8to11; __attribute__((packed)) unsigned short b12to13; __attribute__((packed)) unsigned char b14; } y; }; struct x foo(union xy *xy) { return xy->x; } into: _foo: ## @foo movq (%rdi), %rax movabsq $1095216660480, %rcx ## imm = 0xFF00000000 andq %rax, %rcx movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000 andq %rax, %rdx movzbl %al, %esi orq %rdx, %rsi movq %rax, %rdx andq $65280, %rdx ## imm = 0xFF00 orq %rsi, %rdx movq %rax, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rdx, %rsi movl %eax, %edx andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rdx orq %rcx, %rdx movabsq $280375465082880, %rcx ## imm = 0xFF0000000000 movq %rax, %rsi andq %rcx, %rsi orq %rdx, %rsi movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000 andq %r8, %rax orq %rsi, %rax movzwl 12(%rdi), %edx movzbl 14(%rdi), %esi shlq $16, %rsi orl %edx, %esi movq %rsi, %r9 shlq $32, %r9 movl 8(%rdi), %edx orq %r9, %rdx andq %rdx, %rcx movzbl %sil, %esi shlq $32, %rsi orq %rcx, %rsi movl %edx, %ecx andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rcx movq %rdx, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rcx, %rsi movq %rdx, %rcx andq $65280, %rcx ## imm = 0xFF00 orq %rsi, %rcx movzbl %dl, %esi orq %rcx, %rsi andq %r8, %rdx orq %rsi, %rdx ret We now compile this into: _foo: ## @foo ## BB#0: ## %entry movzwl 12(%rdi), %eax movzbl 14(%rdi), %ecx shlq $16, %rcx orl %eax, %ecx shlq $32, %rcx movl 8(%rdi), %edx orq %rcx, %rdx movq (%rdi), %rax ret A small improvement :-) llvm-svn: 123520	2011-01-15 06:32:33 +00:00
Chris Lattner	c23ca1f217	fix typo llvm-svn: 123519	2011-01-15 06:27:35 +00:00
Chris Lattner	76580f0ec3	Fix m_Not and m_Neg to not match random ConstantInt's. Before these would try hard to match constants by inverting the bits and recursively matching. There are two problems with this: 1) some patterns would match when we didn't want them to (theoretical) 2) this is insanely expensive to do, and most often pointless. This was apparently useful in just 2 instcombine cases, which I added code to handle explicitly. This change speeds up 'opt' time on 176.gcc by 1% and produces bitwise identical code. llvm-svn: 123518	2011-01-15 05:52:27 +00:00
Chris Lattner	e20dd530d0	one more instcombine variant that is needed to work with future changes, no functionality change currently. llvm-svn: 123517	2011-01-15 05:50:18 +00:00
Chris Lattner	497459d5fd	fix typo llvm-svn: 123516	2011-01-15 05:42:47 +00:00

1 2 3 4 5 ...

69215 Commits