Commit Graph

86 Commits

Author SHA1 Message Date
Jay Foad d1b7849d49 Convert GetElementPtrInst to use ArrayRef.
llvm-svn: 135904
2011-07-25 09:48:08 +00:00
Jay Foad 040dd82f44 Convert IRBuilder::CreateGEP and IRBuilder::CreateInBoundsGEP to use
ArrayRef.

llvm-svn: 135761
2011-07-22 08:16:57 +00:00
Eli Friedman 911e12f505 Clean up includes of llvm/Analysis/ConstantFolding.h so it's included where it's used and not included where it isn't.
llvm-svn: 135628
2011-07-20 21:57:23 +00:00
Chris Lattner 229907cd11 land David Blaikie's patch to de-constify Type, with a few tweaks.
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Evan Cheng b94674b325 It's not safe to fold (fptrunc (sqrt (fpext x))) to (sqrtf x) if there is another use of sqrt. rdar://9763193
llvm-svn: 135058
2011-07-13 19:08:16 +00:00
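For reference, a minimal IR sketch of the blocked case (value names and the global are illustrative, not from the patch):

  %ext = fpext float %x to double
  %s   = call double @sqrt(double %ext)
  store double %s, double* @g          ; another use of %s, so the double sqrt must stay
  %t   = fptrunc double %s to float    ; with that extra use, this is no longer folded to a call to sqrtf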
Bob Wilson 3c68b626e7 Reapply a fixed version of r133285.
This tightens up checking for overflow in alloca sizes, based on feedback
from Duncan and John about the change in r132926.

llvm-svn: 134749
2011-07-08 22:09:33 +00:00
Chad Rosier c76b9d8c2f Revert r133285. Causing odd failures on Dragonegg.
llvm-svn: 133301
2011-06-17 22:08:25 +00:00
Stuart Hastings 23be986a0c Relocate NUW test to cover all binary ops in a dynamic alloca expr.
Followup to 132926.  rdar://problem/9265821

llvm-svn: 133285
2011-06-17 20:21:52 +00:00
Stuart Hastings 351a3f881f Avoid fusing bitcasts with dynamic allocas if the amount-to-allocate
might overflow.  Re-typing the alloca to a larger type (e.g. double)
hoists a shift into the alloca, potentially exposing overflow in the
expression.  rdar://problem/9265821

llvm-svn: 132926
2011-06-13 18:48:49 +00:00
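A hedged sketch of the kind of IR this guards against (names illustrative); re-typing the alloca folds the element-size multiply into the allocation, which behaves differently if the original size computation wrapped:

  %bytes = mul i32 %n, 8                  ; may wrap for large %n
  %raw   = alloca i8, i32 %bytes
  %ptr   = bitcast i8* %raw to double*
  ; fused form, only safe when the multiply is known not to overflow (e.g. NUW):
  %ptr2  = alloca double, i32 %n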
Eli Friedman 35211c6091 Final step of instcombine debuginfo; switch a couple more places over to InsertNewInstWith, and use setDebugLoc for the cases which can't be easily handled by the automated mechanisms.
llvm-svn: 132167
2011-05-27 00:19:40 +00:00
Eli Friedman 1754a25977 More instcombine simplifications towards better debug locations.
llvm-svn: 131596
2011-05-18 23:11:30 +00:00
Eli Friedman b9ed18f2cb Use ReplaceInstUsesWith instead of replaceAllUsesWith where appropriate in instcombine.
llvm-svn: 131512
2011-05-18 00:32:01 +00:00
Benjamin Kramer 50a281a871 While SimplifyDemandedBits constant folds this, we can't rely on it here.
It's possible to craft an input that hits the recursion limits in a way
that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits
can infer which bits are zero.

No test case as it depends on too many other things. Fixes PR9609.

llvm-svn: 128777
2011-04-02 18:50:58 +00:00
Benjamin Kramer 8b94c295c3 Fix comment.
llvm-svn: 128745
2011-04-01 22:29:18 +00:00
Benjamin Kramer 5cad45307e Tweaks to the icmp+sext-to-shifts optimization to address Frits' comments:
- Localize the check for whether an icmp has one use to a place where we know we're
  introducing something that's likely more expensive than a sext from i1.
- Add an assert to make sure a case that would lead to a miscompilation is
  folded away earlier.
- Fix a typo.

llvm-svn: 128744
2011-04-01 22:22:11 +00:00
Benjamin Kramer ac2d5657a6 Fix build.
llvm-svn: 128733
2011-04-01 20:15:16 +00:00
Benjamin Kramer d121765e64 InstCombine: Turn icmp + sext into bitwise/integer ops when the input has only one unknown bit.
int test1(unsigned x) { return (x&8) ? 0 : -1; }
int test3(unsigned x) { return (x&8) ? -1 : 0; }

before (x86_64):
_test1:
	andl	$8, %edi
	cmpl	$1, %edi
	sbbl	%eax, %eax
	ret
_test3:
	andl	$8, %edi
	cmpl	$1, %edi
	sbbl	%eax, %eax
	notl	%eax
	ret

after:
_test1:
	shrl	$3, %edi
	andl	$1, %edi
	leal	-1(%rdi), %eax
	ret
_test3:
	shll	$28, %edi
	movl	%edi, %eax
	sarl	$31, %eax
	ret

llvm-svn: 128732
2011-04-01 20:09:10 +00:00
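At the IR level the transform is roughly the following sketch (names illustrative):

  ; (x & 8) ? 0 : -1
  %a = and i32 %x, 8
  %c = icmp eq i32 %a, 0
  %r = sext i1 %c to i32
  ; with only one unknown bit, the compare can become shift/mask/add:
  %s  = lshr i32 %x, 3
  %b  = and i32 %s, 1
  %r2 = add i32 %b, -1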
Benjamin Kramer 398b8c5faf InstCombine: Move (sext icmp) transforms into their own method. No intended functionality change.
llvm-svn: 128731
2011-04-01 20:09:03 +00:00
Jay Foad 52131344a2 Remove PHINode::reserveOperandSpace(). Instead, add a parameter to
PHINode::Create() giving the (known or expected) number of operands.

llvm-svn: 128537
2011-03-30 11:28:46 +00:00
Jay Foad e0938d8a87 (Almost) always call reserveOperandSpace() on newly created PHINodes.
llvm-svn: 128535
2011-03-30 11:19:20 +00:00
Devang Patel fbb482b314 llvm.dbg.declare intrinsic does not use any llvm::Values. It's magic!
llvm-svn: 127282
2011-03-08 22:12:11 +00:00
Chris Lattner 69229316aa convert ConstantVector::get to use ArrayRef.
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Chris Lattner 34442e6ebf revert my ConstantVector patch; it seems to have made the llvm-gcc
builders unhappy.

llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner d9f5b88548 Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.

llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner 9c10d587f6 implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Bill Wendling 5e3605552e Whitespace fixes. No functionality change.
llvm-svn: 122110
2010-12-17 23:27:41 +00:00
Nate Begeman 7aa18bf46a Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
llvm-svn: 122105
2010-12-17 23:12:19 +00:00
Chris Lattner 6e27b3e004 Fix a serious performance regression introduced by r108687 on Linux:
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have
to delete the original sqrt as well.  Not doing so causes us to do
two sqrts when building with -fmath-errno (the default on Linux).

llvm-svn: 113260
2010-09-07 20:01:38 +00:00
Chris Lattner 50df36ac0a for completeness, allow undef also.
llvm-svn: 112351
2010-08-28 03:36:51 +00:00
Chris Lattner d0214f3efe handle the constant case of vector insertion. For something
like this:

struct S { float A, B, C, D; };

struct S g;
struct S bar() { 
  struct S A = g;
  ++A.B;
  A.A = 42;
  return A;
}

we now generate:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	pshufd	$16, %xmm2, %xmm2
	movss	LCPI1_1(%rip), %xmm0
	pshufd	$16, %xmm0, %xmm0
	unpcklps	%xmm2, %xmm0
	ret

instead of:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	movd	%xmm2, %eax
	shlq	$32, %rax
	addq	$1109917696, %rax       ## imm = 0x42280000
	movd	%rax, %xmm0
	ret

llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner dd6601048e optimize bitcasts from large integers to vectors into vector
element insertions from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 ABI.  We now compile something like
this:

struct S { float A, B, C, D; };
struct S g;
struct S bar() { 
  struct S A = g;
  ++A.A;
  ++A.C;
  return A;
}

into all nice vector operations:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	12(%rax), %xmm3
	pshufd	$16, %xmm2, %xmm2
	unpcklps	%xmm2, %xmm0
	addss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	pshufd	$16, %xmm3, %xmm2
	unpcklps	%xmm2, %xmm1
	ret

instead of icky integer operations:

_bar:                                   ## @bar
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %ecx
	movl	4(%rax), %edx
	movl	12(%rax), %esi
	shlq	$32, %rdx
	addq	%rcx, %rdx
	movd	%rdx, %xmm0
	addss	8(%rax), %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rsi
	addq	%rax, %rsi
	movd	%rsi, %xmm1
	ret

This resolves rdar://8360454

llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Chris Lattner 18d7fc8fc6 Implement a pretty general logical shift propagation
framework, which is good at ripping through bitfield
operations.  This generalizes a bunch of the existing
xforms that instcombine does, such as 
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304
2010-08-27 22:24:38 +00:00
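The (x << c) >> c case mentioned above, as a small illustrative IR sketch:

  %t = shl i32 %x, 24
  %r = lshr i32 %t, 24
  ; both shifts are logical, so this is just a mask of the low 8 bits:
  %r2 = and i32 %x, 255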
Chris Lattner 7398434675 teach the truncation optimization that an entire chain of
computation can be truncated if it is fed by a sext/zext whose type
doesn't have to exactly match the truncation result type.

llvm-svn: 112285
2010-08-27 20:32:06 +00:00
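An illustrative sketch of such a chain; the zext feeding it is wider than the truncation result, yet the whole computation can still be narrowed:

  %e = zext i16 %x to i64
  %s = shl i64 %e, 2
  %t = trunc i64 %s to i32
  ; can be rewritten entirely in i32:
  %e2 = zext i16 %x to i32
  %t2 = shl i32 %e2, 2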
Chris Lattner 90cd746e63 Add an instcombine to clean up a common pattern produced
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening, now clang is able to compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm2
	addss	%xmm0, %xmm2
	movdqa	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	pshufd	$1, %xmm1, %xmm0
	addss	%xmm3, %xmm0
	ret

on x86-64, instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

This seems pretty close to optimal to me, at least without
using horizontal adds.  This also triggers in lots of other
code, including SPEC.

llvm-svn: 112278
2010-08-27 18:31:05 +00:00
Chris Lattner bfd2228182 optimize "integer extraction out of the middle of a vector" as produced
by SRoA.  This is part of rdar://7892780, but needs another xform to
expose this.

llvm-svn: 112232
2010-08-26 22:14:59 +00:00
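The pattern in question looks roughly like this (an illustrative sketch, assuming little-endian lane numbering):

  %i = bitcast <4 x i32> %v to i128
  %s = lshr i128 %i, 32
  %t = trunc i128 %s to i32
  ; recognizable as a plain element extract:
  %t2 = extractelement <4 x i32> %v, i32 1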
Chris Lattner d4ebd6df5a optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x'
is a vector to be a vector element extraction.  This allows clang to
compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	movd	%eax, %xmm0
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movd	%xmm1, %rax
	movd	%eax, %xmm1
	addss	%xmm2, %xmm1
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm1, %xmm0
	ret

... eliminating half of the horribleness.

llvm-svn: 112227
2010-08-26 21:55:42 +00:00
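The bitcast(trunc(bitcast(x))) pattern named above, as an illustrative IR sketch (little-endian lane numbering assumed):

  %i = bitcast <4 x float> %v to i128
  %t = trunc i128 %i to i32
  %f = bitcast i32 %t to float
  ; becomes a vector element extraction:
  %f2 = extractelement <4 x float> %v, i32 0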
Owen Anderson 84774eda4b Tweak per Chris' comments.
llvm-svn: 108736
2010-07-19 19:23:32 +00:00
Owen Anderson 32a58342ed Reimplement r108639 in InstCombine rather than DAGCombine.
llvm-svn: 108687
2010-07-19 08:09:34 +00:00
Dan Gohman 05a6555acb Fix instcombine's handling of alloca to accept non-i32 types.
llvm-svn: 104935
2010-05-28 04:33:04 +00:00
Dan Gohman a4abd035ea Fix a missing newline in debug output.
llvm-svn: 104644
2010-05-25 21:50:35 +00:00
Chris Lattner 02b0df5338 Teach instcombine to transform a bitcast/(zext|trunc)/bitcast sequence
with a vector input and output into a shuffle vector.  This sort of 
sequence happens when the input code stores with one type and reloads
with another type and then SROA promotes to i96 integers, which make
everyone sad.

This fixes rdar://7896024

llvm-svn: 103354
2010-05-08 21:50:26 +00:00
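An illustrative sketch of such a sequence (little-endian assumed; the i96 width mirrors the SROA case mentioned above):

  %i = bitcast <4 x i32> %v to i128
  %t = trunc i128 %i to i96
  %r = bitcast i96 %t to <3 x i32>
  ; expressible as a shuffle instead of going through the wide integer:
  %r2 = shufflevector <4 x i32> %v, <4 x i32> undef, <3 x i32> <i32 0, i32 1, i32 2>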
Dan Gohman eb7111b98f Say bitcast instead of bitconvert.
llvm-svn: 100720
2010-04-07 23:22:42 +00:00
Duncan Sands 19d0b47b1f There are two ways of checking for a given type, for example isa<PointerType>(T)
and T->isPointerTy().  Convert most instances of the first form to the second form.
Requested by Chris.

llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Duncan Sands 9dff9bec31 Uniformize the names of type predicates: rather than having isFloatTy and
isInteger, we now have isFloatTy and isIntegerTy.  Requested by Chris!

llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Chris Lattner 4e8137d678 Rename ValueRequiresCast to ShouldOptimizeCast, to better reflect
what it does.  Enhance it to return false for vector sign
extensions from vector comparisons, which is the idiom used
to get a splatted vector for a vector comparison.

Doing this breaks vector-casts.ll, add some compensating 
transformations to handle the important case they cover without
depending on this canonicalization.

This fixes rdar://7434900 a serious pessimization of vector compares.

llvm-svn: 95855
2010-02-11 06:26:33 +00:00
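The splat idiom referred to above, for reference (illustrative):

  %c = icmp slt <4 x i32> %a, %b
  %m = sext <4 x i1> %c to <4 x i32>   ; per-lane all-ones / all-zeros mask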
Dan Gohman 949458d014 LangRef.html says that inttoptr and ptrtoint always use zero-extension
when the cast is extending.

llvm-svn: 95046
2010-02-02 01:44:02 +00:00
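Illustrative examples of the extending cases (assuming 64-bit pointers):

  %p = inttoptr i32 %x to i8*       ; %x is zero-extended to pointer width
  %y = ptrtoint i8* %q to i128      ; the pointer value is zero-extended to i128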
Chris Lattner 1b35bbe813 change the canonical form of "cond ? -1 : 0" to be
"sext cond" instead of a select.  This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on ppc.

llvm-svn: 94339
2010-01-24 00:09:49 +00:00
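The old and new canonical forms, as a small sketch (names illustrative):

  %cmp = icmp slt i32 %x, 0
  %old = select i1 %cmp, i32 -1, i32 0
  ; canonical form after this change:
  %new = sext i1 %cmp to i32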
Chris Lattner 43f2fa6201 my instcombine transformations to make extension elimination more
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x));
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.

llvm-svn: 93787
2010-01-18 22:19:16 +00:00
Chris Lattner d1a3efedd8 reenable the piece that turns trunc(zext(x)) -> x even if zext has multiple uses;
codegen has no apparent problem with the trunc version of this, because it turns
into a simple subreg idiom

llvm-svn: 93202
2010-01-11 22:49:40 +00:00
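The re-enabled fold, sketched (illustrative):

  %z = zext i16 %x to i32
  %t = trunc i32 %z to i16   ; folds to %x even when %z has other uses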
Chris Lattner a6b1356cf9 Disable folding sext(trunc(x)) -> x (and other similar cast/cast cases) when the
trunc has multiple uses.  Codegen is not able to coalesce the subreg case 
correctly and so this leads to higher register pressure and spilling (see PR5997).

This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.

llvm-svn: 93200
2010-01-11 22:45:25 +00:00