Jay Foad
d1b7849d49
Convert GetElementPtrInst to use ArrayRef.
...
llvm-svn: 135904
2011-07-25 09:48:08 +00:00
Jay Foad
040dd82f44
Convert IRBuilder::CreateGEP and IRBuilder::CreateInBoundsGEP to use
...
ArrayRef.
llvm-svn: 135761
2011-07-22 08:16:57 +00:00
Eli Friedman
911e12f505
Clean up includes of llvm/Analysis/ConstantFolding.h so it's included where it's used and not included where it isn't.
...
llvm-svn: 135628
2011-07-20 21:57:23 +00:00
Chris Lattner
229907cd11
land David Blaikie's patch to de-constify Type, with a few tweaks.
...
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Evan Cheng
b94674b325
It's not safe to fold (fptrunc (sqrt (fpext x))) to (sqrtf x) if there is another use of sqrt. rdar://9763193
...
llvm-svn: 135058
2011-07-13 19:08:16 +00:00
Bob Wilson
3c68b626e7
Reapply a fixed version of r133285.
...
This tightens up checking for overflow in alloca sizes, based on feedback
from Duncan and John about the change in r132926.
llvm-svn: 134749
2011-07-08 22:09:33 +00:00
Chad Rosier
c76b9d8c2f
Revert r133285. Causing odd failures on Dragonegg.
...
llvm-svn: 133301
2011-06-17 22:08:25 +00:00
Stuart Hastings
23be986a0c
Relocate NUW test to cover all binary ops in a dynamic alloca expr.
...
Followup to 132926. rdar://problem/9265821
llvm-svn: 133285
2011-06-17 20:21:52 +00:00
Stuart Hastings
351a3f881f
Avoid fusing bitcasts with dynamic allocas if the amount-to-allocate
...
might overflow. Re-typing the alloca to a larger type (e.g. double)
hoists a shift into the alloca, potentially exposing overflow in the
expression. rdar://problem/9265821
llvm-svn: 132926
2011-06-13 18:48:49 +00:00
Eli Friedman
35211c6091
Final step of instcombine debuginfo; switch a couple more places over to InsertNewInstWith, and use setDebugLoc for the cases which can't be easily handled by the automated mechanisms.
...
llvm-svn: 132167
2011-05-27 00:19:40 +00:00
Eli Friedman
1754a25977
More instcombine simplifications towards better debug locations.
...
llvm-svn: 131596
2011-05-18 23:11:30 +00:00
Eli Friedman
b9ed18f2cb
Use ReplaceInstUsesWith instead of replaceAllUsesWith where appropriate in instcombine.
...
llvm-svn: 131512
2011-05-18 00:32:01 +00:00
Benjamin Kramer
50a281a871
While SimplifyDemandedBits constant folds this, we can't rely on it here.
...
It's possible to craft an input that hits the recursion limits in a way
that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits
can infer which bits are zero.
No test case as it depends on too many other things. Fixes PR9609.
llvm-svn: 128777
2011-04-02 18:50:58 +00:00
Benjamin Kramer
8b94c295c3
Fix comment.
...
llvm-svn: 128745
2011-04-01 22:29:18 +00:00
Benjamin Kramer
5cad45307e
Tweaks to the icmp+sext-to-shifts optimization to address Frits' comments:
...
- Localize the check if an icmp has one use to a place where we know we're
introducing something that's likely more expensive than a sext from i1.
- Add an assert to make sure a case that would lead to a miscompilation is
folded away earlier.
- Fix a typo.
llvm-svn: 128744
2011-04-01 22:22:11 +00:00
Benjamin Kramer
ac2d5657a6
Fix build.
...
llvm-svn: 128733
2011-04-01 20:15:16 +00:00
Benjamin Kramer
d121765e64
InstCombine: Turn icmp + sext into bitwise/integer ops when the input has only one unknown bit.
...
int test1(unsigned x) { return (x&8) ? 0 : -1; }
int test3(unsigned x) { return (x&8) ? -1 : 0; }
before (x86_64):
_test1:
andl $8, %edi
cmpl $1, %edi
sbbl %eax, %eax
ret
_test3:
andl $8, %edi
cmpl $1, %edi
sbbl %eax, %eax
notl %eax
ret
after:
_test1:
shrl $3, %edi
andl $1, %edi
leal -1(%rdi), %eax
ret
_test3:
shll $28, %edi
movl %edi, %eax
sarl $31, %eax
ret
llvm-svn: 128732
2011-04-01 20:09:10 +00:00
Benjamin Kramer
398b8c5faf
InstCombine: Move (sext icmp) transforms into their own method. No intended functionality change.
...
llvm-svn: 128731
2011-04-01 20:09:03 +00:00
Jay Foad
52131344a2
Remove PHINode::reserveOperandSpace(). Instead, add a parameter to
...
PHINode::Create() giving the (known or expected) number of operands.
llvm-svn: 128537
2011-03-30 11:28:46 +00:00
Jay Foad
e0938d8a87
(Almost) always call reserveOperandSpace() on newly created PHINodes.
...
llvm-svn: 128535
2011-03-30 11:19:20 +00:00
Devang Patel
fbb482b314
llvm.dbg.declare intrinsic does not use any llvm::Values. It's magic!
...
llvm-svn: 127282
2011-03-08 22:12:11 +00:00
Chris Lattner
69229316aa
convert ConstantVector::get to use ArrayRef.
...
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Chris Lattner
34442e6ebf
revert my ConstantVector patch, it seems to have made the llvm-gcc
...
builders unhappy.
llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner
d9f5b88548
Switch ConstantVector::get to use ArrayRef instead of a pointer+size
...
idiom. Change various clients to simplify their code.
llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner
9c10d587f6
implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
...
This fixes rdar://8808586 which observed that we used to compile:
union xy {
struct x { _Bool b[15]; } x;
__attribute__((packed))
struct y {
__attribute__((packed)) unsigned long b0to7;
__attribute__((packed)) unsigned int b8to11;
__attribute__((packed)) unsigned short b12to13;
__attribute__((packed)) unsigned char b14;
} y;
};
struct x
foo(union xy *xy)
{
return xy->x;
}
into:
_foo: ## @foo
movq (%rdi), %rax
movabsq $1095216660480, %rcx ## imm = 0xFF00000000
andq %rax, %rcx
movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000
andq %rax, %rdx
movzbl %al, %esi
orq %rdx, %rsi
movq %rax, %rdx
andq $65280, %rdx ## imm = 0xFF00
orq %rsi, %rdx
movq %rax, %rsi
andq $16711680, %rsi ## imm = 0xFF0000
orq %rdx, %rsi
movl %eax, %edx
andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000
orq %rsi, %rdx
orq %rcx, %rdx
movabsq $280375465082880, %rcx ## imm = 0xFF0000000000
movq %rax, %rsi
andq %rcx, %rsi
orq %rdx, %rsi
movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000
andq %r8, %rax
orq %rsi, %rax
movzwl 12(%rdi), %edx
movzbl 14(%rdi), %esi
shlq $16, %rsi
orl %edx, %esi
movq %rsi, %r9
shlq $32, %r9
movl 8(%rdi), %edx
orq %r9, %rdx
andq %rdx, %rcx
movzbl %sil, %esi
shlq $32, %rsi
orq %rcx, %rsi
movl %edx, %ecx
andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000
orq %rsi, %rcx
movq %rdx, %rsi
andq $16711680, %rsi ## imm = 0xFF0000
orq %rcx, %rsi
movq %rdx, %rcx
andq $65280, %rcx ## imm = 0xFF00
orq %rsi, %rcx
movzbl %dl, %esi
orq %rcx, %rsi
andq %r8, %rdx
orq %rsi, %rdx
ret
We now compile this into:
_foo: ## @foo
## BB#0: ## %entry
movzwl 12(%rdi), %eax
movzbl 14(%rdi), %ecx
shlq $16, %rcx
orl %eax, %ecx
shlq $32, %rcx
movl 8(%rdi), %edx
orq %rcx, %rdx
movq (%rdi), %rax
ret
A small improvement :-)
llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Bill Wendling
5e3605552e
Whitespace fixes. No functionality change.
...
llvm-svn: 122110
2010-12-17 23:27:41 +00:00
Nate Begeman
7aa18bf46a
Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
...
llvm-svn: 122105
2010-12-17 23:12:19 +00:00
Chris Lattner
6e27b3e004
Fix a serious performance regression introduced by r108687 on linux:
...
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have
to delete the original sqrt as well. Not doing so causes us to do
two sqrt's when building with -fmath-errno (the default on linux).
llvm-svn: 113260
2010-09-07 20:01:38 +00:00
Chris Lattner
50df36ac0a
for completeness, allow undef also.
...
llvm-svn: 112351
2010-08-28 03:36:51 +00:00
Chris Lattner
d0214f3efe
handle the constant case of vector insertion. For something
...
like this:
struct S { float A, B, C, D; };
struct S g;
struct S bar() {
struct S A = g;
++A.B;
A.A = 42;
return A;
}
we now generate:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss 12(%rax), %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
unpcklps %xmm0, %xmm1
addss LCPI1_0(%rip), %xmm2
pshufd $16, %xmm2, %xmm2
movss LCPI1_1(%rip), %xmm0
pshufd $16, %xmm0, %xmm0
unpcklps %xmm2, %xmm0
ret
instead of:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss 12(%rax), %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
unpcklps %xmm0, %xmm1
addss LCPI1_0(%rip), %xmm2
movd %xmm2, %eax
shlq $32, %rax
addq $1109917696, %rax ## imm = 0x42280000
movd %rax, %xmm0
ret
llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
dd6601048e
optimize bitcasts from large integers to vector into vector
...
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 abi. We now compile something like
this:
struct S { float A, B, C, D; };
struct S g;
struct S bar() {
struct S A = g;
++A.A;
++A.C;
return A;
}
into all nice vector operations:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss LCPI1_0(%rip), %xmm1
movss (%rax), %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 12(%rax), %xmm3
pshufd $16, %xmm2, %xmm2
unpcklps %xmm2, %xmm0
addss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
pshufd $16, %xmm3, %xmm2
unpcklps %xmm2, %xmm1
ret
instead of icky integer operations:
_bar: ## @bar
movq _g@GOTPCREL(%rip), %rax
movss LCPI1_0(%rip), %xmm1
movss (%rax), %xmm0
addss %xmm1, %xmm0
movd %xmm0, %ecx
movl 4(%rax), %edx
movl 12(%rax), %esi
shlq $32, %rdx
addq %rcx, %rdx
movd %rdx, %xmm0
addss 8(%rax), %xmm1
movd %xmm1, %eax
shlq $32, %rsi
addq %rax, %rsi
movd %rsi, %xmm1
ret
This resolves rdar://8360454
llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Chris Lattner
18d7fc8fc6
Implement a pretty general logical shift propagation
...
framework, which is good at ripping through bitfield
operations. This generalize a bunch of the existing
xforms that instcombine does, such as
(x << c) >> c -> and
to handle intermediate logical nodes. This is useful for
ripping up the "promote to large integer" code produced by
SRoA.
llvm-svn: 112304
2010-08-27 22:24:38 +00:00
Chris Lattner
7398434675
teach the truncation optimization that an entire chain of
...
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.
llvm-svn: 112285
2010-08-27 20:32:06 +00:00
Chris Lattner
90cd746e63
Add an instcombine to clean up a common pattern produced
...
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:
%94 = zext i16 %93 to i32 ; <i32> [#uses=2]
%96 = lshr i32 %94, 8 ; <i32> [#uses=1]
%101 = trunc i32 %96 to i8 ; <i8> [#uses=1]
This also unblocks other xforms from happening, now clang is able to compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm2
addss %xmm0, %xmm2
movdqa %xmm1, %xmm3
addss %xmm2, %xmm3
pshufd $1, %xmm1, %xmm0
addss %xmm3, %xmm0
ret
on x86-64, instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
This seems pretty close to optimal to me, at least without
using horizontal adds. This also triggers in lots of other
code, including SPEC.
llvm-svn: 112278
2010-08-27 18:31:05 +00:00
Chris Lattner
bfd2228182
optimize "integer extraction out of the middle of a vector" as produced
...
by SRoA. This is part of rdar://7892780, but needs another xform to
expose this.
llvm-svn: 112232
2010-08-26 22:14:59 +00:00
Chris Lattner
d4ebd6df5a
optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x'
...
is a vector to be a vector element extraction. This allows clang to
compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
movd %eax, %xmm0
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movd %xmm1, %rax
movd %eax, %xmm1
addss %xmm2, %xmm1
shrq $32, %rax
movd %eax, %xmm0
addss %xmm1, %xmm0
ret
... eliminating half of the horribleness.
llvm-svn: 112227
2010-08-26 21:55:42 +00:00
Owen Anderson
84774eda4b
Tweak per Chris' comments.
...
llvm-svn: 108736
2010-07-19 19:23:32 +00:00
Owen Anderson
32a58342ed
Reimplement r108639 in InstCombine rather than DAGCombine.
...
llvm-svn: 108687
2010-07-19 08:09:34 +00:00
Dan Gohman
05a6555acb
Fix instcombine's handling of alloca to accept non-i32 types.
...
llvm-svn: 104935
2010-05-28 04:33:04 +00:00
Dan Gohman
a4abd035ea
Fix a missing newline in debug output.
...
llvm-svn: 104644
2010-05-25 21:50:35 +00:00
Chris Lattner
02b0df5338
Teach instcombine to transform a bitcast/(zext|trunc)/bitcast sequence
...
with a vector input and output into a shuffle vector. This sort of
sequence happens when the input code stores with one type and reloads
with another type and then SROA promotes to i96 integers, which make
everyone sad.
This fixes rdar://7896024
llvm-svn: 103354
2010-05-08 21:50:26 +00:00
Dan Gohman
eb7111b98f
Say bitcast instead of bitconvert.
...
llvm-svn: 100720
2010-04-07 23:22:42 +00:00
Duncan Sands
19d0b47b1f
There are two ways of checking for a given type, for example isa<PointerType>(T)
...
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Duncan Sands
9dff9bec31
Uniformize the names of type predicates: rather than having isFloatTy and
...
isInteger, we now have isFloatTy and isIntegerTy. Requested by Chris!
llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Chris Lattner
4e8137d678
Rename ValueRequiresCast to ShouldOptimizeCast, to better reflect
...
what it does. Enhance it to return false to optimizing vector
sign extensions from vector comparisions, which is the idiom used
to get a splatted vector for a vector comparison.
Doing this breaks vector-casts.ll, add some compensating
transformations to handle the important case they cover without
depending on this canonicalization.
This fixes rdar://7434900 a serious pessimization of vector compares.
llvm-svn: 95855
2010-02-11 06:26:33 +00:00
Dan Gohman
949458d014
LangRef.html says that inttoptr and ptrtoint always use zero-extension
...
when the cast is extending.
llvm-svn: 95046
2010-02-02 01:44:02 +00:00
Chris Lattner
1b35bbe813
change the canonical form of "cond ? -1 : 0" to be
...
"sext cond" instead of a select. This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on ppc.
llvm-svn: 94339
2010-01-24 00:09:49 +00:00
Chris Lattner
43f2fa6201
my instcombine transformations to make extension elimination more
...
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.
llvm-svn: 93787
2010-01-18 22:19:16 +00:00
Chris Lattner
d1a3efedd8
reenable the piece that turns trunc(zext(x)) -> x even if zext has multiple uses,
...
codegen has no apparent problem with the trunc version of this, because it turns
into a simple subreg idiom
llvm-svn: 93202
2010-01-11 22:49:40 +00:00
Chris Lattner
a6b1356cf9
Disable folding sext(trunc(x)) -> x (and other similar cast/cast cases) when the
...
trunc has multiple uses. Codegen is not able to coalesce the subreg case
correctly and so this leads to higher register pressure and spilling (see PR5997).
This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.
llvm-svn: 93200
2010-01-11 22:45:25 +00:00