create a testcase where this matters. The select+load transformation only
occurs when isSafeToLoadUnconditionally is true, and in those situations,
instcombine also changes the underlying objects to be aligned. This seems
like a good idea regardless, and I've verified that it doesn't pessimize
the subsequent realignment.
llvm-svn: 94850
(via APInt &RHSKnownZero = KnownZero, etc) seems dangerous and confusing to me: it
is easy not to notice this, and then wonder why KnownZero/RHSKnownZero changed
underneath you when you modified RHSKnownZero/KnownZero etc. So get rid of this.
No intended functionality change (tested with "make check" + llvm-gcc bootstrap).
llvm-svn: 94802
"sext cond" instead of a select. This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on ppc.
llvm-svn: 94339
missing ones are libsupport, libsystem and libvmcore. libvmcore is
currently blocked on bugpoint, which uses EH. Once it stops using
EH, we can switch it off.
This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.
llvm-svn: 94164
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.
llvm-svn: 93787
added to the FSub version. However, the original version of this xform guarded
against doing this for floating point (!Op0->getType()->isFPOrFPVector()).
This is causing LLVM to perform incorrect xforms for code like:
void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl){
double mh, ml;
double c = 134217729.0;
double up, u1, u2, vp, v1, v2;
up = xh*c;
u1 = (xh - up) + up;
u2 = xh - u1;
vp = yh*c;
v1 = (yh - vp) + vp;
v2 = yh - v1;
mh = xh*yh;
ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
ml += xh*yl + xl*yh;
*rhi = mh + ml;
*rlo = (mh - (*rhi)) + ml;
}
The last line was optimized away, but rl is intended to be the difference
between the infinitely precise result of mh + ml and after it has been rounded
to double precision.
llvm-svn: 93369
trunc has multiple uses. Codegen is not able to coalesce the subreg case
correctly and so this leads to higher register pressure and spilling (see PR5997).
This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.
llvm-svn: 93200
BitsToClear case. This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216)
and test3 (random extample extracted from a spec benchmark).
clang now compiles the code in PR4216 into:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
orl $32768, %edi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
instead of:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
shrl $8, %edi
orl $128, %edi
shlq $8, %rdi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
which is still not great, but is progress.
llvm-svn: 93145
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant. This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.
llvm-svn: 93144
the zext dest type. This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:
_test_bitfield: ## @test_bitfield
orl $32962, %edi
movl %edi, %eax
andl $-25350, %eax
ret
This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.
llvm-svn: 93127
elimination of a sign extend to be a win, which simplifies
the client of CanEvaluateSExtd, and allows us to eliminate
more casts (examples taken from real code).
llvm-svn: 93109
lshr+ashr instead of trunc+sext. We want to avoid type
conversions whenever possible, it is easier to codegen expressions
without truncates and extensions.
llvm-svn: 93107
1) don't try to optimize a sext or zext that is only used by a trunc, let
the trunc get optimized first. This avoids some pointless effort in
some common cases since instcombine scans down a block in the first pass.
2) Change the cost model for zext elimination to consider an 'and' cheaper
than a zext. This allows us to do it more aggressively, and for the next
patch to simplify the code quite a bit.
llvm-svn: 93097
commonIntCastTransforms into the callers, eliminating a switch,
and allowing the static predicate methods to be moved down to
live next to the corresponding function. No functionality
change.
llvm-svn: 93089
Previously, instcombine would only promote an expression tree to
the larger type if doing so eliminated two casts. This is because
a need to manually do the sign extend after the promoted expression
tree with two shifts. Now, we keep track of whether the result of
the computation is going to be properly sign extended already. If
so, we can unconditionally promote the expression, which allows us
to zap more sext's.
This implements rdar://6598839 (aka gcc pr38751)
llvm-svn: 92815
Eliminate the 'AddMaskingAnd' transformation, it is redundant with this
more general code right below it:
// A+B --> A|B iff A and B have no bits set in common.
llvm-svn: 92693
that got instantiated. There is no reason for instcombine
to try this hard for simple associative optimizations. Next
up, eliminate the template completely.
llvm-svn: 92692
when doing this transform if the GEP is not inbounds. No testcase because
it is very difficult to trigger this: instcombine already canonicalizes
GEP indices to pointer size, so it relies specific permutations of the
instcombine worklist.
Thanks to Duncan for pointing this possible problem out.
llvm-svn: 92495