Commit Graph

94 Commits

Author SHA1 Message Date
Duncan Sands 533c8ae79f Transform code like this
%V = mul i64 %N, 4
 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
 %t1 = getelementptr i32* %arr, i32 %N
 %t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
  int foo(int *A, int N) {
    return A[N];
  }
because gcc turns this into byte pointer arithmetic before it hits the plugin:
  D.1590_2 = (long unsigned int) N_1(D);
  D.1591_3 = D.1590_2 * 4;
  D.1592_5 = A_4(D) + D.1591_3;
  D.1589_6 = *D.1592_5;
  return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.

An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.

llvm-svn: 166474
2012-10-23 08:28:26 +00:00
Eli Friedman e0a64d83fc Fix a minor logic mistake transforming compares in instcombine. PR12514.
llvm-svn: 156600
2012-05-11 01:32:59 +00:00
Jakob Stoklund Olesen 43bcb970e5 Reapply r155136 after fixing PR12599.
Original commit message:

Defer some shl transforms to DAGCombine.

The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155362
2012-04-23 17:39:52 +00:00
Jakob Stoklund Olesen 205ee3b389 Revert r155136 "Defer some shl transforms to DAGCombine."
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.

I'll put the patch back in after fixing the SelectionDAG problem.

llvm-svn: 155181
2012-04-20 00:38:45 +00:00
Jakob Stoklund Olesen 6b6c81e6b2 Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155136
2012-04-19 16:46:26 +00:00
Nadav Rotem 5fc81ffbac Fixes following the CR by Chris and Duncan:
Optimize chained bitcasts of the form A->B->A.
Undo r138722 and change isEliminableCastPair to allow this case.

llvm-svn: 138756
2011-08-29 19:58:36 +00:00
Nadav Rotem 52600ee8c3 Bitcasts are transitive. Bitcast-Bitcast-X becomes Bitcast-X.
llvm-svn: 138722
2011-08-28 11:51:08 +00:00
Chris Lattner b90ed2233c manually upgrade a bunch of tests to modern syntax, and remove some that
are either unreduced or only test old syntax.

llvm-svn: 133228
2011-06-17 03:14:27 +00:00
Benjamin Kramer 35159c114c Simplify code. No functionality changes, name changes aside.
llvm-svn: 132896
2011-06-12 22:47:53 +00:00
Chris Lattner 6b657aed33 Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267
2011-02-10 05:36:31 +00:00
Owen Anderson c237a849e3 Re-apply r113679, which was reverted in r113720, which added a paid of new instcombine transforms
to expose greater opportunities for store narrowing in codegen.  This patch fixes a potential
infinite loop in instcombine caused by one of the introduced transforms being overly aggressive.

llvm-svn: 113763
2010-09-13 17:59:27 +00:00
Eric Christopher 26abd3e0c2 Revert 113679, it was causing an infinite loop in a testcase that I've sent
on to Owen.

llvm-svn: 113720
2010-09-12 06:09:23 +00:00
Owen Anderson 70f4524427 Invert and-of-or into or-of-and when doing so would allow us to clear bits of the and's mask.
This can result in increased opportunities for store narrowing in code generation.  Update a number of
tests for this change.  This fixes <rdar://problem/8285027>.

Additionally, because this inverts the order of ors and ands, some patterns for optimizing or-of-and-of-or
no longer fire in instances where they did originally.  Add a simple transform which recaptures most of these
opportunities: if we have an or-of-constant-or and have failed to fold away the inner or, commute the order 
of the two ors, to give the non-constant or a chance for simplification instead.

llvm-svn: 113679
2010-09-11 05:48:06 +00:00
Chris Lattner 25eea4db66 fix PR7311 by avoiding breaking casts when a bitcast from scalar->vector
is involved.

llvm-svn: 108117
2010-07-12 01:19:22 +00:00
Chris Lattner 02b0df5338 Teach instcombine to transform a bitcast/(zext|trunc)/bitcast sequence
with a vector input and output into a shuffle vector.  This sort of 
sequence happens when the input code stores with one type and reloads
with another type and then SROA promotes to i96 integers, which make
everyone sad.

This fixes rdar://7896024

llvm-svn: 103354
2010-05-08 21:50:26 +00:00
Chris Lattner 2d2c0a00ad disable this testcase, PR5997
llvm-svn: 93206
2010-01-11 23:18:33 +00:00
Chris Lattner 0a85420409 Extend CanEvaluateZExtd to handle and/or/xor more aggressively in the
BitsToClear case.  This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216) 
and test3 (random extample extracted from a spec benchmark).

clang now compiles the code in PR4216 into:

_test_bitfield:                                             ## @test_bitfield
	movl	%edi, %eax
	orl	$194, %eax
	movl	$4294902010, %ecx
	andq	%rax, %rcx
	orl	$32768, %edi
	andq	$39936, %rdi
	movq	%rdi, %rax
	orq	%rcx, %rax
	ret

instead of:

_test_bitfield:                                             ## @test_bitfield
	movl	%edi, %eax
	orl	$194, %eax
	movl	$4294902010, %ecx
	andq	%rax, %rcx
	shrl	$8, %edi
	orl	$128, %edi
	shlq	$8, %rdi
	andq	$39936, %rdi
	movq	%rdi, %rax
	orq	%rcx, %rax
	ret

which is still not great, but is progress.

llvm-svn: 93145
2010-01-11 04:05:13 +00:00
Chris Lattner 12bd8992b3 Remove the dead TD argument to CanEvaluateZExtd, and add a
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant.  This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.

llvm-svn: 93144
2010-01-11 03:32:00 +00:00
Chris Lattner 7dd540ee24 teach sext optimization to handle truncs from types that are not
the dest of the sext.

llvm-svn: 93128
2010-01-10 20:30:41 +00:00
Chris Lattner 39d2daa94c teach zext optimization how to deal with truncs that don't come from
the zext dest type.  This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:

_test_bitfield:                                             ## @test_bitfield
	orl	$32962, %edi
	movl	%edi, %eax
	andl	$-25350, %eax
	ret

This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.

llvm-svn: 93127
2010-01-10 20:25:54 +00:00
Chris Lattner 2fff10c424 now that the cost model has changed, we can always consider
elimination of a sign extend to be a win, which simplifies 
the client of CanEvaluateSExtd, and allows us to eliminate
more casts (examples taken from real code).

llvm-svn: 93109
2010-01-10 07:40:50 +00:00
Chris Lattner 49d2c9764d two changes:
1) don't try to optimize a sext or zext that is only used by a trunc, let
   the trunc get optimized first.  This avoids some pointless effort in
   some common cases since instcombine scans down a block in the first pass.
2) Change the cost model for zext elimination to consider an 'and' cheaper
   than a zext.  This allows us to do it more aggressively, and for the next
   patch to simplify the code quite a bit.

llvm-svn: 93097
2010-01-10 02:39:31 +00:00
Chris Lattner f0af17dab3 enhance CanEvaluateZExtd to handle shift left and sext, allowing
more expressions to be promoted and casts eliminated.

llvm-svn: 93096
2010-01-10 02:22:12 +00:00
Chris Lattner 3057c37959 Enhance instcombine to reason more strongly about promoting computation
that feeds into a zext, similar to the patch I did yesterday for sext.
There is a lot of room for extension beyond this patch.

llvm-svn: 92962
2010-01-07 23:41:00 +00:00
Chris Lattner 98748c0964 Teach instcombine's sext elimination logic to be more aggressive.
Previously, instcombine would only promote an expression tree to
the larger type if doing so eliminated two casts.  This is because
a need to manually do the sign extend after the promoted expression
tree with two shifts.  Now, we keep track of whether the result of
the computation is going to be properly sign extended already.  If
so, we can unconditionally promote the expression, which allows us
to zap more sext's.

This implements rdar://6598839 (aka gcc pr38751)

llvm-svn: 92815
2010-01-06 01:56:21 +00:00
Chris Lattner a93c63c22d more rearrangement and cleanup, fix my test failure.
llvm-svn: 92792
2010-01-05 22:21:18 +00:00
Chris Lattner 7c0c64831c merge some tests.
llvm-svn: 92786
2010-01-05 21:54:09 +00:00
Chris Lattner dec34d8cef merge cast2 into cast.ll
llvm-svn: 92784
2010-01-05 21:48:13 +00:00
Dan Gohman 580b80d6d9 Make ConstantFoldConstantExpression recursively visit the entire
ConstantExpr, not just the top-level operator. This allows it to
fold many more constants.

Also, make GlobalOpt call ConstantFoldConstantExpression on
GlobalVariable initializers.

llvm-svn: 89659
2009-11-23 16:22:21 +00:00
Chris Lattner 1559bedcc7 unify the code that determines whether it is a good idea to change the type
of a computation.  This fixes some infinite loops when dealing with TD that
has no native types.

llvm-svn: 86670
2009-11-10 07:23:37 +00:00
Kenneth Uildriks 90fedc6ef9 Make opt default to not adding a target data string and update tests that depend on target data to supply it within the test
llvm-svn: 85900
2009-11-03 15:29:06 +00:00
Victor Hernandez c7d6a8327c Autoupgrade malloc insts to malloc calls.
Update testcases that rely on malloc insts being present.

Also prematurely remove MallocInst handling from IndMemRemoval and RaiseAllocations to help pass tests in this incremental step.

llvm-svn: 84292
2009-10-17 00:00:19 +00:00
Edward O'Callaghan cbf75a5dc3 Convert the rest of the InstCombine tests from notcast to FileCheck.
llvm-svn: 83828
2009-10-12 07:18:14 +00:00
Victor Hernandez e6ff7662b6 Revert 82694 "Auto-upgrade malloc instructions to malloc calls." because it causes regressions in the nightly tests.
llvm-svn: 82784
2009-09-25 18:11:52 +00:00
Victor Hernandez 46cd467310 Auto-upgrade malloc instructions to malloc calls.
Reviewed by Devang Patel.

llvm-svn: 82694
2009-09-24 17:47:49 +00:00
Dan Gohman 1880092722 Change tests from "opt %s" to "opt < %s" so that opt doesn't see the
input filename so that opt doesn't print the input filename in the
output so that grep lines in the tests don't unintentionally match
strings in the input filename.

llvm-svn: 81537
2009-09-11 18:01:28 +00:00
Dan Gohman 72a13d2476 Use opt -S instead of piping bitcode output through llvm-dis.
llvm-svn: 81257
2009-09-08 22:34:10 +00:00
Dan Gohman 9737a63ed8 Change these tests to feed the assembly files to opt directly, instead
of using llvm-as, now that opt supports this.

llvm-svn: 81226
2009-09-08 16:50:01 +00:00
Evan Cheng beac6f8b0c Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type.
llvm-svn: 62297
2009-01-16 02:11:43 +00:00
Chris Lattner f50aa6ae5c Implement rdar://6480391, extending of equality icmp's to avoid a truncation.
I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patters like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985
2009-01-09 07:47:06 +00:00
Tanya Lattner 8bf97c2324 Byebye llvm-upgrade!
llvm-svn: 48762
2008-03-25 04:26:08 +00:00
Reid Spencer ede8c3b92c For PR1319:
Make use of the END. facility on all files > 1K so that we aren't wasting CPU
cycles searching for RUN: lines that we'll never find.

llvm-svn: 36059
2007-04-15 07:38:21 +00:00
Reid Spencer 91948d4cad For PR1319:
Upgrade tests to work with new llvm.exp version of llvm_runtest.

llvm-svn: 36013
2007-04-14 20:13:02 +00:00
Reid Spencer 83b3d82672 Regression is gone, don't try to find it on clean target.
llvm-svn: 33296
2007-01-17 07:59:14 +00:00