Commit Graph

4203 Commits

Author SHA1 Message Date
Bob Wilson 70c8fe5e4e Remove check for an impossible condition: the condition of the while loop has
already checked that TmpBB->getSinglePredecessor() is non-null.

llvm-svn: 94451
2010-01-25 21:28:05 +00:00
Bob Wilson fc060e4337 Change Value::getUnderlyingObject to have the MaxLookup value specified as a
parameter with a default value, instead of just hardcoding it in the
implementation.  The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.

llvm-svn: 94433
2010-01-25 18:26:54 +00:00
Chris Lattner 823aed16f9 make -fno-rtti the default unless a directory builds with REQUIRES_RTTI.
llvm-svn: 94378
2010-01-24 20:43:08 +00:00
Chris Lattner 29b15c5cfd third bug from PR6119: the xor dupe extension allows
for arbitrary terminators in predecessors, don't assume
it is a conditional or uncond branch.  The testcase shows
an example where they can happen with switches.

llvm-svn: 94323
2010-01-23 19:21:31 +00:00
Chris Lattner ba2d0b89ff add an early out to ProcessBranchOnXOR to speed it up,
handle the case when we can infer an input to the xor
from all inputs that agree, instead of going into an
infinite loop.  Another part of PR6199

llvm-svn: 94321
2010-01-23 19:16:25 +00:00
Chris Lattner de5ab4860f fix a crash in jump threading, PR6119
llvm-svn: 94319
2010-01-23 18:56:07 +00:00
Eric Christopher ba7cd4c393 Reapply 94059 while fixing the calling convention setup
for strcpy.

llvm-svn: 94287
2010-01-23 05:29:06 +00:00
Bob Wilson 6c0c8d41b4 Revert 94059. It is breaking the MultiSource/Benchmarks/Prolangs-C/bison
test on ARM.

llvm-svn: 94198
2010-01-22 19:16:40 +00:00
Chris Lattner 7ba0661f27 Stop building RTTI information for *most* llvm libraries. Notable
missing ones are libsupport, libsystem and libvmcore.  libvmcore is
currently blocked on bugpoint, which uses EH.  Once it stops using
EH, we can switch it off.

This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.

llvm-svn: 94164
2010-01-22 06:49:46 +00:00
Dan Gohman 045f81981a Revert LoopStrengthReduce.cpp to pre-r94061 for now.
llvm-svn: 94123
2010-01-22 00:46:49 +00:00
Victor Hernandez 1df65186d1 DbgInfoIntrinsics no longer appear in an instruction's use list; so clean up looking for them in use iterations and remove OnlyUsedByDbgInfoIntrinsics()
llvm-svn: 94111
2010-01-21 23:05:53 +00:00
Dan Gohman b1ee154b6b When inserting expressions for post-increment users which contain
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.

llvm-svn: 94110
2010-01-21 23:01:22 +00:00
Dan Gohman cb8d577eb2 Include IVUsers information in LSR's debug output.
llvm-svn: 94108
2010-01-21 22:46:32 +00:00
Dan Gohman 29916e023d Prune the search for candidate formulae if the number of register
operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.

llvm-svn: 94107
2010-01-21 22:42:49 +00:00
Dan Gohman c903499ff8 Add a comment.
llvm-svn: 94104
2010-01-21 21:31:09 +00:00
Dan Gohman 51ad99d2c5 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
Eric Christopher fa863258d0 Add strcpy_chk -> strcpy support for "don't know" object size
answers.  This will update as object size checking gets better information.

llvm-svn: 94059
2010-01-21 01:04:38 +00:00
Dan Gohman ca19445d08 When doing address-mode sinking, expand the base register first, rather
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.

llvm-svn: 93935
2010-01-19 22:45:06 +00:00
Bob Wilson 58d59fe394 Fix a crash in scalarrepl for memcpy/memmove where the source and destination
are the same.  I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value.  Radar 7552893.

llvm-svn: 93848
2010-01-19 04:32:48 +00:00
Owen Anderson cdea3572fa Convert some of the dynamic opcode lookups into static ones.
llvm-svn: 93693
2010-01-17 19:33:27 +00:00
Chris Lattner 573da8ac90 1) Use the new SimplifyInstructionsInBlock routine instead of the copy
in JT.

2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go.  This allows us to 
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.

llvm-svn: 93253
2010-01-12 20:41:47 +00:00
Chris Lattner af7855d571 tidy up
llvm-svn: 93222
2010-01-12 02:07:50 +00:00
Chris Lattner eb73bdb2e1 Teach jump threading to duplicate small blocks when the branch
condition is a xor with a phi node.  This eliminates nonsense
like this from 176.gcc in several places:

 LBB166_84:
        testl   %eax, %eax
-       setne   %al
-       xorb    %cl, %al
-       notb    %al
-       testb   $1, %al
-       je      LBB166_85
+       je      LBB166_69
+       jmp     LBB166_85

This is rdar://7391699

llvm-svn: 93221
2010-01-12 02:07:17 +00:00
Chris Lattner 6a19ed0b86 some cleanup, and make it obvious that ProcessJumpOnPHI only works
on branches by renaming it and checking for a branch at the call site.

llvm-svn: 93208
2010-01-11 23:41:09 +00:00
Chris Lattner ab7087ad66 only factor from expressions whose uses are empty and whose
base is the right expression type.  This fixes PR5981.

llvm-svn: 93045
2010-01-09 06:01:36 +00:00
Duncan Sands 4a8b15dc74 Suppress an unused variable warning when assertions are off;
remove some trailing whitespace while there.

llvm-svn: 93008
2010-01-08 17:51:48 +00:00
Benjamin Kramer 76e2766442 Use a do-while loop instead of while + boolean.
llvm-svn: 92912
2010-01-07 13:50:07 +00:00
Eric Christopher 2cdb806fd8 Move the object size intrinsic optimization to inst-combine and make
it work for any integer size return type.

llvm-svn: 92853
2010-01-06 20:04:44 +00:00
Mikhail Glushenkov 40d2429b28 Formatting.
llvm-svn: 92831
2010-01-06 09:20:39 +00:00
Benjamin Kramer d2564e3afb Move remaining stuff to the isInteger predicate.
llvm-svn: 92771
2010-01-05 21:05:54 +00:00
Benjamin Kramer a81a6dff0d Convert a ton of simple integer type equality tests to the new predicate.
llvm-svn: 92760
2010-01-05 20:07:06 +00:00
Dan Gohman b5358003fb Set Changed properly after calling DeleteDeadPHIs.
llvm-svn: 92735
2010-01-05 16:31:45 +00:00
Dan Gohman 28943873e6 Use do+while instead of while for loops which obviously have a
non-zero trip count. Use SmallVector's pop_back_val().

llvm-svn: 92734
2010-01-05 16:27:25 +00:00
Chris Lattner f741d72b84 fix an infinite loop in reassociate building emacs.
llvm-svn: 92679
2010-01-05 04:55:35 +00:00
David Greene 241992382e Change errs() to dbgs().
llvm-svn: 92624
2010-01-05 01:27:47 +00:00
David Greene e0b9789593 Change errs() to dbgs().
llvm-svn: 92623
2010-01-05 01:27:44 +00:00
David Greene 6bc0776343 Change errs() to dbgs().
llvm-svn: 92622
2010-01-05 01:27:39 +00:00
David Greene 3a79df0993 Change errs() to dbgs().
llvm-svn: 92620
2010-01-05 01:27:33 +00:00
David Greene 0fd862254e Change errs() to dbgs().
llvm-svn: 92619
2010-01-05 01:27:30 +00:00
David Greene d17c3916d0 Change errs() to dbgs().
llvm-svn: 92617
2010-01-05 01:27:24 +00:00
David Greene 9ddc6e2e12 Change errs() to dbgs().
llvm-svn: 92615
2010-01-05 01:27:21 +00:00
David Greene 1efdb45562 Change errs() to dbgs().
llvm-svn: 92614
2010-01-05 01:27:19 +00:00
David Greene 2e6efc441f Change errs() to dbgs().
llvm-svn: 92613
2010-01-05 01:27:17 +00:00
David Greene 389fc3b9f6 Change errs() to dbgs().
llvm-svn: 92612
2010-01-05 01:27:15 +00:00
David Greene 74e2d4917d Change errs() to dbgs().
llvm-svn: 92611
2010-01-05 01:27:11 +00:00
David Greene 48c86bedbd Change errs() to dbgs().
llvm-svn: 92610
2010-01-05 01:27:09 +00:00
David Greene 0dd384cfd0 Change errs() to dbgs().
llvm-svn: 92609
2010-01-05 01:27:06 +00:00
David Greene d9c355d590 Change errs() to dbgs().
llvm-svn: 92608
2010-01-05 01:27:04 +00:00
Devang Patel be94f23992 Remove dead debug info intrinsics.
Intrinsic::dbg_stoppoint
 Intrinsic::dbg_region_start 
 Intrinsic::dbg_region_end 
 Intrinsic::dbg_func_start
AutoUpgrade simply ignores these intrinsics now.

llvm-svn: 92557
2010-01-05 01:10:40 +00:00
Mikhail Glushenkov 6a8ac8ce8f 80-col violations, trailing whitespace.
llvm-svn: 92470
2010-01-04 07:55:25 +00:00
Chris Lattner c0e6640d3a move instcombine to its own library, it's past time.
llvm-svn: 92459
2010-01-04 06:23:24 +00:00
Chris Lattner 2d91231d82 implement an instcombine xform needed by clang's codegen
on the example in PR4216.  This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.

llvm-svn: 92458
2010-01-04 06:03:59 +00:00
Chris Lattner 48218e42cd pull my debug hooks out, I'm done with this xform for now.
llvm-svn: 92446
2010-01-03 06:58:48 +00:00
Nick Lewycky 475d3d1215 Small cleanups, refactor some duplicated code into a single method. No
functionality change.

llvm-svn: 92445
2010-01-03 04:39:07 +00:00
Chris Lattner fca0c8f93a generalize the previous transformation to handle indexing into
arrays of structs and other arrays, so long as all the subsequent
indexes are constants.  This triggers frequently for stuff like:

@divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50]

	  %623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1]
	   %684 = icmp eq i32 %683, 999 

also for the "my_defs" table in 'gs', etc.

llvm-svn: 92444
2010-01-03 03:03:27 +00:00
Nick Lewycky ff9cd7ace7 Cleanup.
llvm-svn: 92436
2010-01-03 00:55:31 +00:00
Chris Lattner 98ad2b56cc teach instcombine to optimize idioms like A[i]&42 == 0. This
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.

llvm-svn: 92427
2010-01-02 22:08:28 +00:00
Chris Lattner b56bef45f8 Teach the table lookup optimization to generate range compares
when a consequtive sequence of elements all satisfies the 
predicate.  Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.

Here are some examples where it triggers.  From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:

@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
   %142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
	   %165 = icmp eq i8 %164, 60      

Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) 
optimized before are actually range compares.  This lets 32-bit
machines optimize them.

400.perlbmk has stuff like this:

400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
	   %811 = icmp ne i8 %810, 33 

@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
	   %12 = icmp ult i8 %10, 2
           
etc.

llvm-svn: 92426
2010-01-02 21:50:18 +00:00
Chris Lattner e199d2df80 theoretically the negate we find could be in a different function, check
for this case.

llvm-svn: 92425
2010-01-02 21:46:33 +00:00
Chris Lattner 2fa4ec70fc use enums for the over/underdefined markers for clarity. Switch
to using -2/-3 instead of -1/-2 for a future xform.

llvm-svn: 92423
2010-01-02 20:20:33 +00:00
Chris Lattner 351e22aa36 remove the random sampling framework, which is not maintained anymore.
If there is interest, it can be resurrected from SVN.  PR4912.

llvm-svn: 92422
2010-01-02 20:07:03 +00:00
Nick Lewycky a67519be12 Fix logic error in previous commit. The != case needs to become an or, not an
and.

llvm-svn: 92419
2010-01-02 16:14:56 +00:00
Nick Lewycky 357d41b3c1 Optimize pointer comparison into the typesafe form, now that the backends will
handle them efficiently. This is the opposite direction of the transformation
we used to have here.

llvm-svn: 92418
2010-01-02 15:25:44 +00:00
Chris Lattner cfda435c73 Generalize the previous xform to handle cases where exactly
two elements match or don't match with two comparisons.  For
example, the testcase compiles into:

define i1 @test5(i32 %X) {
  %1 = icmp eq i32 %X, 2                          ; <i1> [#uses=1]
  %2 = icmp eq i32 %X, 7                          ; <i1> [#uses=1]
  %R = or i1 %1, %2                               ; <i1> [#uses=1]
  ret i1 %R
}

This generalizes the previous xforms when the array is larger than
64 elements (and this case matches) and generates better code for
cases where it overlaps with the magic bitshift case.

This generalizes more cases than you might expect.  For example,
400.perlbmk has:

@PL_utf8skip = constant [256 x i8] c"\01\01\01\...
%15 = icmp ult i8 %7, 7

403.gcc has:
@rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
%18 = icmp eq i16 %16, 295 

and xalancbmk has a bunch of examples, such as 
_ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.

llvm-svn: 92417
2010-01-02 09:35:17 +00:00
Chris Lattner c6ac078423 fix a miscompilation I introduced of cdecl with a late change.
llvm-svn: 92416
2010-01-02 09:22:13 +00:00
Chris Lattner 935a4a606a enhance the compare/load/index optimization to work on *any* load
from a global with 32/64 elements or less (depending on whether
i64 is native on the target), generating a bitshift idiom to 
determine the result.  For example, on test4 we produce:

define i1 @test4(i32 %X) {
  %1 = lshr i32 933, %X                           ; <i32> [#uses=1]
  %2 = and i32 %1, 1                              ; <i32> [#uses=1]
  %R = icmp ne i32 %2, 0                          ; <i1> [#uses=1]
  ret i1 %R
}

This triggers in a number of interesting cases, for example, here's an
fp case:
@A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]*> [#uses=7]
...
	   %7 = fcmp olt double %3, 0.000000e+00

In this case we make the slen2_tab global dead, which is nice:
@slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]*> [#uses=1]
...
	   %204 = icmp eq i32 %46, 0     

Perl has a bunch of these, also on the 'Perl_regkind' array:
@Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]*> [#uses=1]
...
  %1364 = icmp eq i16 %1361, 0

186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this:
@white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]*> [#uses=2]

However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc.

go 64-bit machines :)

llvm-svn: 92415
2010-01-02 08:56:52 +00:00
Chris Lattner b1567bd584 enhance the previous optimization to work with fcmp in addition
to icmp.

llvm-svn: 92412
2010-01-02 08:20:51 +00:00
Chris Lattner a061859ccc Teach instcombine to fold compares of loads from constant
arrays with variable indices into a comparison of the index
with a constant.  The most common occurrence of this that
I see by far is stuff like:

if ("foobar"[i] == '\0') ...

which we compile into: if (i == 6), saving a load and 
materialization of the global address.  This also exposes 
loop trip count information to later passes in many cases.

This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps.  Here are a few 
interesting ones from various apps:

@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
  %scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
  %17 = load ...
  %18 = icmp eq i8* %17, null                     ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7 


@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
  %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
   %mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
   %64 = icmp eq i8 %58, 4                         ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35             ; <i1> [#uses=0]


@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768                 ; <i1> [#uses=0]
-> false

llvm-svn: 92411
2010-01-02 08:12:04 +00:00
Chris Lattner 2e4be2c340 remove the instcombine transformations that are inserting nasty
pointer to int casts that confuse later optimizations.  See PR3351
for details.

This improves but doesn't complete fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well.  Clang++ will do
better I guess.

llvm-svn: 92408
2010-01-02 00:31:05 +00:00
Chris Lattner faf1337acb add a simple instcombine xform, simplify another one to use hasAllZeroIndices()
instead of hand rolling a loop.

llvm-svn: 92403
2010-01-01 23:09:08 +00:00
Chris Lattner 30c0a2833d generalize the pointer difference optimization to handle
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.

llvm-svn: 92402
2010-01-01 22:42:29 +00:00
Chris Lattner 4394f71752 teach instcombine to optimize pointer difference idioms involving constant
expressions.  This is a step towards comment #4 in PR3351.

llvm-svn: 92401
2010-01-01 22:29:12 +00:00
Chris Lattner 9d4c5414bb use 'match' to simplify some code.
llvm-svn: 92400
2010-01-01 22:12:03 +00:00
Chris Lattner 25c87e9cf9 implement the transform requested in PR5284
llvm-svn: 92398
2010-01-01 18:34:40 +00:00
Chris Lattner ee1f861d81 add missing line.
llvm-svn: 92384
2010-01-01 01:54:08 +00:00
Chris Lattner 8330daf733 add a few trivial instcombines for llvm.powi.
llvm-svn: 92383
2010-01-01 01:52:15 +00:00
Chris Lattner 0c59ac3f41 When factoring multiply expressions across adds, factor both
positive and negative forms of constants together.  This 
allows us to compile:

int foo(int x, int y) {
    return (x-y) + (x-y) + (x-y);
}

into:

_foo:                                                       ## @foo
	subl	%esi, %edi
	leal	(%rdi,%rdi,2), %eax
	ret

instead of (where the 3 and -3 were not factored):

_foo:
        imull   $-3, 8(%esp), %ecx
        imull   $3, 4(%esp), %eax
        addl    %ecx, %eax
        ret

this started out as:
    movl    12(%ebp), %ecx
    imull   $3, 8(%ebp), %eax
    subl    %ecx, %eax
    subl    %ecx, %eax
    subl    %ecx, %eax
    ret

This comes from PR5359.

llvm-svn: 92381
2010-01-01 01:13:15 +00:00
Chris Lattner a552683fd4 clean up some comments.
llvm-svn: 92377
2010-01-01 00:04:26 +00:00
Chris Lattner 17229a7cb8 switch from std::map to DenseMap for rank data structures.
llvm-svn: 92375
2010-01-01 00:01:34 +00:00
Chris Lattner fed3397654 reuse negates where possible instead of always creating them from scratch.
This allows us to optimize test12 into:

define i32 @test12(i32 %X) {
  %factor = mul i32 %X, -3                        ; <i32> [#uses=1]
  %Z = add i32 %factor, 6                         ; <i32> [#uses=1]
  ret i32 %Z
}

instead of:

define i32 @test12(i32 %X) {
  %Y = sub i32 6, %X                              ; <i32> [#uses=1]
  %C = sub i32 %Y, %X                             ; <i32> [#uses=1]
  %Z = sub i32 %C, %X                             ; <i32> [#uses=1]
  ret i32 %Z
}

llvm-svn: 92373
2009-12-31 20:34:32 +00:00
Chris Lattner 60c2ca743d we don't need a smallptrset to detect duplicates, the values are
sorted, so we can just do a linear scan.

llvm-svn: 92372
2009-12-31 19:49:01 +00:00
Chris Lattner 1d8979422a make reassociate more careful about not leaving around dead mul's
llvm-svn: 92370
2009-12-31 19:34:45 +00:00
Chris Lattner ed18917665 remove debug
llvm-svn: 92369
2009-12-31 19:25:19 +00:00
Chris Lattner 60b71b5c4d teach reassociate to factor x+x+x -> x*3. While I'm at it,
fix RemoveDeadBinaryOp to actually do something.

llvm-svn: 92368
2009-12-31 19:24:52 +00:00
Chris Lattner 38abecbad0 change reassociate to use SmallVector for its key datastructures
instead of std::vector.

llvm-svn: 92366
2009-12-31 18:40:32 +00:00
Chris Lattner ac61550504 change an if to an assert, fix comment.
llvm-svn: 92364
2009-12-31 18:18:46 +00:00
Chris Lattner 177140ad12 move the rest of the add optimization code out to OptimizeAdd,
improve some comments, simplify a bit of code.

llvm-svn: 92363
2009-12-31 18:17:13 +00:00
Chris Lattner ba1f36aa99 factor statistic updating better.
llvm-svn: 92362
2009-12-31 17:51:05 +00:00
Chris Lattner 4e3a5678af simple fix for an incorrect factoring which causes a
miscompilation, PR5458.

llvm-svn: 92354
2009-12-31 08:33:49 +00:00
Chris Lattner 5f8a005d38 factor code out into helper functions.
llvm-svn: 92347
2009-12-31 07:59:34 +00:00
Chris Lattner f5c2b8b8d7 switch some std::vector's to smallvector. Reduce nesting.
llvm-svn: 92346
2009-12-31 07:48:51 +00:00
Chris Lattner 9039ff8912 use more modern datastructures.
llvm-svn: 92344
2009-12-31 07:33:14 +00:00
Chris Lattner bc1512c8d1 clean up -debug output.
llvm-svn: 92343
2009-12-31 07:17:37 +00:00
Chris Lattner 17079fc0fa split code that doesn't need to be templated out of IRBuilder into a new
non-templated IRBuilderBase class.  Move that large CreateGlobalString
out of line, eliminating the need to #include GlobalVariable.h in IRBuilder.h

llvm-svn: 92227
2009-12-28 21:28:46 +00:00
Chris Lattner f8d22fc77d Metadata.h doesn't need to include ValueHandle.h anymore.
llvm-svn: 92211
2009-12-28 08:20:46 +00:00
Chris Lattner 1a32ede6fd move an optimization for memcmp out of simplifylibcalls and into
SDISel.  This optimization was causing simplifylibcalls to 
introduce type-unsafe nastiness.  This is the first step, I'll be 
expanding the memcmp optimizations shortly, covering things that
we really really wouldn't want simplifylibcalls to do.

llvm-svn: 92098
2009-12-24 00:37:38 +00:00
Chris Lattner efebb234b7 reorder to follow a normal fall-through style, no functionality change.
llvm-svn: 92084
2009-12-23 23:24:51 +00:00
David Greene 2330f78075 Remove dump routine and the associated Debug.h from a header. Patch up
other files to compensate.

llvm-svn: 92075
2009-12-23 22:58:38 +00:00
Eric Christopher fdb33458fc Update objectsize intrinsic and associated dependencies. Fix
lowering code and update testcases.

llvm-svn: 91979
2009-12-23 02:51:48 +00:00
Chris Lattner c0f6402a94 Fix the Convert to scalar to not insert dead loads in the store case. The
load is needed when we have a small store into a large alloca (at which 
point we get a load/insert/store sequence), but when you do a full-sized
store, this load ends up being dead.

This dead load is bad in really large nasty testcases where the load ends
up causing mem2reg to insert large chains of dependent phi nodes which only
ADCE can delete.  Instead of doing this, just don't insert the dead load.

This fixes rdar://6864035

llvm-svn: 91917
2009-12-22 19:33:28 +00:00
Chris Lattner fda3b559e6 fix some fixme's by using twines
llvm-svn: 91916
2009-12-22 19:23:33 +00:00
Bob Wilson 62a84ea8e3 Generalize SROA to allow the first index of a GEP to be non-zero. Add a
missing check that an array reference doesn't go past the end of the array,
and remove some redundant checks for in-bound array and vector references
that are no longer needed.

llvm-svn: 91897
2009-12-22 06:57:14 +00:00
Chris Lattner f21a220bcd Implement PR5795 by merging duplicated return blocks. This could go further
by merging all returns in a function into a single one, but simplifycfg 
currently likes to duplicate the return (an unfortunate choice!)

llvm-svn: 91890
2009-12-22 06:07:30 +00:00
Chris Lattner 9b7d99eb76 The phi translated pointer can be computed when returning a partially cached result
instead of stored.  This reduces memdep memory usage, and also eliminates a bunch of
weakvh's.  This speeds up gvn on gcc.c-torture/20001226-1.c from 23.9s to 8.45s (2.8x)
on a different machine than earlier.

llvm-svn: 91885
2009-12-22 04:25:02 +00:00
Eric Christopher ab6a0d60d5 Whitespace fixes.
llvm-svn: 91875
2009-12-22 01:23:51 +00:00
Daniel Dunbar c661a2d4d8 Add suggested parentheses.
llvm-svn: 91853
2009-12-21 23:27:57 +00:00
Chris Lattner bf20018423 Add a fastpath to Load GVN to special case when we have exactly one dominating
load to avoid even messing around with SSAUpdate at all.  In this case (which
is very common, we can just use the input value directly).

This speeds up GVN time on gcc.c-torture/20001226-1.c from 36.4s to 16.3s,
which still isn't great, but substantially better and this is a simple speedup
that applies to lots of different cases.

llvm-svn: 91851
2009-12-21 23:15:48 +00:00
Chris Lattner 927b0ac4b2 refactor some code out to a new helper method.
llvm-svn: 91849
2009-12-21 23:04:33 +00:00
Bob Wilson 88a0598fe8 Remove special-case SROA optimization of variable indexes to one-element and
two-element arrays.  After restructuring the SROA code, it was not safe to
do this without adding more checking.  It is not clear that this special-case
has really been useful, and removing this simplifies the code quite a bit.

llvm-svn: 91828
2009-12-21 18:39:47 +00:00
Chris Lattner d4fb4296df give instcombine some helper functions for matching MIN and MAX, and
implement some optimizations for MIN(MIN()) and MAX(MAX()) and 
MIN(MAX()) etc.  This substantially improves the code in PR5822 but
doesn't kick in much elsewhere.  2 max's were optimized in 
pairlocalalign and one in smg2000.

llvm-svn: 91814
2009-12-21 06:03:05 +00:00
Chris Lattner ffbd02829c enhance x-(-A) -> x+A to preserve NUW/NSW.
Use the presence of NSW/NUW to fold "icmp (x+cst), x" to a constant in
cases where it would otherwise be undefined behavior.

Surprisingly (to me at least), this triggers hundreds of the times in
a few benchmarks: lencode, ldecode, and 466.h264ref seem to *really*
like this.

llvm-svn: 91812
2009-12-21 04:04:05 +00:00
Chris Lattner 900ce231f9 Optimize all cases of "icmp (X+Cst), X" to something simpler. This triggers
a bunch in lencode, ldecod, spass, 176.gcc, 252.eon, among others.  It is 
also the first part of PR5822

llvm-svn: 91811
2009-12-21 03:19:28 +00:00
Chris Lattner 4ad5eba568 fix PR5827 by disabling the phi slicing transformation in a case
where instcombine would have to split a critical edge due to a
phi node of an invoke.  Since instcombine can't change the CFG,
it has to bail out from doing the transformation.

llvm-svn: 91763
2009-12-19 07:01:15 +00:00
Bob Wilson c16811b575 Update my SROA changes in response to review.
* change FindElementAndOffset to return a uint64_t instead of unsigned, and
  to identify the type to be used for that result in a GEP instruction.
* move "isa<ConstantInt>" to be first in conditional.
* replace some dyn_casts with casts.
* add a comment about handling mem intrinsics.

llvm-svn: 91762
2009-12-19 06:53:17 +00:00
Bob Wilson 532cd232fb Reapply 91459 with a simple fix for the problem that broke the x86_64-darwin
bootstrap.  This also replaces the WeakVH references that Chris objected to
with normal Value references.

llvm-svn: 91711
2009-12-18 20:14:40 +00:00
Eli Friedman 86b9d75dc8 Optimize icmp of null and select of two constants even if the select has
multiple uses.  (The construct in question was found in gcc.)

llvm-svn: 91675
2009-12-18 08:22:35 +00:00
Dan Gohman 57e808628c Eliminte unnecessary uses of <cstdio>.
llvm-svn: 91666
2009-12-18 03:25:51 +00:00
Dan Gohman 18fa5686f6 Add Loop contains utility methods for testing whether a loop
contains another loop, or an instruction. The loop form is
substantially more efficient on large loops than the typical
code it replaces.

llvm-svn: 91654
2009-12-18 01:24:09 +00:00
Dan Gohman fd7231f1fe Minor code simplification.
llvm-svn: 91653
2009-12-18 01:20:44 +00:00
Dan Gohman b1924e8a0f Don't pass const pointers by reference.
llvm-svn: 91647
2009-12-18 00:38:08 +00:00
Dan Gohman 92c3696524 Reapply LoopStrengthReduce and IVUsers cleanups, excluding the part
of 91296 that caused trouble -- the Processed list needs to be
preserved for the livetime of the pass, as AddUsersIfInteresting
is called from other passes.

llvm-svn: 91641
2009-12-18 00:06:20 +00:00
Eli Friedman 250b119d98 Allow instcombine to combine "sext(a) >u const" to "a >u trunc(const)".
llvm-svn: 91631
2009-12-17 22:42:29 +00:00
Eli Friedman 7cc86b4cc6 Make the ptrtoint comparison simplification work if one side is a global.
llvm-svn: 91624
2009-12-17 21:27:47 +00:00
Eli Friedman 5842c9968a Slightly generalize transformation of memmove(a,a,n) so that it also applies
to memcpy. (Such a memcpy is technically illegal, but in practice is safe
and is generated by struct self-assignment in C code.)

llvm-svn: 91621
2009-12-17 21:07:31 +00:00
Bob Wilson f3927b7994 Re-revert 91459. It's breaking the x86_64 darwin bootstrap.
llvm-svn: 91607
2009-12-17 18:34:24 +00:00
Evan Cheng 090ac0865a Revert 91280-91283, 91286-91289, 91291, 91293, 91295-91296. It apparently introduced a non-deterministic behavior in the optimizer somewhere.
llvm-svn: 91598
2009-12-17 09:39:49 +00:00
Daniel Dunbar ab42d42390 Reapply r91459, it was only unmasking the bug, and since TOT is still broken having it reverted does no good.
llvm-svn: 91559
2009-12-16 20:09:53 +00:00
Daniel Dunbar 133efc317e Revert "Reapply 91184 with fixes and an addition to the testcase to cover the
problem", this broke llvm-gcc bootstrap for release builds on
x86_64-apple-darwin10.

This reverts commit db22309800b224a9f5f51baf76071d7a93ce59c9.

llvm-svn: 91534
2009-12-16 10:56:17 +00:00
Chris Lattner f278addbdc reapply my strstr optimization. I have reproduced the x86-64 bootstrap
miscompile (i386.o miscompares) but it happens both with and without
this patch.

llvm-svn: 91532
2009-12-16 09:32:05 +00:00
Chris Lattner 177be32334 revert my strstr optimization, I'm told it breaks x86-64 bootstrap.
Will reapply with a fix when I get a chance.

llvm-svn: 91486
2009-12-16 00:46:02 +00:00
Bob Wilson e44756d7c2 Reapply 91184 with fixes and an addition to the testcase to cover the problem
found last time.  Instead of trying to modify the IR while iterating over it,
I've change it to keep a list of WeakVH references to dead instructions, and
then delete those instructions later.  I also added some special case code to
detect and handle the situation when both operands of a memcpy intrinsic are
referencing the same alloca.

llvm-svn: 91459
2009-12-15 22:00:51 +00:00
Chris Lattner 26ab363361 optimize strstr, PR5783
llvm-svn: 91438
2009-12-15 19:14:40 +00:00
Dan Gohman 265ce318b8 Delete an unused function.
llvm-svn: 91432
2009-12-15 16:30:09 +00:00
Chris Lattner 24aba42d04 add some other xforms that should be done as part of PR5783
llvm-svn: 91428
2009-12-15 09:05:13 +00:00
Chris Lattner 45d040bd85 Remove isPod() from DenseMapInfo, splitting it out to its own
isPodLike type trait.  This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.

llvm-svn: 91421
2009-12-15 07:26:43 +00:00
Dan Gohman fbeec7270c Fix a thinko; isNotAlreadyContainedIn had a built-in negative, so the
condition was inverted when the code was converted to contains().

llvm-svn: 91295
2009-12-14 17:31:01 +00:00
Dan Gohman 416d5b7361 Remove unnecessary #includes.
llvm-svn: 91293
2009-12-14 17:19:06 +00:00
Dan Gohman 163fb26927 Instead of having a ScalarEvolution pointer member in BasedUser, just pass
the ScalarEvolution pointer into the functions which need it.

llvm-svn: 91289
2009-12-14 17:12:51 +00:00
Dan Gohman 8dbd4e3d16 Don't bother cleaning up if there's nothing to clean up.
llvm-svn: 91288
2009-12-14 17:10:44 +00:00
Dan Gohman 88c7e61c5b Delete an unused variable.
llvm-svn: 91287
2009-12-14 17:08:09 +00:00
Dan Gohman 838f604543 LSR itself doesn't need LoopInfo.
llvm-svn: 91283
2009-12-14 17:02:34 +00:00
Dan Gohman 273e692952 LSR itself doesn't need DominatorTree.
llvm-svn: 91282
2009-12-14 16:57:08 +00:00
Dan Gohman c3513095cf Remove the code in LSR that manually hoists expansions out of loops;
SCEVExpander does this automatically.

llvm-svn: 91281
2009-12-14 16:52:55 +00:00
Dan Gohman ec2a7c58e8 Minor code cleanups.
llvm-svn: 91280
2009-12-14 16:37:29 +00:00
Chris Lattner aaa6ac10a6 revert r91184, because it causes a crash on a .bc file I just
sent to Bob.

llvm-svn: 91268
2009-12-14 05:11:02 +00:00
Bob Wilson 895f364ae6 Revise scalar replacement to be more flexible about handle bitcasts and GEPs.
While scanning through the uses of an alloca, keep track of the current offset
relative to the start of the alloca, and check memory references to see if
the offset & size correspond to a component within the alloca.  This has the
nice benefit of unifying much of the code from isSafeUseOfAllocation,
isSafeElementUse, and isSafeUseOfBitCastedAllocation.  The code to rewrite
the uses of a promoted alloca, after it is determined to be safe, is
reorganized in the same way.

Also, when rewriting GEP instructions, mark them as "in-bounds" since all the
indices are known to be safe.

llvm-svn: 91184
2009-12-11 23:47:40 +00:00
Eric Christopher 22889c049d Make sure the immediate dominator isn't NULL through iterations
of the loop. We could get to this condition via indirect
branches.

llvm-svn: 91009
2009-12-10 00:25:41 +00:00
Chris Lattner 9ccc879006 Fix PR5744, a case where we were getting the pointer size instead of the
value size.  This only manifested when memdep inprecisely returns clobber,
which is do to a caching issue in the PR5744 testcase.  We can 'efficiently
emulate' this by using '-no-aa'

llvm-svn: 91004
2009-12-10 00:11:45 +00:00
Chris Lattner 3ddf804f78 allow this to build when the #if 0's are enabled. No functionality change.
llvm-svn: 90999
2009-12-10 00:04:46 +00:00
Dan Gohman 72c367fb52 Dereference loopHeader after checking for null rather than before.
llvm-svn: 90990
2009-12-09 22:55:01 +00:00
Chris Lattner ca5f9cb18b fix hte last remaining known (by me) phi translation bug. When we reanalyze
clobbers to forward pieces of large stores to small loads, we need to consider
the properly phi translated pointer in the store block.

llvm-svn: 90978
2009-12-09 18:21:46 +00:00
Chris Lattner f8ba1253f1 change GetStoreValueForLoad to use IRBuilder, which is cleaner and
implicitly constant folds.

llvm-svn: 90977
2009-12-09 18:13:28 +00:00
Bob Wilson 1c5a6fb299 Fix a comment.
llvm-svn: 90975
2009-12-09 18:05:27 +00:00
Chris Lattner 07df9efb35 change AnalyzeLoadFromClobberingMemInst/AnalyzeLoadFromClobberingStore
to require the load ty/ptr to be passed in, no functionality change.

llvm-svn: 90960
2009-12-09 07:37:07 +00:00
Chris Lattner 0def861ee9 change AnalyzeLoadFromClobberingWrite and clients to pass in type
and pointer instead of the load.  No functionality change.

llvm-svn: 90959
2009-12-09 07:34:10 +00:00
Chris Lattner 0c31547168 change NonLocalDepEntry from being a typedef for an std::pair to be its
own small class.  No functionality change.

llvm-svn: 90956
2009-12-09 07:08:01 +00:00
Chris Lattner 946b58dd90 add some aborts to #if 0's.
llvm-svn: 90929
2009-12-09 02:41:54 +00:00
Chris Lattner 972e6d8d00 Switch GVN and memdep to use PHITransAddr, which correctly handles
phi translation of complex expressions like &A[i+1].  This has the
following benefits:

1. The phi translation logic is all contained in its own class with
   a strong interface and verification that it is self consistent.

2. The logic is more correct than before.  Previously, if intermediate
   expressions got PHI translated, we'd miss the update and scan for
   the wrong pointers in predecessor blocks.  @phi_trans2 is a testcase
   for this.

3. We have a lot less code in memdep.

We can handle phi translation across blocks of things like @phi_trans3,
which is pretty insane :).

This patch should fix the miscompiles of 255.vortex, and I tested it 
with a bootstrap of llvm-gcc, llvm-test and dejagnu of course.

llvm-svn: 90926
2009-12-09 01:59:31 +00:00
Bob Wilson c5d082fd5d Some superficial cleanups.
llvm-svn: 90866
2009-12-08 18:27:03 +00:00
Bob Wilson 2029ea04f9 Clean up dead operands left around after SROA replaces a mem intrinsic.
I'm not aware that this does anything significant on its own, but it's
needed for another patch that I'm working on.

llvm-svn: 90864
2009-12-08 18:22:03 +00:00
Nick Lewycky 8bca014d7f Remove unnecessary #include "llvm/LLVMContext.h".
llvm-svn: 90836
2009-12-08 05:45:41 +00:00
Chris Lattner 6d6f10fe91 fix PR5698
llvm-svn: 90708
2009-12-06 17:17:23 +00:00
Chris Lattner 778cb92235 constant fold loads from memcpy's from global constants. This is important
because clang lowers nontrivial automatic struct/array inits to memcpy from
a global array.

llvm-svn: 90698
2009-12-06 05:29:56 +00:00
Chris Lattner 93236ba327 add support for forwarding mem intrinsic values to non-local loads.
llvm-svn: 90697
2009-12-06 04:54:31 +00:00
Chris Lattner 42376066eb Handle forwarding local memsets to loads. For example, we optimize this:
short x(short *A) {
  memset(A, 1, sizeof(*A)*100);
  return A[42];
}

to 'return 257' instead of doing the load.  

llvm-svn: 90695
2009-12-06 01:57:02 +00:00
Nick Lewycky a0e9d700dc Generalize this optimization to work on equality comparisons between any two
integers that are constant except for a single bit (the same n-th bit in each).

llvm-svn: 90646
2009-12-05 05:00:00 +00:00
Bob Wilson 050b812fe7 Fix up some comments.
llvm-svn: 90603
2009-12-04 21:57:37 +00:00
Bob Wilson 5ca37b274c Fix 80-column violations.
llvm-svn: 90601
2009-12-04 21:51:35 +00:00
Bob Wilson 53bdae3802 Fix a comment typo.
llvm-svn: 90487
2009-12-03 21:47:07 +00:00
Owen Anderson 0b6e260066 Fix this crasher, and add a FIXME for a missed optimization.
llvm-svn: 90408
2009-12-03 03:43:29 +00:00
Chris Lattner a48f44d9ee improve portability to avoid conflicting with std::next in c++'0x.
Patch by Howard Hinnant!

llvm-svn: 90365
2009-12-03 00:50:42 +00:00
Owen Anderson b9878ee6b6 Cleanup/remove some parts of the lifetime region handling code in memdep and GVN,
per Chris' comments.  Adjust testcases to match.

llvm-svn: 90304
2009-12-02 07:35:19 +00:00
Chris Lattner c468025ac9 factor some code better.
llvm-svn: 90299
2009-12-02 06:44:58 +00:00
Chris Lattner 2764b4dc55 formatting cleanups.
llvm-svn: 90298
2009-12-02 06:35:55 +00:00
Chris Lattner eea42c7b51 tidy up, remove dependence on order of evaluation of function args from EmitMemCpy.
llvm-svn: 90297
2009-12-02 06:05:42 +00:00
Chris Lattner 3c9aca9079 fix PR5640 by tracking whether a block is the header of a loop more
precisely, which prevents us from infinitely peeling the loop.

llvm-svn: 90211
2009-12-01 06:04:43 +00:00
Benjamin Kramer 3efc050ac4 Revert r90089 for now, it's breaking selfhost.
llvm-svn: 90097
2009-11-29 21:17:48 +00:00
Benjamin Kramer bfa993ab20 Fix two FIXMEs.
llvm-svn: 90089
2009-11-29 20:29:30 +00:00
Chris Lattner 1cc4cca193 add testcases for the foo_with_overflow op xforms added recently and
fix bugs exposed by the tests.  Testcases from Alastair Lynn!

llvm-svn: 90056
2009-11-29 02:57:29 +00:00
Chris Lattner cd261c9c26 Implement PR5634.
llvm-svn: 90046
2009-11-29 00:51:17 +00:00
Chris Lattner 32140312ca reenable load address insertion in load pre. This allows us to
handle cases like this:
void test(int N, double* G) {
  long j;
  for (j = 1; j < N - 1; j++)
      G[j+1] = G[j] + G[j+1];
}

where G[1] isn't live into the loop.

llvm-svn: 90041
2009-11-28 16:08:18 +00:00
Chris Lattner 44da5bd837 Enhance InsertPHITranslatedPointer to be able to return a list of newly
inserted instructions.  No functionality change until someone starts using it.

llvm-svn: 90039
2009-11-28 15:39:14 +00:00
Chris Lattner cf0b198827 disable value insertion for now, I need to figure out how
to inform GVN about the newly inserted values.  This fixes 
PR5631.

llvm-svn: 90022
2009-11-27 22:50:07 +00:00
Chris Lattner 2be52e72ae Rework InsertPHITranslatedPointer to handle the recursive case, this
fixes PR5630 and sets the stage for the next phase of goodness (testcase
pending).

llvm-svn: 90019
2009-11-27 22:05:15 +00:00
Chris Lattner 3d9823b9cf factor some logic out of instcombine into a new SimplifyAddInst method.
llvm-svn: 90011
2009-11-27 17:42:22 +00:00
Chris Lattner 2226db66ab fix PR5436 by making the 'simple' case of SRoA not promote out of range
array indexes.  The "complex" case of SRoA still handles them, and correctly.

This fixes a weirdness where we'd correctly avoid transforming A[0][42] if
the 42 was too large, but we'd only do it if it was one gep, not two separate
ones.

llvm-svn: 90007
2009-11-27 16:37:41 +00:00
Chris Lattner 25be93dfed teach GVN's load PRE to insert computations of the address in predecessors
where it is not available.  It's unclear how to get this inserted 
computation into GVN's scalar availability sets, Owen, help? :)

llvm-svn: 89997
2009-11-27 08:25:10 +00:00
Chris Lattner a9a76ccf56 Fix phi translation in load PRE to agree with the phi
translation done by memdep, and reenable gep translation 
again.

llvm-svn: 89992
2009-11-27 06:31:14 +00:00
Chris Lattner 8574aba4ea factor some instcombine simplifications for getelementptr out to a new
SimplifyGEPInst method in InstructionSimplify.h.  No functionality change.

llvm-svn: 89980
2009-11-27 00:29:05 +00:00
Chris Lattner a5bc618a91 fix crash on Transforms/InstCombine/intrinsics.ll introduced by r89970
llvm-svn: 89972
2009-11-26 22:08:06 +00:00
Chris Lattner a73ecf0b00 Fix PR5471 by removing an instcombine xform. Some pieces of the code
generates store to undef and some generates store to null as the idiom
for undefined behavior.  Since simplifycfg zaps both, don't remove the
undefined behavior in instcombine.

llvm-svn: 89971
2009-11-26 22:04:42 +00:00
Chris Lattner 5b83ba215d implement a bunch of xforms for overflow intrinsics, based on a patch
by Alastair Lynn.

llvm-svn: 89970
2009-11-26 21:42:47 +00:00
Edward O'Callaghan 2b8fed15e0 Reverting patch in revision 89758, initial attempt at fixing PR5373 has proven to be bogus.
llvm-svn: 89844
2009-11-25 05:38:41 +00:00
Edward O'Callaghan 5fd452d596 Fix for PR5373, Credit to Jakub Staszak.
llvm-svn: 89758
2009-11-24 11:51:52 +00:00
Dan Gohman 1f522d98f8 Fix a use of an invalidated iterator in the case where there are multiple
adjacent uses of a dead basic block from the same user. This fixes PR5596.

llvm-svn: 89658
2009-11-23 16:13:39 +00:00
Nick Lewycky 15a1287c1f Pull LLVMContext out of PromoteMemToReg.
llvm-svn: 89645
2009-11-23 03:50:44 +00:00
Nick Lewycky 621fe5614e Remove LLVMContext and its include.
llvm-svn: 89644
2009-11-23 03:34:29 +00:00
Nick Lewycky 922d4ab574 Reapply r88830 with a bugfix: this transform only applies to icmp eq/ne. This
fixes part of PR5438.

llvm-svn: 89639
2009-11-23 03:17:33 +00:00
Eric Christopher 0c7bd96de2 Add more optimizations for object size checking, enable handling of
object size intrinsic and verify return type is correct. Collect various
code in one place.

llvm-svn: 89523
2009-11-21 01:01:30 +00:00
Dan Gohman d15302afa0 Fix IPSCCP's code for deleting dead blocks to tolerate outstanding
blockaddress users. This fixes PR5569.

llvm-svn: 89483
2009-11-20 20:19:14 +00:00
Daniel Dunbar f87c75706f Revert "Add some rough optimizations for checking routines.", it buildeth not.
llvm-svn: 89482
2009-11-20 20:17:30 +00:00
Eric Christopher cf97d01dff Add some rough optimizations for checking routines.
llvm-svn: 89479
2009-11-20 19:57:37 +00:00
Duncan Sands 9e26aac773 Fix PR5563, an expensive checks failure when running on
tests/Transforms/InstCombine/shufflemask-undef.ll.  If
anyone cares, the use of 2*e here (and the equivalent
all over the place in instcombine) seems wrong, though
harmless: it should really be twice the length of the
input vector.  I think shufflevector used to require
that the mask have the same length as the input, but I
don't think that's true any more.  I don't care enough
about vectors to do anything about this...

llvm-svn: 89456
2009-11-20 13:19:51 +00:00
Dan Gohman cbc6ebb6fd Enable hoisting of loads from constant memory by default. In cases where
they are lowered to instruction sequences more complex than a simple
load, such that CodeGen cannot rematerialize them, a reload from a
spill slot is likely to be cheaper than the complex sequence.

llvm-svn: 89374
2009-11-19 19:00:10 +00:00
Jim Grosbach 6bf5305f5d grammar
llvm-svn: 89145
2009-11-17 21:37:04 +00:00
Jim Grosbach e4e018ae67 80-column violations
llvm-svn: 89123
2009-11-17 19:05:35 +00:00
Evan Cheng ba4e5da727 Generalize OptimizeLoopTermCond to optimize more loop terminating icmp to use postinc iv.
llvm-svn: 89116
2009-11-17 18:10:11 +00:00
Jim Grosbach 60f4854c76 Remove trailing whitespace
llvm-svn: 89110
2009-11-17 17:53:56 +00:00
David Greene a3ce7828b2 Fix an expensive-checks error.
The Mask and LHSMask may not be of the same size, so don't do the
transformation if they're different.

llvm-svn: 88972
2009-11-16 21:52:23 +00:00
Duncan Sands e5de4a9ad6 CreateIntCast takes an "isSigned" parameter. Pass "true" for it, rather than
a name.

llvm-svn: 88908
2009-11-16 12:32:28 +00:00
Chris Lattner 9d9812a636 make PRE of loads preserve the alignment of the moved load instruction.
llvm-svn: 88865
2009-11-15 19:58:31 +00:00
Chris Lattner 5f037b6439 fix a bug handling 'not x' when x is undef.
llvm-svn: 88864
2009-11-15 19:57:43 +00:00
Nick Lewycky 95148689c9 Revert r88830 and r88831 which appear to have caused a selfhost buildbot some
grief. I suspect this patch merely exposed a bug else.

llvm-svn: 88841
2009-11-15 07:47:32 +00:00
Nick Lewycky e29fa4c7a1 Teach instcombine to look for booleans in wider integers when it encounters a
zext(icmp). It may be able to optimize that away. This fixes one of the cases
in PR5438.

llvm-svn: 88830
2009-11-15 05:55:17 +00:00
Nick Lewycky 7935bcb0fe Remove LLVMContext from reassociate. It was threaded through every function but
ultimately never used.

llvm-svn: 88763
2009-11-14 07:25:54 +00:00
Dan Gohman 81132465d3 Add an option for running GVN with redundant load processing disabled.
llvm-svn: 88742
2009-11-14 02:27:51 +00:00
Owen Anderson e96b2111b1 Re-enable this code, since redundant PHIs are now being better nuked.
llvm-svn: 87042
2009-11-12 23:22:41 +00:00
Evan Cheng 85a9f430e9 - Teach LSR to avoid changing cmp iv stride if it will create an immediate that
cannot be folded into target cmp instruction.
- Avoid a phase ordering issue where early cmp optimization would prevent the
  later count-to-zero optimization.
- Add missing checks which could cause LSR to reuse stride that does not have
  users.
- Fix a bug in count-to-zero optimization code which failed to find the pre-inc
  iv's phi node.
- Remove, tighten, loosen some incorrect checks disable valid transformations.
- Quite a bit of code clean up.

llvm-svn: 86969
2009-11-12 07:35:05 +00:00
Chris Lattner 5f6b8b2bcb use getPredicateOnEdge to fold comparisons through PHI nodes,
which implements GCC PR18046.  This also gets us 360 more
jump threads on 176.gcc.

llvm-svn: 86953
2009-11-12 05:24:05 +00:00
Chris Lattner 22db4b5e0c various fixes to the lattice transfer functions.
llvm-svn: 86952
2009-11-12 04:57:13 +00:00
Chris Lattner c893c4ed10 switch jump threading to use getPredicateOnEdge in one place
making the new LVI stuff smart enough to subsume some special
cases in the old code.  Disable them when LVI is around, the
testcase still passes.

llvm-svn: 86951
2009-11-12 04:37:50 +00:00
Chris Lattner ba45616958 with the new code we can thread non-instruction values. This
allows us to handle the test10 testcase.

llvm-svn: 86924
2009-11-12 01:41:34 +00:00
Chris Lattner 3f80d85191 this argument can be an arbitrary value, it doesn't need to be an instruction.
llvm-svn: 86923
2009-11-12 01:37:43 +00:00
Chris Lattner d5e25436a1 expose edge information and switch j-t to use it.
llvm-svn: 86920
2009-11-12 01:29:10 +00:00
Chris Lattner 67146695b6 pass TD into a SimplifyCmpInst call. Add another case that
uses LVI info when -enable-jump-threading-lvi is passed.

llvm-svn: 86886
2009-11-11 22:31:38 +00:00
Chris Lattner 852f2653c4 remove the now dead condprop pass, PR3906.
llvm-svn: 86810
2009-11-11 05:56:35 +00:00
Chris Lattner fde1f8d0d8 stub out some LazyValueInfo interfaces, and have JumpThreading
start using them in a trivial way when -enable-jump-threading-lvi
is passed.  enable-jump-threading-lvi will be my playground for 
awhile.

llvm-svn: 86789
2009-11-11 02:08:33 +00:00
Chris Lattner 3a2ae908fe add a fixme
llvm-svn: 86766
2009-11-11 00:21:58 +00:00
Evan Cheng 12f146d8f7 Block terminator may be a switch.
llvm-svn: 86761
2009-11-11 00:00:21 +00:00
Chris Lattner 9518fbb54e implement a TODO by teaching jump threading about "xor x, 1".
llvm-svn: 86739
2009-11-10 22:39:16 +00:00
Chris Lattner 852d6d64ff move some generally useful functions out of jump threading
into libanalysis and transformutils.

llvm-svn: 86735
2009-11-10 22:26:15 +00:00
Chris Lattner 02e2cee7dc fix a crash in SCCP handling extractvalue of an array, pointed out and
tracked down by Stephan Reiter!

llvm-svn: 86726
2009-11-10 22:02:09 +00:00
Chris Lattner 40b15f220d improve comment.
llvm-svn: 86723
2009-11-10 21:45:09 +00:00
Chris Lattner 80e7e5a429 Make jump threading eliminate blocks that just contain phi nodes,
debug intrinsics, and an unconditional branch when possible.  This
reuses the TryToSimplifyUncondBranchFromEmptyBlock function split
out of simplifycfg.

llvm-svn: 86722
2009-11-10 21:40:01 +00:00
Evan Cheng 87fe40b32d Generalize lsr code that optimize loop to count down towards zero.
llvm-svn: 86715
2009-11-10 21:14:05 +00:00
Duncan Sands 23344095de Add defensive break.
llvm-svn: 86705
2009-11-10 19:36:40 +00:00
Duncan Sands 8d4cde2b55 Fix obvious typo.
llvm-svn: 86694
2009-11-10 18:21:37 +00:00
Chris Lattner b8f79ba10e clarify logic.
llvm-svn: 86689
2009-11-10 17:00:47 +00:00
Duncan Sands 1925d3a1d1 Teach DSE to eliminate useless trampolines.
llvm-svn: 86683
2009-11-10 13:49:50 +00:00
Duncan Sands 04e0c95248 Add brackets to make gcc-4.4 happy.
llvm-svn: 86681
2009-11-10 09:32:10 +00:00
Chris Lattner 1559bedcc7 unify the code that determines whether it is a good idea to change the type
of a computation.  This fixes some infinite loops when dealing with TD that
has no native types.

llvm-svn: 86670
2009-11-10 07:23:37 +00:00
Nick Lewycky 5b3def9b86 Simplify.
llvm-svn: 86668
2009-11-10 07:00:43 +00:00
Nick Lewycky 9027147fb1 Reapply r86359, "Teach dead store elimination that certain intrinsics write to
memory just like a store" with bug fixed (partial-overwrite.ll is the
regression test).

llvm-svn: 86667
2009-11-10 06:46:40 +00:00
Chris Lattner 38c44ea6b0 make jump threading recursively simplify expressions instead of doing it
just one level deep.  On the testcase we go from getting this:

F1:                                               ; preds = %T2
  %F = and i1 true, %cond                         ; <i1> [#uses=1]
  br i1 %F, label %X, label %Y

to a fully threaded:

F1:                                               ; preds = %T2
  br label %Y


This changes gets us to the point where we're forming (too many) switch 
instructions on doug's strswitch testcase.

llvm-svn: 86646
2009-11-10 01:57:31 +00:00
Chris Lattner be11db6894 don't invalidate PN, rewrite of this code is in progress anyway.
llvm-svn: 86639
2009-11-10 01:19:06 +00:00
Chris Lattner fb7f87d5a3 add a new SimplifyInstruction API, which is like ConstantFoldInstruction,
except that the result may not be a constant.  Switch jump threading to 
use it so that it gets things like (X & 0) -> 0, which occur when phi preds
are deleted and the remaining phi pred was a zero.

llvm-svn: 86637
2009-11-10 01:08:51 +00:00
Jeffrey Yasskin b40d3f76a0 Fix DenseMap iterator constness.
This patch forbids implicit conversion of DenseMap::const_iterator to
DenseMap::iterator which was possible because DenseMapIterator inherited
(publicly) from DenseMapConstIterator. Conversion the other way around is now
allowed as one may expect.

The template DenseMapConstIterator is removed and the template parameter
IsConst which specifies whether the iterator is constant is added to
DenseMapIterator.

Actually IsConst parameter is not necessary since the constness can be
determined from KeyT but this is not relevant to the fix and can be addressed
later.

Patch by Victor Zverovich!

llvm-svn: 86636
2009-11-10 01:02:17 +00:00
Chris Lattner a71e9d61be factor simplification logic for AND and OR out to InstSimplify from instcombine.
llvm-svn: 86635
2009-11-10 00:55:12 +00:00
Chris Lattner ccfdceb22c pull a bunch of logic out of instcombine into instsimplify for compare
simplification, this handles the foldable fcmp x,x cases among many others.

llvm-svn: 86627
2009-11-09 23:55:12 +00:00
Chris Lattner beadc6e8c7 inline a simple function.
llvm-svn: 86625
2009-11-09 23:31:49 +00:00