Commit Graph

2730 Commits

Author SHA1 Message Date
Evan Cheng 398f70292c RewriteExpr, either the new PHI node of induction variable or the
post-increment value, should be first cast to the appropriated type (to the
type of the common expr). Otherwise, the rewrite of a use based on (common +
iv) may end up with an incorrect type.

llvm-svn: 28735
2006-06-09 00:12:42 +00:00
Owen Anderson 5d029264ec Update some comments, and expose LCSSAID in preparation for having other passes
require LCSSA.

llvm-svn: 28734
2006-06-08 20:02:53 +00:00
Reid Spencer d4b795902c Fix a spello in a comment.
llvm-svn: 28714
2006-06-07 21:24:10 +00:00
Chris Lattner 95cebb082f Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on
PPC/altivec

llvm-svn: 28698
2006-06-06 22:26:02 +00:00
Owen Anderson ac601b4c4b Fix some formatting, and use inLoop() when appropriate.
llvm-svn: 28694
2006-06-06 04:36:36 +00:00
Owen Anderson 9e81c1bb03 Stop a memory leak, and update some comments.
llvm-svn: 28693
2006-06-06 04:28:30 +00:00
Owen Anderson 766f90b08e Some more clean-up, and squash an IDF-Phi related bug.
llvm-svn: 28680
2006-06-04 00:55:19 +00:00
Owen Anderson eb33815f1b Various clean-ups suggested by Chris.
llvm-svn: 28678
2006-06-04 00:02:23 +00:00
Owen Anderson d00eacc4f9 Fix a bug in Phi-noded insertion. Also, update some comments to reflect what's
actually going on.

llvm-svn: 28677
2006-06-03 23:22:50 +00:00
Chris Lattner 540886f0ae Remove unneeded hook. Patch by Anton K. Thanks!
llvm-svn: 28664
2006-06-02 19:11:46 +00:00
Chris Lattner 02e0b4ddb7 Force anything that #includes llvm/Transforms/Utils/UnifyFunctionExitNodes.h
to link in the implementation.  Thanks to Anton Korobeynikov for figuring out
what was going on here.

llvm-svn: 28660
2006-06-02 18:40:06 +00:00
Chris Lattner cdf2b1fc30 Remove dead #include
llvm-svn: 28642
2006-06-01 20:02:28 +00:00
Chris Lattner cc340c02a4 Make the "pruning cloner" smarter. As it propagates constants through the
code (while cloning) it often gets the branch/switch instructions.  Since it
knows that edges of the CFG are dead, it need not clone (or even look) at
the obviously dead blocks.  This should speed up the inliner substantially on
code where there are lots of inlinable calls to functions with constant
arguments.  On C++ code in particular, this kicks in.

llvm-svn: 28641
2006-06-01 19:19:23 +00:00
Chris Lattner f905a7b994 Silence a -pedantic warning.
llvm-svn: 28632
2006-06-01 17:16:21 +00:00
Owen Anderson 619e4ba57f Remove a FIXME that was fixed with my last patch.
llvm-svn: 28619
2006-06-01 06:07:40 +00:00
Owen Anderson cd76fa04a1 More cleanups. Also, add a special case for updating PHI nodes, and
reimplement getValueDominatingFunction to walk the DominanceTree rather than
just searching blindly.

llvm-svn: 28618
2006-06-01 06:05:47 +00:00
Chris Lattner 1df0e98ac2 Swap the order of operands created here. For +&|^, the order doesn't matter,
but for sub, it really does!  Fix fixes a miscompilation of fibheap_cut in
llvmgcc4.

llvm-svn: 28600
2006-05-31 21:14:00 +00:00
Owen Anderson dad8c57340 Extract a huge loop into a helper method. Fix a few iterator-invalidation bugs.
llvm-svn: 28599
2006-05-31 20:55:06 +00:00
Owen Anderson 8a8f278f15 Add Use replacement. Assuming there is nothing horribly wrong with this, LCSSA
is now theoretically feature-complete.  It has not, however, been thoroughly
test, and is still considered experimental.

llvm-svn: 28529
2006-05-29 01:00:00 +00:00
Owen Anderson 152d063ccb Major think-o. Iterate over all live out-of-loop values, and perform the
other calculations on each individually, rather than trying to delay it and do
them all at the end.

llvm-svn: 28527
2006-05-28 19:33:28 +00:00
Owen Anderson 1310e42803 Make LCSSA insert proper Phi nodes throughout the rest of the CFG by computing
the iterated Dominance Frontier of the loop-closure Phi's.  This is the
second phase of the LCSSA pass.  The third phase (coming soon) will be to
update all uses of loop variables to use the loop-closure Phi's instead.

llvm-svn: 28524
2006-05-27 18:47:11 +00:00
Chris Lattner 67c424e010 Fix some regression from the inliner patch I committed last night. This fixes
ldecod, lencod, and SPASS.

llvm-svn: 28523
2006-05-27 17:28:13 +00:00
Chris Lattner be853d77e9 Switch the inliner over to using CloneAndPruneFunctionInto. This effectively
makes it so that it constant folds instructions on the fly.  This is good
for several reasons:

0. Many instructions are constant foldable after inlining, particularly if
   inlining a call with constant arguments.
1. Without this, the inliner has to allocate memory for all of the instructions
   that can be constant folded, then a subsequent pass has to delete them.  This
   gets the job done without this extra work.
2. This makes the inliner *pass* a bit more aggressive: in particular, it
   partially solves a phase order issue where the inliner would inline lots
   of code that folds away to nothing, but think that the resultant function
   is big because of this code that will be gone.  Now the code never exists.

This is the first part of a 2-step process.  The second part will be smart
enough to see when this implicit constant folding propagates a constant into
a branch or switch instruction, making CFG edges dead.

This implements Transforms/Inline/inline_constprop.ll

llvm-svn: 28521
2006-05-27 01:28:04 +00:00
Chris Lattner 3df13f4f22 Implement a new method, CloneAndPruneFunctionInto, as documented.
llvm-svn: 28519
2006-05-27 01:22:24 +00:00
Chris Lattner bc3c879fcf Refactor some code to expose an interface to constant fold and instruction given it's opcode, typeand operands.
llvm-svn: 28517
2006-05-27 01:18:04 +00:00
Owen Anderson b4e16996f1 A few small clean-ups, and the addition of an LCSSA statistic.
llvm-svn: 28512
2006-05-27 00:31:37 +00:00
Owen Anderson 6e047ab8fc Fix a copy-and-paste-o that would break some compilers.
llvm-svn: 28507
2006-05-26 21:19:17 +00:00
Owen Anderson f3dd3e2bfd Clean up and refactor LCSSA a bunch. It should also run faster now, though
there's still a lot of work to be done on it.

llvm-svn: 28506
2006-05-26 21:11:53 +00:00
Chris Lattner dab43b2b0e Implement Transforms/InstCombine/store.ll:test2.
llvm-svn: 28503
2006-05-26 19:19:20 +00:00
Owen Anderson 8eca8910b6 Skeletal LCSSA pass. This is currently non-functional. Expect functionality
and documentation updates soo.

llvm-svn: 28495
2006-05-26 13:58:26 +00:00
Chris Lattner 0e47716e69 Transform things like (splat(splat)) -> splat
llvm-svn: 28490
2006-05-26 00:29:06 +00:00
Chris Lattner 12249be286 Introduce a helper function that simplifies interpretation of shuffle masks.
No functionality change.

llvm-svn: 28489
2006-05-25 23:48:38 +00:00
Chris Lattner 99155be33f Turn (cast (shuffle (cast)) -> shuffle (cast) if it reduces the # casts in
the program.  This exposes more opportunities for the instcombiner, and implements
vec_shuffle.ll:test6

llvm-svn: 28487
2006-05-25 23:24:33 +00:00
Chris Lattner 83f6578b0c extract element from a shuffle vector can be trivially turned into an
extractelement from the SV's source.  This implement vec_shuffle.ll:test[45]

llvm-svn: 28485
2006-05-25 22:53:38 +00:00
Chris Lattner 0853700582 Revert a patch that is unsafe, due to out of range array accesses in inner
array scopes possibly accessing valid memory in outer subscripts.

llvm-svn: 28478
2006-05-25 21:25:12 +00:00
Chris Lattner a643d528bd Patch for a new instcombine xform, patch contributed by Nick Lewycky!
This implements Transforms/InstCombine/2006-05-10-InvalidIndexUndef.ll

llvm-svn: 28450
2006-05-24 17:34:30 +00:00
Chris Lattner aa2372562e Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov!  This is a step towards closing PR786.

llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Chris Lattner d0622b6894 Silence a bogus gcc warning
llvm-svn: 28422
2006-05-20 23:14:03 +00:00
Reid Spencer 2452c94df4 Fix a doxygen problem and break lines at 80 columns
llvm-svn: 28395
2006-05-19 19:09:46 +00:00
Chris Lattner e4cb4768fa Declare that lowerinvoke doesn't interact with other lowering passes.
Patch written by Domagoj Babic!

llvm-svn: 28367
2006-05-17 21:05:27 +00:00
Chris Lattner 2e266807c3 Add a CloneModule call that exposes the mapping of values from the old module
to the new module.  Patch provided by Nick Lewycky!

llvm-svn: 28349
2006-05-17 18:05:35 +00:00
Chris Lattner 35515557c7 remove some dead code identified by coverity
llvm-svn: 28289
2006-05-14 18:45:44 +00:00
Chris Lattner 3237da073e remove dead variables
llvm-svn: 28286
2006-05-14 18:33:57 +00:00
Evan Cheng 18d0438148 Backing out last check-in for now. It's causing an infinite loop gccas lencode.
llvm-svn: 28284
2006-05-14 06:46:03 +00:00
Chris Lattner 3987a8532d Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit
bitfield now gives this code:

_plus:
        lwz r2, 0(r3)
        rlwimi r2, r2, 0, 1, 31
        xoris r2, r2, 32768
        stw r2, 0(r3)
        blr

instead of this:

_plus:
        lwz r2, 0(r3)
        srwi r4, r2, 31
        slwi r4, r4, 31
        addis r4, r4, -32768
        rlwimi r2, r4, 0, 0, 0
        stw r2, 0(r3)
        blr

this can obviously still be improved.

llvm-svn: 28275
2006-05-13 02:16:08 +00:00
Chris Lattner 1ebbe6a22e Implement simple promotion for cast elimination in instcombine. This is
currently very limited, but can be extended in the future.  For example,
we now compile:

uint %test30(uint %c1) {
        %c2 = cast uint %c1 to ubyte
        %c3 = xor ubyte %c2, 1
        %c4 = cast ubyte %c3 to uint
        ret uint %c4
}

to:

_xor:
        movzbl 4(%esp), %eax
        xorl $1, %eax
        ret

instead of:

_xor:
        movb $1, %al
        xorb 4(%esp), %al
        movzbl %al, %eax
        ret

More impressively, we now compile:

struct B { unsigned bit : 1; };
void xor(struct B *b) { b->bit = b->bit ^ 1; }

To (X86/PPC):

_xor:
        movl 4(%esp), %eax
        xorl $-2147483648, (%eax)
        ret
_xor:
        lwz r2, 0(r3)
        xoris r2, r2, 32768
        stw r2, 0(r3)
        blr

instead of (X86/PPC):

_xor:
        movl 4(%esp), %eax
        movl (%eax), %ecx
        movl %ecx, %edx
        shrl $31, %edx
        # TRUNCATE movb %dl, %dl
        xorb $1, %dl
        movzbl %dl, %edx
        andl $2147483647, %ecx
        shll $31, %edx
        orl %ecx, %edx
        movl %edx, (%eax)
        ret

_xor:
        lwz r2, 0(r3)
        srwi r4, r2, 31
        xori r4, r4, 1
        rlwimi r2, r4, 31, 0, 0
        stw r2, 0(r3)
        blr

This implements InstCombine/cast.ll:test30.

llvm-svn: 28273
2006-05-13 02:06:03 +00:00
Chris Lattner cd60d38b30 Remove some dead variables.
Fix a nasty bug in the memcmp optimizer where we used the wrong variable!

llvm-svn: 28269
2006-05-12 23:35:26 +00:00
Chris Lattner 94acc47654 Remove dead stuff
llvm-svn: 28268
2006-05-12 23:32:01 +00:00
Chris Lattner 1443bc52be Refactor some code, making it simpler.
When doing the initial pass of constant folding, if we get a constantexpr,
simplify the constant expr like we would do if the constant is folded in the
normal loop.

This fixes the missed-optimization regression in
Transforms/InstCombine/getelementptr.ll last night.

llvm-svn: 28224
2006-05-11 17:11:52 +00:00
Chris Lattner a36ee4ea34 Two changes:
1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable
   blocks (due to constants in conditional branches/switches) to the worklist.
   This causes them to be deleted before instcombine starts up, leading to
   better optimization.

2. In the prepass over instructions, do trivial constprop/dce as we go.  This
   has the effect of improving the effectiveness of #1.  In addition, it
   *significantly* speeds up instcombine on test cases with large amounts of
   constant folding code (for example, that produced by code specialization
   or partial evaluation).  In one example, it speeds up instcombine from
   0.0589s to 0.0224s with a release build (a 2.6x speedup).

llvm-svn: 28215
2006-05-10 19:00:36 +00:00
Chris Lattner 4fe87d67c4 Patch to make some xforms preserve each other. Patch contributed by
Domagoj Babic!

llvm-svn: 28181
2006-05-09 04:13:41 +00:00
Chris Lattner 1d441adfbf Move some code around.
Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation
only apply when both casts really will cause code to be generated.  If one or
both doesn't, then this xform doesn't remove a cast.

This fixes Transforms/InstCombine/2006-05-06-Infloop.ll

llvm-svn: 28141
2006-05-06 09:00:16 +00:00
Chris Lattner e745c7de0e Fix an infinite loop compiling oggenc last night.
llvm-svn: 28128
2006-05-05 20:51:30 +00:00
Chris Lattner 3af1053488 Implement InstCombine/cast.ll:test29
llvm-svn: 28126
2006-05-05 06:39:07 +00:00
Chris Lattner fb29692055 Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll
llvm-svn: 28101
2006-05-04 17:33:35 +00:00
Chris Lattner 2d3a02725d Add pass ID's for various passes, so they can be AddRequiredID. Patch by
Domagoj Babic!

llvm-svn: 28048
2006-05-02 04:24:36 +00:00
Chris Lattner 655d08fda8 Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll
llvm-svn: 28019
2006-04-28 22:21:41 +00:00
Chris Lattner e63d808b6e Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll
llvm-svn: 28007
2006-04-28 04:14:49 +00:00
Chris Lattner b6cb64b7e6 Add support for inserting undef into a vector. This implements
Transforms/InstCombine/vec_insert_to_shuffle.ll

llvm-svn: 27997
2006-04-27 21:14:21 +00:00
Chris Lattner f98b4aa2e7 Fix some nondeterminstic behavior in the mem2reg pass that (in addition to
nondeterminism being bad) could cause some trivial missed optimizations (dead
phi nodes being left around for later passes to clean up).

With this, llvm-gcc4 now bootstraps and correctly compares.  I don't know
why I never tried to do it before... :)

llvm-svn: 27984
2006-04-27 01:14:43 +00:00
Chris Lattner dae49df407 Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll
llvm-svn: 27912
2006-04-20 20:48:50 +00:00
Andrew Lenharth f89e630b2f Make code match cvs commit message :)
llvm-svn: 27881
2006-04-20 15:41:37 +00:00
Andrew Lenharth 61eae29ad6 If we can convert the return pointer type into an integer that IntPtrType
can be converted to losslessly, we can continue the conversion to a direct call.

llvm-svn: 27880
2006-04-20 14:56:47 +00:00
Chris Lattner 36dd7c98d1 Turn x86 unaligned load/store intrinsics into aligned load/store instructions
if the pointer is known aligned.

llvm-svn: 27781
2006-04-17 22:26:56 +00:00
Chris Lattner 9095186deb Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform
Make the insert/extract elt -> shuffle code more aggressive.

This fixes CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27728
2006-04-16 00:51:47 +00:00
Chris Lattner 34cebe785d Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask').
llvm-svn: 27727
2006-04-16 00:03:56 +00:00
Chris Lattner 39fac448d6 significant cleanups to code that uses insert/extractelt heavily. This builds
maximal shuffles out of them where possible.

llvm-svn: 27717
2006-04-15 01:39:45 +00:00
Chris Lattner 3323ce165d Teach scalarrepl to promote unions of vectors and floats, producing
insert/extractelement operations.  This implements
Transforms/ScalarRepl/vector_promote.ll

llvm-svn: 27710
2006-04-14 21:42:41 +00:00
Andrew Lenharth 92cf71f6d7 linear -> constant time
llvm-svn: 27652
2006-04-13 13:43:31 +00:00
Reid Spencer 13a1a7a4a6 Get rid of a signed/unsigned compare warning.
llvm-svn: 27625
2006-04-12 19:28:15 +00:00
Chris Lattner b19a5c661b Turn casts into getelementptr's when possible. This enables SROA to be more
aggressive in some cases where LLVMGCC 4 is inserting casts for no reason.

This implements InstCombine/cast.ll:test27/28.

llvm-svn: 27620
2006-04-12 18:09:35 +00:00
Chris Lattner 2d37f920ad Implement vec_shuffle.ll:test3
llvm-svn: 27573
2006-04-10 23:06:36 +00:00
Chris Lattner fbb77a408b Implement InstCombine/vec_shuffle.ll:test[12]
llvm-svn: 27571
2006-04-10 22:45:52 +00:00
Andrew Lenharth a9cdcca3c3 Add a simple pass to make sure that all (non-library) calls to malloc and free
are visible to analysis as intrinsics.  That is, make sure someone doesn't pass
free around by address in some struct (as happens in say 176.gcc).

This doesn't get rid of any indirect calls, just ensure calls to free and malloc
are always direct.

llvm-svn: 27560
2006-04-10 19:26:09 +00:00
Chris Lattner 17bd60588c Add supprot for shufflevector
llvm-svn: 27513
2006-04-08 01:19:12 +00:00
Chris Lattner 8ec0205de4 Fix inlining of insert/extract element constantexprs
llvm-svn: 27478
2006-04-07 04:41:03 +00:00
Chris Lattner e79d249c29 Lower vperm(x,y, mask) -> shuffle(x,y,mask) if mask is constant. This allows
us to compile oh-so-realistic stuff like this:

 vec_vperm(A, B, (vector unsigned char){14});

to:
        vspltb v0, v0, 14

instead of:

        vspltisb v0, 14
        vperm v0, v2, v1, v0

llvm-svn: 27452
2006-04-06 19:19:17 +00:00
Chris Lattner caba72b6ff vector casts of casts are eliminable. Transform this:
%tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]
        %tmp = cast <4 x int> %tmp to <4 x float>               ; <<4 x float>> [#uses=1]

into:

        %tmp = cast <4 x uint> %tmp to <4 x float>              ; <<4 x float>> [#uses=1]

llvm-svn: 27355
2006-04-02 05:43:13 +00:00
Chris Lattner ebca476b27 Allow transforming this:
%tmp = cast <4 x uint>* %testData to <4 x int>*         ; <<4 x int>*> [#uses=1]
        %tmp = load <4 x int>* %tmp             ; <<4 x int>> [#uses=1]

to this:

        %tmp = load <4 x uint>* %testData               ; <<4 x uint>> [#uses=1]
        %tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]

llvm-svn: 27353
2006-04-02 05:37:12 +00:00
Chris Lattner f42d0aeda1 Turn altivec lvx/stvx intrinsics into loads and stores. This allows the
elimination of one load from this:

int AreSecondAndThirdElementsBothNegative( vector float *in ) {
#define QNaN 0x7FC00000
const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN );
vector float test = vec_ld( 0, (float*) &testData );
return ! vec_any_ge( test, *in );
}

Now generating:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        addi r6, r1, -16
        lvx v0, r5, r4
        stvx v0, 0, r6
        lvx v1, 0, r3
        vcmpgefp. v0, v0, v1
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27352
2006-04-02 05:30:25 +00:00
Chris Lattner 70ec96fa32 Adjust to change in Intrinsics.gen interface.
llvm-svn: 27344
2006-04-02 03:35:01 +00:00
Chris Lattner 1b2436a624 add valuemapper support for inline asm
llvm-svn: 27332
2006-04-01 23:17:11 +00:00
Chris Lattner 6cf4914fd4 Fix InstCombine/2006-04-01-InfLoop.ll
llvm-svn: 27330
2006-04-01 22:05:01 +00:00
Chris Lattner dcd0792622 Fold A^(B&A) -> (B&A)^A
Fold (B&A)^A == ~B & A

This implements InstCombine/xor.ll:test2[56]

llvm-svn: 27328
2006-04-01 08:03:55 +00:00
Chris Lattner 8d1d8d364c If we can look through vector operations to find the scalar version of an
extract_element'd value, do so.

llvm-svn: 27323
2006-03-31 23:01:56 +00:00
Chris Lattner 92346c315e extractelement(undef,x) -> undef
llvm-svn: 27300
2006-03-31 18:25:14 +00:00
Chris Lattner 612fa8e6f3 Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll
llvm-svn: 27261
2006-03-30 22:02:40 +00:00
Chris Lattner 42e0ba09aa teach the inliner to work with packed constants
llvm-svn: 27161
2006-03-27 05:50:18 +00:00
Chris Lattner d70d9f5b24 Don't crash on packed logical ops
llvm-svn: 27125
2006-03-25 21:58:26 +00:00
Chris Lattner f365f5f0c1 Fix spello
llvm-svn: 27052
2006-03-24 07:14:34 +00:00
Chris Lattner 5821a6a17a add the actual cost to the debug info
llvm-svn: 27051
2006-03-24 07:14:00 +00:00
Jim Laskey 8f64426f5c Strip changes to llvm.dbg intrinsics.
llvm-svn: 26993
2006-03-23 18:11:33 +00:00
Jim Laskey 83f99115db Can't combine anymore - we don't have a chain through llvm.dbg intrinsics.
llvm-svn: 26992
2006-03-23 18:10:42 +00:00
Chris Lattner 7d80b4f366 silence a bogus gcc warning
llvm-svn: 26953
2006-03-22 17:27:24 +00:00
Chris Lattner d783c76c18 Teach cee to propagate through switch statements. This implements
Transforms/CorrelatedExprs/switch.ll

Patch contributed by Eric Kidd!

llvm-svn: 26872
2006-03-19 19:37:24 +00:00
Evan Cheng c28282bd87 - Fixed a bogus if condition.
- Added more debugging info.
- Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride.

llvm-svn: 26841
2006-03-18 08:03:12 +00:00
Evan Cheng f09f0ebd48 Sort StrideOrder so we can process the smallest strides first. This allows
for more IV reuses.

llvm-svn: 26837
2006-03-18 00:44:49 +00:00
Evan Cheng 4520698820 Allow users of iv / stride to be rewritten with expression that is a multiply
of a smaller stride even if they have a common loop invariant expression part.

llvm-svn: 26828
2006-03-17 19:52:23 +00:00
Evan Cheng 3df447d354 For each loop, keep track of all the IV expressions inserted indexed by
stride. For a set of uses of the IV of a stride which is a multiple
of another stride, do not insert a new IV expression. Rather, reuse the
previous IV and rewrite the uses as uses of IV expression multiplied by
the factor.

e.g.
x = 0 ...; x ++
y = 0 ...; y += 4
then use of y can be rewritten as use of 4*x for x86.

llvm-svn: 26803
2006-03-16 21:53:05 +00:00
Chris Lattner 6d6084fd04 Teach the strip pass to strip type names in addition to value names. This
is fallout from the type/value split in the symtab long long ago :)

llvm-svn: 26785
2006-03-15 19:22:41 +00:00
Chris Lattner c5f866bb4a Implement a FIXME, recusively reassociating
A*A*B + A*A*C   -->   A*(A*B+A*C)   -->   A*(A*(B+C))

This implements Reassociate/mul-factor3.ll

llvm-svn: 26757
2006-03-14 16:04:29 +00:00
Chris Lattner 2fc319d444 extract some code into a method, no functionality change
llvm-svn: 26755
2006-03-14 07:11:11 +00:00
Chris Lattner d6bde46d85 Promote shifts by a constant to multiplies so that we can reassociate
(x<<1)+(y<<1) -> (X+Y)<<1.  This implements
Transforms/Reassociate/shift-factor.ll

llvm-svn: 26753
2006-03-14 06:55:18 +00:00
Evan Cheng c567c4efbb Added target lowering hooks which LSR consults to make more intelligent
transformation decisions.

llvm-svn: 26738
2006-03-13 23:14:23 +00:00
Jim Laskey acb6e34277 Handle the removal of the debug chain.
llvm-svn: 26729
2006-03-13 13:07:37 +00:00
Chris Lattner 60f6833376 use autogenerated side-effect information
llvm-svn: 26673
2006-03-09 22:38:10 +00:00
Chris Lattner 6b7847a5bc fix a pasto
llvm-svn: 26627
2006-03-09 06:09:41 +00:00
Chris Lattner fc34f8bb48 Fix a miscompilation of 188.ammp with the new CFE. 188.ammp is accessing
arrays out of range in a horrible way, but we shouldn't break it anyway.
Details in the comments.

llvm-svn: 26606
2006-03-08 01:05:29 +00:00
Jim Laskey 69effa2325 Switch to using a numeric id for anchors.
llvm-svn: 26598
2006-03-07 20:53:47 +00:00
Chris Lattner 7b87fd53f9 Fix ConstantMerge/2006-03-07-DontMergeDiffSections.ll, a problem Jim
hypotheticalized about, where we would incorrectly merge two globals in
different sections.

llvm-svn: 26597
2006-03-07 17:56:59 +00:00
Chris Lattner 53ef5a032c Teach the alignment handling code to look through constant expr casts and GEPs
llvm-svn: 26580
2006-03-07 01:28:57 +00:00
Chris Lattner 82f2ef20b6 Teach instcombine to increase the alignment of memset/memcpy/memmove when
the pointer is known to come from either a global variable, alloca or
malloc.  This allows us to compile this:

  P = malloc(28);
  memset(P, 0, 28);

into explicit stores on PPC instead of a memset call.

llvm-svn: 26577
2006-03-06 20:18:44 +00:00
Chris Lattner 6bc98653c2 Make vector narrowing more effective, implementing
Transforms/InstCombine/vec_narrow.ll.  This add support for narrowing
extract_element(insertelement) also.

llvm-svn: 26538
2006-03-05 00:22:33 +00:00
Chris Lattner 4c065091d8 Add factoring of multiplications, e.g. turning A*A+A*B into A*(A+B).
Testcase here: Transforms/Reassociate/mulfactor.ll

llvm-svn: 26524
2006-03-04 09:31:13 +00:00
Chris Lattner 32c01df299 Canonicalize (X+C1)*C2 -> X*C2+C1*C2
This implements Transforms/InstCombine/add.ll:test31

llvm-svn: 26519
2006-03-04 06:04:02 +00:00
Chris Lattner 681ef2f083 Change this to work with renamed intrinsics.
llvm-svn: 26484
2006-03-03 01:34:17 +00:00
Chris Lattner ea7986aeca Make this work with renamed intrinsics.
llvm-svn: 26482
2006-03-03 01:30:23 +00:00
Chris Lattner 85dda9a2bd Generalize the REM folding code to handle another case Nick Lewycky
pointed out: realize the AND can provide factors and look through Casts.

llvm-svn: 26469
2006-03-02 06:50:58 +00:00
Chris Lattner c5b6c9a12a Fix a regression in a patch from a couple of days ago. This fixes
Transforms/InstCombine/2006-02-28-Crash.ll

llvm-svn: 26427
2006-02-28 19:47:20 +00:00
Chris Lattner b70f141893 Implement rem.ll:test[7-9] and PR712
llvm-svn: 26415
2006-02-28 05:49:21 +00:00
Chris Lattner 2a7c7b8bab Simplify some code now that the RHS of a rem can't be 0
llvm-svn: 26413
2006-02-28 05:40:55 +00:00
Chris Lattner 0de4a8d7b7 Rearrange some code, fold "rem X, 0", implementing rem.ll:test6
llvm-svn: 26411
2006-02-28 05:30:45 +00:00
Chris Lattner c7bfed0f7b Merge two almost-identical pieces of code.
Make this code more powerful by using ComputeMaskedBits instead of looking
for an AND operand.  This lets us fold this:

int %test23(int %a) {
        %tmp.1 = and int %a, 1
        %tmp.2 = seteq int %tmp.1, 0
        %tmp.3 = cast bool %tmp.2 to int  ;; xor tmp1, 1
        ret int %tmp.3
}

into: xor (and a, 1), 1
llvm-svn: 26396
2006-02-27 02:38:23 +00:00
Chris Lattner f5c8a0b83f Fold (A^B) == A -> B == 0
and  (A-B) == A  ->  B == 0

llvm-svn: 26394
2006-02-27 01:44:11 +00:00
Chris Lattner f78df7c14d Fold (X|C1)^C2 -> X^(C1|C2) when possible. This implements
InstCombine/or.ll:test23.

llvm-svn: 26385
2006-02-26 19:57:54 +00:00
Chris Lattner b580d26e7d Fix a problem that Nate noticed that boils down to an over conservative check
in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))".
We now compile this loop:

LBB1_1: ; no_exit
        add r6, r2, r3
        subf r3, r2, r3
        cmpwi cr0, r2, 0
        addi r7, r5, 4
        lwz r2, 0(r5)
        addi r4, r4, 1
        blt cr0, LBB1_4 ; no_exit
LBB1_3: ; no_exit
        mr r3, r6
LBB1_4: ; no_exit
        cmpwi cr0, r4, 16
        mr r5, r7
        bne cr0, LBB1_1 ; no_exit

into this instead:

LBB1_1: ; no_exit
        srawi r6, r2, 31
        add r2, r2, r6
        xor r6, r2, r6
        addi r7, r5, 4
        lwz r2, 0(r5)
        addi r4, r4, 1
        add r3, r3, r6
        cmpwi cr0, r4, 16
        mr r5, r7
        bne cr0, LBB1_1 ; no_exit

llvm-svn: 26356
2006-02-24 18:05:58 +00:00
Chris Lattner e5521db5bc Fix Regression/Transforms/LoopUnswitch/2006-02-22-UnswitchCrash.ll, which
caused SPASS to fail building last night.

We can't trivially unswitch a loop if the exit block has phi nodes in it,
because we don't know which predecessor to use.

llvm-svn: 26320
2006-02-22 23:55:00 +00:00
Chris Lattner 8a5a324dac Add some comments, simplify some code, and fix a bug that caused rewriting
to rewrite with the wrong value.

llvm-svn: 26311
2006-02-22 06:37:14 +00:00
Chris Lattner c2e3a7a4ce improved support for branch folding, still not enabled.
llvm-svn: 26289
2006-02-18 07:57:38 +00:00
Jeff Cohen 0add83e969 Fix bugs identified by VC++.
llvm-svn: 26287
2006-02-18 03:20:33 +00:00
Chris Lattner 19fa8ac938 Implement deletion of dead blocks, currently disabled.
llvm-svn: 26285
2006-02-18 02:42:34 +00:00
Chris Lattner cb853de534 a previous patch completely disabled trivial unswitching, this fixees it.
Thanks to nate for pointing this out :)

llvm-svn: 26280
2006-02-18 01:32:04 +00:00
Chris Lattner 29f771ba21 initial trivial support for folding branches that have now-constant destinations.
llvm-svn: 26279
2006-02-18 01:27:45 +00:00
Chris Lattner 8e44ff50b0 When unswitching a loop, make sure to update loop info with exit blocks in
the right loop.

llvm-svn: 26277
2006-02-18 00:55:32 +00:00
Chris Lattner d95665188b Fix Transforms/SimplifyCFG/2006-02-17-InfiniteUnroll.ll
llvm-svn: 26275
2006-02-18 00:33:17 +00:00
Chris Lattner baddba41c7 Fix loops where the header has an exit, fixing a loop-unswitch crash on crafty
llvm-svn: 26258
2006-02-17 06:39:56 +00:00
Chris Lattner 6fd136239b start of some new simplification code, not thoroughly tested, use at your own
risk :)

llvm-svn: 26248
2006-02-17 00:31:07 +00:00
Nate Begeman 8a77efe4f7 Rework the SelectionDAG-based implementations of SimplifyDemandedBits
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.

llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Chris Lattner fa335f6083 Change SplitBlock to increment a BasicBlock::iterator, not an Instruction*. Apparently they do different things :)
This fixes a testcase that nate reduced from spass.

Also included are a couple minor code changes that don't affect the generated
code at all.

llvm-svn: 26235
2006-02-16 19:36:22 +00:00
Jeff Cohen 55f63f1b53 Fix VC++ warning.
llvm-svn: 26228
2006-02-16 04:07:37 +00:00
Chris Lattner ff42e81028 fix a bug where we unswitched the wrong way
llvm-svn: 26225
2006-02-16 01:24:41 +00:00
Chris Lattner fdff0bb43e Implement trivial unswitching for switch stmts. This allows us to trivial
unswitch this loop on 2 before sweating to unswitch on 1/3.

void test4(int N, int i, int C, int*P, int*Q) {
  int j;
  for (j = 0; j < N; ++j) {
    switch (C) {                // general unswitching.
    default: P[i+j] = 0; break;
    case 1: Q[i+j] = 0; break;
    case 3: P[i+j] = Q[i+j]; break;
    case 2: break;              //  TRIVIAL UNSWITCH on C==2
    }
  }
}

llvm-svn: 26223
2006-02-15 22:52:05 +00:00
Chris Lattner e5cb76d744 make "trivial" unswitching significantly more general. It can now handle
this for example:

  for (j = 0; j < N; ++j) {     // trivial unswitch
    if (C)
      P[i+j] = 0;
  }

turning it into the obvious code without bothering to duplicate an empty loop.

llvm-svn: 26220
2006-02-15 22:03:36 +00:00
Andrew Lenharth 47da60130a fix a bunch of alpha regressions. see bug 709
llvm-svn: 26218
2006-02-15 21:13:37 +00:00
Chris Lattner 65152d80ec Checking the wrong value. This caused us to emit silly code like
Y = seteq bool X, true
instead of just using X :)

llvm-svn: 26215
2006-02-15 19:05:52 +00:00
Chris Lattner 01db04efb0 more refactoring, no functionality change.
llvm-svn: 26194
2006-02-15 01:44:42 +00:00
Chris Lattner b0cbe7106e pull some code out into a function
llvm-svn: 26191
2006-02-15 00:07:43 +00:00
Chris Lattner 9c5693fb2a Canonicalize inner loops before outer loops. Inner loop canonicalization
can provide work for the outer loop to canonicalize.

This fixes a case that breaks unswitching.

llvm-svn: 26189
2006-02-14 23:06:02 +00:00
Chris Lattner cffbbee8d1 When splitting exit edges to canonicalize loops, make sure to put the new
block in the appropriate loop nest.

Third time is the charm, right?

llvm-svn: 26187
2006-02-14 22:34:08 +00:00
Chris Lattner 0b8ec1a132 Use statistics to keep track of what flavors of loops we are unswitching
llvm-svn: 26157
2006-02-14 01:01:41 +00:00
Chris Lattner 8b10ab3002 Implement Instcombine/and.ll:test34
llvm-svn: 26155
2006-02-13 23:07:23 +00:00
Chris Lattner 7d8522884b If any of the sign extended bits are demanded, the input sign bit is demanded
for a sign extension.

This fixes InstCombine/2006-02-13-DemandedMiscompile.ll and Ptrdist/bc.

llvm-svn: 26152
2006-02-13 22:41:07 +00:00
Chris Lattner 68e7475777 Be careful not to request or look at bits shifted in from outside the size
of the input.  This fixes the mediabench/gsm/toast failure last night.

llvm-svn: 26138
2006-02-13 06:09:08 +00:00
Chris Lattner f5b4ef7f58 remove some more dead special case code
llvm-svn: 26135
2006-02-12 08:07:37 +00:00
Chris Lattner 5b2edb1fca Eliminate special case hacks that are superceded by general purpose hacks
llvm-svn: 26134
2006-02-12 08:02:11 +00:00
Chris Lattner ee0f280743 Three changes:
1. Teach GetConstantInType to handle boolean constants.
2. Teach instcombine to fold (compare X, CST) when X has known 0/1 bits.
   Testcase here: set.ll:test22
3. Improve the "(X >> c1) & C2 == 0" folding code to allow a noop cast
   between the shift and and.  More aggressive bitfolding for other reasons
   was turning signed shr's into unsigned shr's, leaving the noop cast in
   the way.

llvm-svn: 26131
2006-02-12 02:07:56 +00:00
Chris Lattner 02f53ad3a2 Revert my last patch. It too breaks stuff
llvm-svn: 26128
2006-02-12 01:59:10 +00:00
Chris Lattner 35248e06bc Fix for my previously reverted patch
llvm-svn: 26126
2006-02-11 21:24:54 +00:00
Chris Lattner 0157e7f55b Port the recent innovations in ComputeMaskedBits to SimplifyDemandedBits.
This allows us to simplify on conditions where bits are not known, but they
are not demanded either!  This also fixes a couple of bugs in
ComputeMaskedBits that were exposed during this work.

In the future, swaths of instcombine should be removed, as this code
subsumes a bunch of ad-hockery.

llvm-svn: 26122
2006-02-11 09:31:47 +00:00
Chris Lattner b24ce3a2a8 revert my previous change, it exposed other problems.
llvm-svn: 26121
2006-02-11 08:47:47 +00:00
Chris Lattner 05bf90dddf Make this check stricter. Disallow loop exit blocks from being shared by
loops and their subloops.

llvm-svn: 26118
2006-02-11 02:13:17 +00:00
Chris Lattner a6ae101afa remove dead expr
llvm-svn: 26116
2006-02-11 01:43:37 +00:00
Chris Lattner fbadd7e1ee implement unswitching of loops with switch stmts and selects in them
llvm-svn: 26114
2006-02-11 00:43:37 +00:00
Chris Lattner f1b151684d Update PHI nodes in successors of exit blocks.
llvm-svn: 26113
2006-02-10 23:26:14 +00:00
Chris Lattner fe4151efe7 Reform the unswitching code in terms of edge splitting, not block splitting.
llvm-svn: 26112
2006-02-10 23:16:39 +00:00
Chris Lattner ec6b40a093 Fix a case where UnswitchTrivialCondition broke critical edges with
phi's in the successors

llvm-svn: 26108
2006-02-10 19:08:15 +00:00
Chris Lattner 6e263155a6 add some notes, move some code around. Implement unswitching of loops
with branches on partially invariant computations.

llvm-svn: 26104
2006-02-10 02:30:37 +00:00
Chris Lattner 4935417a84 Move code around to be more logical, no functionality change.
llvm-svn: 26103
2006-02-10 02:01:22 +00:00
Chris Lattner 3fc3148b85 When unswitching a trivial loop, do admit we are doing it! :)
llvm-svn: 26102
2006-02-10 01:36:35 +00:00
Chris Lattner ed7a67b0de Implement unconditional unswitching of 'trivial' loops, those loops that contain
branches in their entry block that control whether or not the loop is a noop or not.

llvm-svn: 26101
2006-02-10 01:24:09 +00:00
Chris Lattner 4f0e66df6a Simplify control flow a bit, note that unswitch preserves canonical loop form
llvm-svn: 26098
2006-02-09 22:15:42 +00:00
Chris Lattner 8976219850 Make the threshold a parameter
llvm-svn: 26093
2006-02-09 20:15:48 +00:00
Chris Lattner 2826e0511b Simplify the loop-unswitch pass, by not even trying to unswitch loops with
uses of loop values outside the loop.  We need loop-closed SSA form to do
this right, or to use SSA rewriting if we really care.

llvm-svn: 26089
2006-02-09 19:14:52 +00:00
Chris Lattner 24cd2fa269 Fix 80-column violations
llvm-svn: 26088
2006-02-09 07:41:14 +00:00
Chris Lattner 4534dd59a3 Enhance MVIZ in three ways:
1. Teach it new tricks: in particular how to propagate through signed shr and sexts.
2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero.
3. Teach instcombine (AND X, C) to fold when we know all C bits of X.

This implements Regression/Transforms/InstCombine/bittest.ll, and allows
future things to be simplified.

llvm-svn: 26087
2006-02-09 07:38:58 +00:00
Chris Lattner ab2dc4d70d Simplify some code, reducing calls to MaskedValueIsZero. Implement a minor
optimization where we reduce the number of bits in AND masks when possible.

llvm-svn: 26056
2006-02-08 07:34:50 +00:00
Chris Lattner 5997cf9381 Use EraseInstFromFunction in a few cases to put the uses of the removed
instruction onto the worklist (in case they are now dead).

Add a really trivial local DSE implementation to help out bitfield code.
We now fold this:

struct S {
    unsigned char a : 1, b : 1, c : 1, d : 2, e : 3;
    S();
};

S::S() : a(0), b(0), c(1), d(0), e(6) {}

to this:

void %_ZN1SC1Ev(%struct.S* %this) {
entry:
        %tmp.1 = getelementptr %struct.S* %this, int 0, uint 0
        store ubyte 38, ubyte* %tmp.1
        ret void
}

much earlier (in gccas instead of only in gccld after DSE runs).

llvm-svn: 26050
2006-02-08 03:25:32 +00:00
Chris Lattner 06a0ed1ee0 Implement some more interesting select sccp cases. This implements:
test/Regression/Transforms/SCCP/select.ll

llvm-svn: 26049
2006-02-08 02:38:11 +00:00
Chris Lattner ddba3289b5 Fix a problem in my patch yesterday, causing a miscompilation of 176.gcc
llvm-svn: 26045
2006-02-08 01:20:23 +00:00
Chris Lattner 44314827d6 Fix Transforms/InstCombine/2006-02-07-SextZextCrash.ll
llvm-svn: 26040
2006-02-07 19:07:40 +00:00
Chris Lattner 92a6865321 Generalize MaskedValueIsZero into a ComputeMaskedNonZeroBits function, which
is just as efficient as MVIZ and is also more general.

Fix a few minor bugs introduced in recent patches

llvm-svn: 26036
2006-02-07 08:05:22 +00:00
Chris Lattner c3ebf40031 Make MaskedValueIsZero take a uint64_t instead of a ConstantIntegral as a
mask.  This allows the code to be simpler and more efficient.

Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive.

llvm-svn: 26035
2006-02-07 07:27:52 +00:00
Chris Lattner 77defbae0a Use Type::getIntegralTypeMask() to simplify some code
llvm-svn: 26034
2006-02-07 07:00:41 +00:00
Chris Lattner 2590e511d8 Implement the beginnings of a facility for simplifying expressions based on
'demanded bits', inspired by Nate's work in the dag combiner.  This isn't
complete, but needs to unrelated instcombiner changes to continue.

llvm-svn: 26033
2006-02-07 06:56:34 +00:00
Chris Lattner 2e90b732fa Turn A % (C << N), where C is 2^k, into A & ((C << N)-1) [urem only].
Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only].

Tested with: rem.ll:test5, div.ll:test10

llvm-svn: 26003
2006-02-05 07:54:04 +00:00
Chris Lattner d30c4991a1 Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces
#LLVM LOC, and auto-cse's cast instructions.

llvm-svn: 25974
2006-02-04 09:52:43 +00:00
Chris Lattner 2959f0003e Fix two significant bugs in LSR:
1. When rewriting code in outer loops, sometimes we would insert code into
   inner loops that is invariant in that loop.
2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions.

This is a performance neutral change.

llvm-svn: 25964
2006-02-04 07:36:50 +00:00
Jeff Cohen 15a8c15a1f Improve compatibility with VC2005, patch by Morten Ofstad!
llvm-svn: 25661
2006-01-26 20:41:32 +00:00
Chris Lattner 120f31b1fd teach the cloner to handle inline asms
llvm-svn: 25633
2006-01-26 01:55:22 +00:00
Chris Lattner c0f633a598 Fix Regression/Transforms/ScalarRepl/2006-01-24-IllegalUnionPromoteCrash.ll
llvm-svn: 25587
2006-01-24 19:36:27 +00:00
Chris Lattner 00fcdfef0d rename method
llvm-svn: 25572
2006-01-24 04:16:34 +00:00
Chris Lattner 37992b34c2 When cloning a module, clone the inline asm.
llvm-svn: 25559
2006-01-23 23:06:28 +00:00
Chris Lattner 5774040c09 add a bunch more optimizations for unary double math functions
llvm-svn: 25530
2006-01-23 06:24:46 +00:00
Chris Lattner 57a2863cbb Refactor/genericize this, no functionality change
llvm-svn: 25525
2006-01-23 05:57:36 +00:00
Chris Lattner c597b8a55e Make iostream #inclusion explicit
llvm-svn: 25514
2006-01-22 23:32:06 +00:00
Chris Lattner 33081b4648 Make this more efficient in the following ways:
1. Do not statically construct a map when the program starts up, this
   is expensive and cannot be optimized.  Instead, create a list.
2. Do not insert entries for all function in the module into a hashmap
   that lives the full life of the compiler.

llvm-svn: 25512
2006-01-22 23:10:26 +00:00
Chris Lattner 469640e506 Add explicit #includes of <iostream>
llvm-svn: 25509
2006-01-22 22:53:01 +00:00
Chris Lattner 0d4ebfc15b Several non-functionality changing changes:
1. Use the varargs version of getOrInsertFunction to simplify code.
2. remove #include
3. Reduce the number of #ifdef's.
4. remove extraneous vertical whitespace.

llvm-svn: 25508
2006-01-22 22:35:08 +00:00
Robert Bocchino 027c18da98 ConstantFoldLoadThroughGEPConstantExpr wasn't handling pointers to
packed types correctly.

llvm-svn: 25470
2006-01-19 23:53:23 +00:00
Reid Spencer ade182125f For PR696:
Don't do floor->floorf conversion if floorf is not available. This checks
the compiler's host, not its target, which is incorrect for cross-compilers
Not sure that's important as we don't build many cross-compilers.

llvm-svn: 25456
2006-01-19 08:36:56 +00:00
Chris Lattner e154abf9b3 Implement casts.ll:test26: a cast from float -> double -> integer, doesn't
need the float->double part.

llvm-svn: 25452
2006-01-19 07:40:22 +00:00
Chris Lattner 7be2203c9f If not internalizing, don't mark llvm.global[cd]tors const, as a fix for a
hypothetical future boog.

llvm-svn: 25430
2006-01-19 00:46:54 +00:00
Chris Lattner d693b7943a Don't internalize llvm.global[cd]tor unless there are uses of it. This
unbreaks front-ends that don't use __main (like the new CFE).

llvm-svn: 25429
2006-01-19 00:40:39 +00:00
Chris Lattner b98282d2d6 Make sure that cloning a module clones its target triple and dependent
library list as well.  This should help bugpoint.

llvm-svn: 25424
2006-01-18 21:32:45 +00:00
Robert Bocchino e6336a9b69 Constant folding support for the insertelement operation.
llvm-svn: 25407
2006-01-17 20:07:07 +00:00
Robert Bocchino 6dce25019d Lowerpacked and SCCP support for the insertelement operation.
llvm-svn: 25406
2006-01-17 20:06:55 +00:00
Chris Lattner 801f47512d Clean up the FFS optimization code, and make it correctly create the appropriate
unsigned llvm.cttz.* intrinsic, fixing the 2005-05-11-Popcount-ffs-fls regression
last night.

llvm-svn: 25398
2006-01-17 18:27:17 +00:00
Reid Spencer b4f9a6f110 For PR411:
This patch is an incremental step towards supporting a flat symbol table.
It de-overloads the intrinsic functions by providing type-specific intrinsics
and arranging for automatically upgrading from the old overloaded name to
the new non-overloaded name. Specifically:
  llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64
  llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64
  llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64
  llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64
  llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64
New code should not use the overloaded intrinsic names. Warnings will be
emitted if they are used.

llvm-svn: 25366
2006-01-16 21:12:35 +00:00
Chris Lattner 307b7ea15f fix a crash due to missing parens
llvm-svn: 25363
2006-01-16 19:47:21 +00:00
Chris Lattner 0de2c7d3d8 This pass has never worked correctly. Remove.
llvm-svn: 25349
2006-01-16 01:06:00 +00:00
Chris Lattner f6d6823f09 Let the inliner update the callgraph to reflect the changes it makes, instead
of doing it ourselves.  This fixes Transforms/Inline/2006-01-14-CallGraphUpdate.ll

llvm-svn: 25321
2006-01-14 20:09:18 +00:00
Chris Lattner 0841fb1d4c Teach the inliner to update the CallGraph itself, and have it add edges to
llvm.stacksave/restore when it inserts calls to them.

llvm-svn: 25320
2006-01-14 20:07:50 +00:00
Chris Lattner ef530c24c1 FunctionPass's cannot do IPO things.
llvm-svn: 25315
2006-01-14 19:30:35 +00:00
Nate Begeman 82049eba2c Add bswap intrinsics as documented in the Language Reference
llvm-svn: 25309
2006-01-14 01:25:24 +00:00
Robert Bocchino a83529678e Added instcombine support for extractelement.
llvm-svn: 25299
2006-01-13 22:48:06 +00:00
Chris Lattner 5fba6e6696 it is ok to dce stacksave.
llvm-svn: 25295
2006-01-13 21:31:54 +00:00
Chris Lattner 503221f5c5 Do a simple instcombine xforms to delete llvm.stackrestore cases.
llvm-svn: 25294
2006-01-13 21:28:09 +00:00
Chris Lattner c66b223b28 Simplify this a tiny bit by using the new IntrinsicInst functionality.
llvm-svn: 25292
2006-01-13 20:11:04 +00:00
Chris Lattner 45406c0c53 Permit inlining functions that contain dynamic allocations now that
InlineFunction handles this case safely.  This implements
Transforms/Inline/dynamic_alloca_test.ll.

llvm-svn: 25288
2006-01-13 19:35:43 +00:00
Chris Lattner 2be0607a8d If inlining a call to a function that contains dynamic allocas, wrap the
resultant code with llvm.stacksave/llvm.stackrestore intrinsics.

llvm-svn: 25286
2006-01-13 19:34:14 +00:00
Chris Lattner e24f79a032 Use ClonedCodeInfo to avoid another walk over the inlined code, this this
time in common C cases.

llvm-svn: 25285
2006-01-13 19:18:11 +00:00
Chris Lattner 19e6a08d78 Use the ClonedCodeInfo object to avoid scans of the inlined code when
it doesn't contain any calls.  This is a fairly common case for C++ code,
so it will probably speed up the inliner marginally in these cases.

llvm-svn: 25284
2006-01-13 19:15:15 +00:00
Chris Lattner 908d79556d Refactor a bunch of invoke handling stuff out into a new function
"HandleInlinedInvoke".  No functionality change.

llvm-svn: 25283
2006-01-13 19:05:59 +00:00
Chris Lattner edad1288fd Allow the code cloning interfaces to capture some important info about the
code being cloned if the client wants.

llvm-svn: 25281
2006-01-13 18:39:17 +00:00
Chris Lattner 257492c0ab Fix a bug I noticed by inspection: if the first instruction in the inlined
function was not an alloca, we wouldn't check the entry block for any allocas,
leading to increased stack space in some cases.  In practice, allocas are almost
always at the top of the block, so this was never noticed.

llvm-svn: 25280
2006-01-13 18:16:48 +00:00
Chris Lattner 49c4d536bd Fix 80 column violations
llvm-svn: 25279
2006-01-13 18:06:56 +00:00
Chris Lattner 0770d8e326 Preserve and update ETForest. Patch by Daniel Berlin
llvm-svn: 25203
2006-01-11 05:11:13 +00:00
Chris Lattner cb36710ff9 Switch these to using ETForest instead of DominatorSet to compute itself.
Patch written by Daniel Berlin!

llvm-svn: 25202
2006-01-11 05:10:20 +00:00
Chris Lattner 48e4a2ebd8 Switch this to using ETForest instead of DominatorSet to compute itself.
Patch written by Daniel Berlin!

llvm-svn: 25201
2006-01-11 05:09:40 +00:00
Robert Bocchino 230044839d Added support for the extractelement operation.
llvm-svn: 25181
2006-01-10 19:05:34 +00:00
Robert Bocchino bd518d153b Added lower packed support for the extractelement operation.
llvm-svn: 25180
2006-01-10 19:05:05 +00:00
Chris Lattner cda4aa6eb4 Teach loopsimplify to update et-forest. Patch contributed by Daniel Berlin!
llvm-svn: 25153
2006-01-09 08:03:08 +00:00
Chris Lattner 9cbfbc21bb fix some 176.gcc miscompilation from my previous patch.
llvm-svn: 25137
2006-01-07 01:32:28 +00:00
Chris Lattner 330628a6d8 silence some bogus gcc warnings on fenris
llvm-svn: 25130
2006-01-06 17:59:59 +00:00
Chris Lattner eb372a0276 Enhance the shift-shift folding code to allow a no-op cast to occur in between
the shifts.

This allows us to fold this (which is the 'integer add a constant' sequence
from cozmic's scheme compmiler):

int %x(uint %anf-temporary776) {
        %anf-temporary777 = shr uint %anf-temporary776, ubyte 1
        %anf-temporary800 = cast uint %anf-temporary777 to int
        %anf-temporary804 = shl int %anf-temporary800, ubyte 1
        %anf-temporary805 = add int %anf-temporary804, -2
        %anf-temporary806 = or int %anf-temporary805, 1
        ret int %anf-temporary806
}

into this:

int %x(uint %anf-temporary776) {
        %anf-temporary776 = cast uint %anf-temporary776 to int
        %anf-temporary776.mask1 = add int %anf-temporary776, -2
        %anf-temporary805 = or int %anf-temporary776.mask1, 1
        ret int %anf-temporary805
}

note that instcombine already knew how to eliminate the AND that the two
shifts fold into.  This is tested by InstCombine/shift.ll:test26

-Chris

llvm-svn: 25128
2006-01-06 07:52:12 +00:00
Chris Lattner b330939d90 Simplify the code a bit more
llvm-svn: 25126
2006-01-06 07:22:22 +00:00
Chris Lattner 145539343f Extract a bunch of code out of visitShiftInst into FoldShiftByConstant. No
functionality changes.

llvm-svn: 25125
2006-01-06 07:12:35 +00:00
Chris Lattner 8cdc773748 Pull inline methods out of the pass class definition to make it easier to
read the code.

Do not internalize debugger anchors.

llvm-svn: 25067
2006-01-03 19:13:17 +00:00
Duraid Madina 7a3ad6cae2 getting there...
llvm-svn: 25021
2005-12-26 13:48:44 +00:00
Chris Lattner 8c9e14620f Fix Transforms/ScalarRepl/2005-12-14-UnionPromoteCrash.ll, a crash on undefined
behavior in 126.gcc on big-endian systems.

llvm-svn: 24708
2005-12-14 17:23:59 +00:00
Reid Spencer 175613adf6 Improve ResolveFunctions to:
a) use better local variable names (OldMT -> OldFT) where "M" is used to
   mean "Function" (perhaps it was previously "Method"?)
b) print out the module identifier in a warning message so that it is
   possible to track down in which module the error occurred.

llvm-svn: 24698
2005-12-13 19:56:51 +00:00
Chris Lattner 3b0a62d8a5 Implement a little hack for parity with GCC on crafty. This speeds up
186.crafty by about 16% (from 15.109s to 13.045s) on my system.

This turns allocas with unions/casts into scalars.  For example crafty has
something like this:

    union doub {
      unsigned short i[4];
      long long d;
    };
int f(long long a) {
  return ((union doub){.d=a}).i[1];
}

Instead of generating loads and stores to an alloca, we now promote the
whole thing to a scalar long value.

This implements: Transforms/ScalarRepl/AggregatePromote.ll

llvm-svn: 24667
2005-12-12 07:19:13 +00:00
Chris Lattner 077200737c getRawValue zero extens for unsigned values, use getsextvalue so that we
know that small negative values fit into the immediate field of addressing
modes.

llvm-svn: 24608
2005-12-05 18:23:57 +00:00
Chris Lattner 165998207e Wrap a long line, never internalize llvm.used.
llvm-svn: 24602
2005-12-05 05:07:38 +00:00
Chris Lattner 2820b8c855 Fix SimplifyCFG/2005-12-03-IncorrectPHIFold.ll
llvm-svn: 24581
2005-12-03 18:25:58 +00:00
Chris Lattner dc4ffef633 Fix a bug where we didn't realize that vaarg reads memory. This fixes
Transforms/DeadStoreElimination/2005-11-30-vaarg.ll

llvm-svn: 24545
2005-11-30 19:38:22 +00:00
Andrew Lenharth d251192910 a few more comments on the interfaces and functions
llvm-svn: 24500
2005-11-28 18:10:59 +00:00
Andrew Lenharth 517caef495 Added documented rsprofiler interface. Also remove new profiler passes, the
old ones have been updated to implement the interface.

llvm-svn: 24499
2005-11-28 18:00:38 +00:00
Jeff Cohen 7ff44ec372 Fix VC++ warning.
llvm-svn: 24496
2005-11-28 06:45:57 +00:00
Andrew Lenharth 93e59f6032 Random sampling (aka Arnold and Ryder) profiling. This is still preliminary, but it works on spec on x86 and alpha. The idea is to allow profiling passes to remember what profiling they inserted, then a random sampling framework is inserted which consists of duplicated basic blocks (without profiling), such that at each backedge in the program and entry into every function, the framework chooses whether to use the instrumented code or the instrumentation free code. The goal of such a framework is to make it reasonably cheap to do random sampling of very expensive profiling products (such as load-value profiling).
The code is organized into 3 parts (2 passes)
1) a linked set of profiling passes, which implement an analysis group (linked, like alias analysis are).  These insert profiling into the program, and remember what they inserted, so that at a later time they can be queried about any instruction.

2) a pass that handles inserting the random sampling framework.  This also has options to control how random samples are choosen.  Currently implemented are Global counters, register allocated global counters, and read cycle counter (see? there was a reason for it).

The profiling passes are almost identical to the existing ones (block, function, and null profiling is supported right now), and they are valid passes without the sampling framework (hence the existing passes can be unified with the new ones, not done yet).

Some things are a bit ugly still, but that should be fixed up soon enough.

Other todo? making the counter values not "magic 2^16 -1" values, but dynamically choosable.

llvm-svn: 24493
2005-11-28 00:58:09 +00:00
Andrew Lenharth 5fc3794e71 since reg2mem requires it, might as well mention that it preserves it
llvm-svn: 24491
2005-11-25 16:04:54 +00:00
Andrew Lenharth 061029dee2 Reg2Mem is something a pass may depend on, so allow that
llvm-svn: 24488
2005-11-22 22:14:23 +00:00
Andrew Lenharth 71b09bbb07 turns out, demotion and invokes and critical edges don't mix
llvm-svn: 24487
2005-11-22 21:45:19 +00:00
Chris Lattner 9c37f23645 Fix a crash building 176.gcc due to my recent patch, which only fixed
half the problem.

llvm-svn: 24414
2005-11-18 18:30:47 +00:00
Chris Lattner 3e9e8bd25c Implement a refinement to the mem2reg algorithm for cases where an alloca
has a single def.  In this case, look for uses that are dominated by the def
and attempt to rewrite them to directly use the stored value.

This speeds up mem2reg on these values and reduces the number of phi nodes
inserted.  This should address PR665.

llvm-svn: 24411
2005-11-18 07:31:42 +00:00
Chris Lattner 31dc3827d3 This needs proper dominance
llvm-svn: 24410
2005-11-18 07:29:44 +00:00
Chris Lattner bca0be812d This was checking the wrong GEP expression. Fixing this fixes a gccas crash
compiling mysql reported by Ted Kremenek.

llvm-svn: 24402
2005-11-17 19:35:42 +00:00
Andrew Lenharth d9c13b1336 the pain isn't gone unless the phinodes are spilled too
llvm-svn: 24288
2005-11-10 19:39:09 +00:00
Andrew Lenharth 8e66c0c8a9 this works with backedges to the existing entry block alot better
llvm-svn: 24270
2005-11-10 17:35:34 +00:00
Andrew Lenharth 4130a4f061 The pass everyone has been waiting for!
Reg2Mem

for fun you can opt -reg2mem -mem2reg

llvm-svn: 24267
2005-11-10 01:58:38 +00:00
Nate Begeman 848622f87f Add support alignment of allocation instructions.
Add support for specifying alignment and size of setjmp jmpbufs.

No targets currently do anything with this information, nor is it presrved
in the bytecode representation.  That's coming up next.

llvm-svn: 24196
2005-11-05 09:21:28 +00:00
Chris Lattner 16b29e9562 Implement Transforms/TailCallElim/return-undef.ll, a trivial case
that has been sitting in my inbox since May 18. :)

llvm-svn: 24194
2005-11-05 08:21:11 +00:00
Chris Lattner dd0c174082 Turn sdiv into udiv if both operands have a clear sign bit. This occurs
a few times in crafty:

OLD:    %tmp.36 = div int %tmp.35, 8            ; <int> [#uses=1]
NEW:    %tmp.36 = div uint %tmp.35, 8           ; <uint> [#uses=0]
OLD:    %tmp.19 = div int %tmp.18, 8            ; <int> [#uses=1]
NEW:    %tmp.19 = div uint %tmp.18, 8           ; <uint> [#uses=0]
OLD:    %tmp.117 = div int %tmp.116, 8          ; <int> [#uses=1]
NEW:    %tmp.117 = div uint %tmp.116, 8         ; <uint> [#uses=0]
OLD:    %tmp.92 = div int %tmp.91, 8            ; <int> [#uses=1]
NEW:    %tmp.92 = div uint %tmp.91, 8           ; <uint> [#uses=0]

Which all turn into shrs.

llvm-svn: 24190
2005-11-05 07:40:31 +00:00
Chris Lattner e9ff0eaf5b Turn srem -> urem when neither input has their sign bit set. This triggers
8 times in vortex, allowing the srems to be turned into shrs:

OLD:    %tmp.104 = rem int %tmp.5.i37, 16               ; <int> [#uses=1]
NEW:    %tmp.104 = rem uint %tmp.5.i37, 16              ; <uint> [#uses=0]
OLD:    %tmp.98 = rem int %tmp.5.i24, 16                ; <int> [#uses=1]
NEW:    %tmp.98 = rem uint %tmp.5.i24, 16               ; <uint> [#uses=0]
OLD:    %tmp.91 = rem int %tmp.5.i19, 8         ; <int> [#uses=1]
NEW:    %tmp.91 = rem uint %tmp.5.i19, 8                ; <uint> [#uses=0]
OLD:    %tmp.88 = rem int %tmp.5.i14, 8         ; <int> [#uses=1]
NEW:    %tmp.88 = rem uint %tmp.5.i14, 8                ; <uint> [#uses=0]
OLD:    %tmp.85 = rem int %tmp.5.i9, 1024               ; <int> [#uses=2]
NEW:    %tmp.85 = rem uint %tmp.5.i9, 1024              ; <uint> [#uses=0]
OLD:    %tmp.82 = rem int %tmp.5.i, 512         ; <int> [#uses=2]
NEW:    %tmp.82 = rem uint %tmp.5.i1, 512               ; <uint> [#uses=0]
OLD:    %tmp.48.i = rem int %tmp.5.i.i161, 4            ; <int> [#uses=1]
NEW:    %tmp.48.i = rem uint %tmp.5.i.i161, 4           ; <uint> [#uses=0]
OLD:    %tmp.20.i2 = rem int %tmp.5.i.i, 4              ; <int> [#uses=1]
NEW:    %tmp.20.i2 = rem uint %tmp.5.i.i, 4             ; <uint> [#uses=0]

it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61)
so the payoff isn't as great.

llvm-svn: 24189
2005-11-05 07:28:37 +00:00
Andrew Lenharth 662295587d make this 64 bit clean, fixed test30 of /Regression/Transforms/InstCombine/add.ll
llvm-svn: 24158
2005-11-02 18:35:40 +00:00
Chris Lattner 09efd4e5b6 Limit the search depth of MaskedValueIsZero to 6 instructions, to avoid
bad cases.  This fixes Markus's second testcase in PR639, and should
seal it for good.

llvm-svn: 24123
2005-10-31 18:35:52 +00:00
Chris Lattner 27d351f159 This pass is now obsolete since all targets have moved to the SelectionDAG
infrastructure and the simple isels have been removed.

llvm-svn: 24090
2005-10-29 05:33:46 +00:00
Chris Lattner 752717d4ec Remove dead #include
llvm-svn: 24083
2005-10-29 04:41:30 +00:00
Chris Lattner ceb9d5adaa Now that instcombine does this xform, remove it from the -raise pass
llvm-svn: 24082
2005-10-29 04:40:23 +00:00
Chris Lattner 8f663e8bbc Pull some code out into a function, give it the ability to see through +.
This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1)

llvm-svn: 24081
2005-10-29 04:36:15 +00:00
Chris Lattner 8270c33606 Remove a special case, allowing the general case to handle it. No functionality
change.

llvm-svn: 24076
2005-10-29 03:19:53 +00:00
Chris Lattner b9d3ca5c3c Fix a bit of backwards logic that broke exptree and smg2000
llvm-svn: 24056
2005-10-28 16:27:35 +00:00
Chris Lattner c4f67e67d2 Do not sink any instruction with side effects, including vaarg. This fixes
PR640

llvm-svn: 24046
2005-10-27 17:13:11 +00:00
Chris Lattner 479911f971 Fix #include order
llvm-svn: 24044
2005-10-27 16:34:00 +00:00
John Criswell fe5f33b120 Move some constant folding code shared by Analysis and Transform passes
into the LLVMAnalysis library.
This allows LLVMTranform and LLVMTransformUtils to be archives and linked
with LLVMAnalysis.a, which provides any missing definitions.

llvm-svn: 24036
2005-10-27 15:54:34 +00:00
Chris Lattner c6372cca78 Fix typo
llvm-svn: 24033
2005-10-27 06:26:26 +00:00
Chris Lattner 0fe7551bc0 Teach instcombine to promote stuff like (cast (malloc sbyte, 8*X) to int*)
into: malloc int, (2*X)

llvm-svn: 24032
2005-10-27 06:24:46 +00:00
Chris Lattner b3ecf96900 Promote cases like cast (malloc sbyte, 100) to int* into
(malloc [25 x int]) directly without having to convert to
(malloc [100 x sbyte]) first.

llvm-svn: 24031
2005-10-27 06:12:00 +00:00
Chris Lattner bb17180a23 Minor change to this file to support obscure cases with constant array amounts
llvm-svn: 24030
2005-10-27 05:53:56 +00:00
John Criswell 94b7bea733 1. Remove libraries no longer created from the list of libraries linked into the
SparcV9 JIT.
2. Make LLVMTransformUtils a relinked object file and always link it before
   LLVMAnalysis.a.  These two libraries have circular dependencies on each
   other which creates problem when building the SparcV9 JIT.  This change
   fixes the dependency on all platforms problems with a minimum of fuss.

llvm-svn: 24023
2005-10-26 20:35:13 +00:00
Chris Lattner 38a1b00a0f fold nested and's early to avoid inefficiencies in MaskedValueIsZero. This
fixes a very slow compile in PR639.

llvm-svn: 24011
2005-10-26 17:18:16 +00:00
Jeff Cohen 2b8cbf319c Update Visual Studio projects to reflect moved file.
llvm-svn: 23998
2005-10-26 05:36:51 +00:00
Alkis Evlogimenos cb67b650b5 Stop using deprecated types
llvm-svn: 23973
2005-10-25 11:18:06 +00:00
Chris Lattner 46705b2f2d Handle allocations that, even after removing dead uses, still have more than
one use (but one is a cast).  This handles the very common case of:

 X = alloc [n x byte]
 Y = cast X to somethingbetter
 seteq X, null

In order to avoid infinite looping when there are multiple casts, we only
allow this if the xform is strictly increasing the alignment of the
allocation.

llvm-svn: 23961
2005-10-24 06:35:18 +00:00
Chris Lattner 355ecc09f8 Fix a bug where we would 'promote' an allocation from one type to another
where the second has less alignment required.  If we had explicit alignment
support in the IR, we could handle this case, but we can't until we do.

llvm-svn: 23960
2005-10-24 06:26:18 +00:00
Chris Lattner ac87beb03a Before promoting a malloc type, remove dead uses. This makes instcombine
more effective at promoting these allocations, catching them earlier in the
compile process.

llvm-svn: 23959
2005-10-24 06:22:12 +00:00
Chris Lattner 216be91817 Pull some code out into a function, no functionality change
llvm-svn: 23958
2005-10-24 06:03:58 +00:00
Chris Lattner b37336978f Remove some beta code that no longer has an owner.
llvm-svn: 23944
2005-10-24 02:32:41 +00:00
Chris Lattner f9998d9704 Do not build the ProfilePaths directory anymore
llvm-svn: 23943
2005-10-24 02:31:49 +00:00
Chris Lattner bde3845548 DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now
llvm-svn: 23940
2005-10-24 02:26:13 +00:00
Chris Lattner 8c087e962c Only build .a file versions of these libraries, instead of .a and .o versions.
This should speed up build times.

llvm-svn: 23933
2005-10-24 01:59:48 +00:00
Chris Lattner bd77fac034 Make sure that anything using the ADCE pass pulls in the UnifyFunctionExitNodes
code

llvm-svn: 23931
2005-10-24 01:40:23 +00:00
Jeff Cohen 11e26b52b2 When a function takes a variable number of pointer arguments, with a zero
pointer marking the end of the list, the zero *must* be cast to the pointer
type.  An un-cast zero is a 32-bit int, and at least on x86_64, gcc will
not extend the zero to 64 bits, thus allowing the upper 32 bits to be
random junk.

The new END_WITH_NULL macro may be used to annotate a such a function
so that GCC (version 4 or newer) will detect the use of un-casted zero
at compile time.

llvm-svn: 23888
2005-10-23 04:37:20 +00:00
Chris Lattner 5df0e36e98 My previous patch was too conservative. Reject FP and void types, but do
allow pointer types.

llvm-svn: 23859
2005-10-21 05:45:41 +00:00
Chris Lattner 0c0b38bb4c Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an
inner loop like this:

LBB_RateConvertMono8AltiVec_2:  ; no_exit
        lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
        lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
        fmr f3, f3
        fadd f0, f2, f0
        fadd f3, f0, f3
        fcmpu cr0, f3, f1
        bge cr0, LBB_RateConvertMono8AltiVec_2  ; no_exit

to an inner loop like this:

LBB_RateConvertMono8AltiVec_1:  ; no_exit
        fsub f2, f2, f1
        fcmpu cr0, f2, f1
        fmr f0, f2
        bge cr0, LBB_RateConvertMono8AltiVec_1  ; no_exit

Doh! good catch!

llvm-svn: 23838
2005-10-20 04:47:10 +00:00
Chris Lattner 45517baf9f Add an option to this pass. If it is set, we are allowed to internalize
all but main.  If it's not set, we can still internalize, but only if an
explicit symbol list is provided.

llvm-svn: 23783
2005-10-18 06:29:22 +00:00
Chris Lattner da1b152c43 Make this work for FP constantexprs
llvm-svn: 23773
2005-10-17 20:18:38 +00:00
Chris Lattner 7fde91e365 Oops, X+0.0 isn't foldable, but X+-0.0 is.
llvm-svn: 23772
2005-10-17 17:56:38 +00:00
Chris Lattner 32979336a7 relax this a bit, as we only support the default rounding mode
llvm-svn: 23771
2005-10-17 17:49:32 +00:00
Chris Lattner 192cd18f53 Fix (hopefully the last) issue where LSR is nondeterminstic. When pulling
out CSE's of base expressions it could build a result whose order was
nondet.

llvm-svn: 23698
2005-10-11 18:41:04 +00:00
Chris Lattner 5c9d63da31 Fix another problem where LSR was being nondeterminstic. Also remove elements
from the end of a vector instead of the beginning

llvm-svn: 23697
2005-10-11 18:30:57 +00:00
Chris Lattner b7a3894e7c Fix another lsr-is-nondeterministic case
llvm-svn: 23695
2005-10-11 18:17:57 +00:00
Chris Lattner 03b9eb506c Make MaskedValueIsZero a bit more aggressive
llvm-svn: 23677
2005-10-09 22:08:50 +00:00
Chris Lattner 62010c450f Fix funky xcode indentation
llvm-svn: 23674
2005-10-09 06:36:35 +00:00
Chris Lattner eb4be8b942 Hrm, you didn't see this.
llvm-svn: 23673
2005-10-09 06:24:02 +00:00
Chris Lattner 4ea0a3eaac Fix a source of non-determinism in the backend: the order of processing
IV strides dependend on the pointer order of the strides in memory.
Non-determinism is bad.

llvm-svn: 23672
2005-10-09 06:20:55 +00:00
Jeff Cohen 572910c9a2 Remove useless variable.
llvm-svn: 23656
2005-10-07 05:28:29 +00:00
Chris Lattner 20b0754c41 Fix DemoteRegToStack on an invoke. This fixes PR634.
llvm-svn: 23618
2005-10-04 00:44:01 +00:00
Chris Lattner 4c3b2b536c Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive
and more correct than use_empty().  This fixes PR635 and
SimplifyCFG/2005-10-02-InvokeSimplify.ll

llvm-svn: 23616
2005-10-03 23:43:43 +00:00
Chris Lattner f07a587c79 Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In
particular, it should realize that phi's use their values in the pred block
not the phi block itself.  This change turns our em3d loop from this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_6    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
        addi r3, r2, 1
        blr
LBB_test_6:     ; loopexit
        or r3, r2, r2
        blr

into:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r6, r6
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        or r2, r6, r6
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r2, r2
        blr


Unfortunately, this is actually worse code, because the register coallescer
is getting confused somehow.  If it were doing its job right, it could turn the
code into this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r6, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r6, r6
        blr

... which I'll work on next. :)

llvm-svn: 23604
2005-10-03 02:50:05 +00:00
Chris Lattner e4ed42a426 Refactor some code into a function
llvm-svn: 23603
2005-10-03 01:04:44 +00:00
Chris Lattner 360928dbed This break is bogus and I have no idea why it was there. Basically it prevents
memoizing code when IV's are used by phinodes outside of loops.  In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):

        li r6, 0
        or r7, r6, r6
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r7, r7
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r2, r7, 1
        addi r7, r7, 1
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

Now we get:

        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

this was noticed in em3d.

llvm-svn: 23602
2005-10-03 00:37:33 +00:00
Chris Lattner 8fcce170cf when checking if we should move a split edge block outside of a loop,
check the presplit pred, not the post-split pred.  This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.

llvm-svn: 23601
2005-10-03 00:31:52 +00:00
Jeff Cohen f8a5e5ae6e Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner a554c9470b Insert stores after phi nodes in the normal dest. This fixes
LowerInvoke/2005-08-03-InvokeWithPHI.ll

llvm-svn: 23525
2005-09-29 17:44:20 +00:00
Chris Lattner 87ef943a4c Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%,
bringing the LLC time down to the CBE time.

llvm-svn: 23521
2005-09-29 06:17:27 +00:00
Chris Lattner 5f6035feb0 remove a bunch of unneeded stuff, or self evident comments
llvm-svn: 23519
2005-09-29 06:16:11 +00:00
Chris Lattner c244e7c178 Implement a couple of memcmp folds from the todo list
llvm-svn: 23517
2005-09-29 04:54:20 +00:00
Chris Lattner ea7214b23d Constant fold llvm.sqrt
llvm-svn: 23487
2005-09-28 01:34:32 +00:00
Chris Lattner 3b63bb375c add a note about a way to improve this code further, that I won't be getting
to right now.

llvm-svn: 23485
2005-09-27 22:44:59 +00:00
Chris Lattner eb953f0ef8 Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll
and PR632.

llvm-svn: 23484
2005-09-27 22:28:11 +00:00
Chris Lattner e285f5ed8f Avoid spilling stack slots... to stack slots.
llvm-svn: 23478
2005-09-27 21:33:12 +00:00
Chris Lattner 87eb249300 Completely rewrite 'correct' eh support. This changes how setjmp insertion
is performed so it is only at most once per function that contains an invoke
instead of once per invoke in the function.  This patch has the following perks:

1. It fixes PR631, which complains about slowness.
2. If fixes PR240, which complains about non-volatile vars being live across
   setjmp/longjmps.
3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not
   forcing the jmpbufs to always be 8-bytes off the alignment of the structure.
4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us
   now about 4% faster than GCC.

Further improvements are also possible.

llvm-svn: 23477
2005-09-27 21:18:17 +00:00
Chris Lattner 92233d2175 Make the pass name simpler
llvm-svn: 23476
2005-09-27 21:10:32 +00:00
Chris Lattner 16cd356fb2 allow demotion to volatile values, add support for invoke
llvm-svn: 23473
2005-09-27 19:39:00 +00:00
Chris Lattner 3d27e7f27f Add support for external calls that we know how to constant fold. This implements
ctor-list-opt.ll:CTOR8

llvm-svn: 23465
2005-09-27 05:02:43 +00:00
Chris Lattner 29b2780c8a Fix a bug where we would evaluate stores into linkonce objects which could be
potentially replaced at link-time.

llvm-svn: 23463
2005-09-27 04:50:03 +00:00
Chris Lattner 65a3a0918f Implement support for static constructors with calls in them. This is useful
because gccas runs globalopt before inlining.

This implements ctor-list-opt.ll:CTOR7

llvm-svn: 23462
2005-09-27 04:45:34 +00:00
Chris Lattner da1889b778 Refactor this code a bit, no functionality changes.
llvm-svn: 23460
2005-09-27 04:27:01 +00:00
Chris Lattner f2f89af69a Remove some dead code. ctor evaluation subsumes empty ctor elim
llvm-svn: 23453
2005-09-26 20:38:20 +00:00
Chris Lattner 6bf2cd5735 Add support for alloca, implementing ctor-list-opt.ll:CTOR6
llvm-svn: 23452
2005-09-26 17:07:09 +00:00
Chris Lattner 46d9ff081d Add a debug printout, fix a crash on kc++
llvm-svn: 23450
2005-09-26 07:34:35 +00:00
Chris Lattner 46af55e0e4 Implement loads/stores through GEP's of globals. This implements
ctor-list-opt.ll:CTOR5.

llvm-svn: 23449
2005-09-26 06:52:44 +00:00
Chris Lattner 61ff32cd70 Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr
llvm-svn: 23447
2005-09-26 05:34:07 +00:00
Chris Lattner 02ae21e1e0 Eliminate GetGEPGlobalInitializer in favor of the more powerful
ConstantFoldLoadThroughGEPConstantExpr function in the utils lib.

llvm-svn: 23446
2005-09-26 05:28:52 +00:00
Chris Lattner 0b011ec8e2 Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils
as ConstantFoldLoadThroughGEPConstantExpr.

llvm-svn: 23445
2005-09-26 05:28:06 +00:00
Chris Lattner c13c7b9376 Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine
pass.

llvm-svn: 23444
2005-09-26 05:27:10 +00:00
Chris Lattner b009663e27 add a comment
llvm-svn: 23442
2005-09-26 05:16:34 +00:00
Chris Lattner 4b05c322d5 Add support for getelementptr, load, and correctly reject volatile stores.
llvm-svn: 23441
2005-09-26 05:15:37 +00:00
Chris Lattner 3e9ea5ffec Add support for br/brcond/switch and phi
llvm-svn: 23439
2005-09-26 04:57:38 +00:00
Chris Lattner 99e23fa74c Add a simple interpreter to this code, allowing us to statically evaluate
global ctors that are simple enough.  This implements ctor-list-opt.ll:CTOR2.

llvm-svn: 23437
2005-09-26 04:44:35 +00:00
Chris Lattner 696beefabb factor some code into a InstallGlobalCtors method, add comments. No functionality change.
llvm-svn: 23435
2005-09-26 02:31:18 +00:00
Chris Lattner 838bdc1836 Make the global opt optimizer work on modules with a null terminator, by
accepting the null even with a non-65535 init prio

llvm-svn: 23434
2005-09-26 02:19:27 +00:00
Chris Lattner 41b6a5a693 Factor this code out into a few methods.
Implement the start of global ctor optimization.  It is currently smart
enough to remove the global ctor for cases like this:

struct foo {
  foo() {}
} x;

... saving a bit of startup time for the program.

llvm-svn: 23433
2005-09-26 01:43:45 +00:00
Chris Lattner f487768062 Fix some logic I broke that caused a regression on
SimplifyLibCalls/2005-05-20-sprintf-crash.ll

llvm-svn: 23430
2005-09-25 07:06:48 +00:00
Chris Lattner 0b3557f54a Move MaskedValueIsZero up.
Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll

llvm-svn: 23428
2005-09-24 23:43:33 +00:00
Chris Lattner 175463a165 Simplify this code a bit by relying on recursive simplification. Support
sprintf("%s", P)'s that have uses.

s/hasNUses(0)/use_empty()/

llvm-svn: 23425
2005-09-24 22:17:06 +00:00
Chris Lattner 499e33646e remove some debugging code
llvm-svn: 23411
2005-09-23 18:49:09 +00:00
Chris Lattner c59a371d45 Fold two consequtive branches that share a common destination between them.
This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy
code

llvm-svn: 23410
2005-09-23 18:47:20 +00:00
Chris Lattner 3a978bf66d simplify some logic further
llvm-svn: 23408
2005-09-23 07:23:18 +00:00