Commit Graph

5622 Commits

Author SHA1 Message Date
Owen Anderson bb754826c9 Use PredIteratorCache in LCSSA, which gives a 37% overall speedup on the testcase from PR3549. More improvements to come.
llvm-svn: 69788
2009-04-22 08:09:13 +00:00
Chris Lattner 58be2d4413 use predicate instead of hand-rolled loop
llvm-svn: 69752
2009-04-21 23:37:18 +00:00
Chris Lattner 69223bb7f5 fix a crash on a pointless but valid zero-length memset, rdar://6808691
llvm-svn: 69680
2009-04-21 16:52:12 +00:00
Dan Gohman 4860db61be Factor out a common base class from SCEVTruncateExpr, SCEVZeroExtendExpr,
and SCEVSignExtendExpr.

llvm-svn: 69649
2009-04-21 01:25:57 +00:00
Dan Gohman b397e1a7a2 Introduce encapsulation for ScalarEvolution's TargetData object, and refactor
the code to minimize dependencies on TargetData.

llvm-svn: 69644
2009-04-21 01:07:12 +00:00
Dale Johannesen 1238220473 Adjust loop size estimate for full unrolling;
GEP's don't usually become instructions.

llvm-svn: 69631
2009-04-20 22:19:33 +00:00
Sanjiv Gupta 428d490332 Before trying to introduce/eliminate cast/ext/trunc to make indices type as
pointer type, make sure that the pointer size is a valid sequential index type.

llvm-svn: 69574
2009-04-20 06:05:54 +00:00
Dan Gohman 056857aa21 Use more const qualifiers with SCEV interfaces.
llvm-svn: 69450
2009-04-18 17:56:28 +00:00
Jim Grosbach 8d62763779 remove trailing whitespace
llvm-svn: 69402
2009-04-17 23:30:55 +00:00
David Greene 22fa407ed7 Use a safer iterator interface and get rid of std C++ library misuse.
This fixes a --enable-expensive-checks problem.

llvm-svn: 69353
2009-04-17 14:56:18 +00:00
Dan Gohman d2d6fd806c Don't create ConstantInts with pointer type. This fixes a
regression in 403.gcc in PIC_CODEGEN=1 and DISABLE_LTO=1
mode.

llvm-svn: 69344
2009-04-17 02:02:52 +00:00
Dan Gohman fec1d086e0 Use TargetData::getTypeSizeInBits instead of getPrimitiveSizeInBits()
to get the correct answer for pointer types.

llvm-svn: 69321
2009-04-16 22:35:57 +00:00
Eli Friedman 929207fd1d Fix for PR3944: make mem2reg O(N) instead of O(N^2) in the number of
incoming edges for a block with many predecessors.

llvm-svn: 69312
2009-04-16 21:40:28 +00:00
Dan Gohman 8b6ebb1112 Minor code simplifications. Don't attempt LSR on theoretical
targets with pointers larger than 64 bits, due to the code not
yet being APInt clean.

llvm-svn: 69296
2009-04-16 16:49:48 +00:00
Dan Gohman e2ead2c328 LSR is no longer a GEP optimizer. It is now an IV expression
optimizer, which just happen to frequently involve optimizing GEPs.

llvm-svn: 69295
2009-04-16 16:46:01 +00:00
Dan Gohman a8be04b2db Use ConstantExpr::getIntToPtr instead of SCEVExpander::InsertCastOfTo,
since the operand is always a constant.

llvm-svn: 69291
2009-04-16 15:48:38 +00:00
Dan Gohman 71bccd3e0e Use a SCEV expression cast instead of immediately inserting a
new instruction with SCEVExpander::InsertCastOfTo.

llvm-svn: 69290
2009-04-16 15:47:35 +00:00
Dan Gohman 0a40ad93a9 Expand GEPs in ScalarEvolution expressions. SCEV expressions can now
have pointer types, though in contrast to C pointer types, SCEV
addition is never implicitly scaled. This not only eliminates the
need for special code like IndVars' EliminatePointerRecurrence
and LSR's own GEP expansion code, it also does a better job because
it lets the normal optimizations handle pointer expressions just
like integer expressions.

Also, since LLVM IR GEPs can't directly index into multi-dimensional
VLAs, moving the GEP analysis out of client code and into the SCEV
framework makes it easier for clients to handle multi-dimensional
VLAs the same way as other arrays.

Some existing regression tests show improved optimization.
test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to
the point where if-conversion started kicking in; I turned it off
for this test to preserve the intent of the test.

llvm-svn: 69258
2009-04-16 03:18:22 +00:00
Dale Johannesen a71daa83c6 Eliminate zext over (iv | const) or (signed iv),
and sext over (iv | const), if a longer iv is
available.  Allow expressions to have more than
one zext/sext parent.  All from OpenSSL.

llvm-svn: 69241
2009-04-15 23:31:51 +00:00
Dale Johannesen 82230b5b17 Eliminate zext over (iv & const) or ((iv+const)&const)
if a longer iv is available.  These subscript forms are
not common; they're a bottleneck in OpenSSL.

llvm-svn: 69215
2009-04-15 20:41:02 +00:00
Dale Johannesen 7ffb7d5728 Enhance induction variable code to remove the
sext around sext(shorter IV + constant), using a
longer IV instead, when it can figure out the
add can't overflow.  This comes up a lot in
subscripting; mainly affects 64 bit.

llvm-svn: 69123
2009-04-15 01:10:12 +00:00
Evan Cheng ffb83a155e Avoid making the transformation enabled by my last patch if the new destinations have phi nodes.
llvm-svn: 69121
2009-04-15 00:43:54 +00:00
Devang Patel 046bf624b9 While inlining, clone llvm.dbg.func.start intrinsic and adjust
llvm.dbg.region.end instrinsic. This nested llvm.dbg.func.start/llvm.dbg.region.end pair now enables DW_TAG_inlined_subroutine support in code generator.

llvm-svn: 69118
2009-04-15 00:17:06 +00:00
Evan Cheng 5ebf2acd84 Optimize conditional branch on i1 phis with non-constant inputs.
This turns:

eq:
        %3 = icmp eq i32 %1, %2
        br label %join

ne:
        %4 = icmp ne i32 %1, %2
        br label %join

join:
        %5 = phi i1 [%3, %eq], [%4, %ne]
        br i1 %5, label %yes, label %no

=>

eq:
        %3 = icmp eq i32 %1, %2
        br i1 %3, label %yes, label %no

ne:
        %4 = icmp ne i32 %1, %2
        br i1 %4, label %yes, label %no

llvm-svn: 69102
2009-04-14 23:40:03 +00:00
Owen Anderson a1902318e3 LoopIndexSplit needs to inform the loop pass manager of the instructions it is
deleting, not just the basic block.

llvm-svn: 69011
2009-04-14 01:04:19 +00:00
Chris Lattner 836e77d161 eliminate unneeded parens.
llvm-svn: 68939
2009-04-13 05:38:23 +00:00
Chris Lattner 6cd82fb430 "There was a typo in my previous patch which leads to miscompilation of
strncat :(

strncat(foo, "bar", 99)
would be optimized to
memcpy(foo+strlen(foo), "bar", 100, 1)
instead of
memcpy(foo+strlen(foo), "bar", 4, 1)"

Patch by Benjamin Kramer!

llvm-svn: 68905
2009-04-12 18:22:33 +00:00
Chris Lattner 91b6af24ac add some optimizations for strncpy/strncat and factor some
code.  Patch by Benjamin Kramer!

llvm-svn: 68885
2009-04-12 05:06:39 +00:00
Chris Lattner eb510d6b3d Instcombine should not promote whole computation trees to "strange"
integer types, unless they are already strange.  This prevents it from
turning the code produced by SROA into crazy libcalls and stuff that 
the code generator can't handle.  In the attached example, the result
was an i96 multiply that caused the x86 backend to assert.

Note that if TargetData had an idea of what the legal types are for
a target that this could be used to stop instcombine from introducing
i64 muls, as Scott wanted.

llvm-svn: 68598
2009-04-08 05:41:03 +00:00
Chris Lattner 321741af5f fix rdar://6762290, a crash compiling cxx filt with clang.
llvm-svn: 68500
2009-04-07 05:03:34 +00:00
Chris Lattner 47d6e7b93e remove empty section
llvm-svn: 68485
2009-04-07 02:55:53 +00:00
Ed Schouten 01aa6ec97a Let the strcat optimizer return the pointer to the start of the buffer,
instead of the place where it started to perform the string copy.

- PR3661
- Patch by Benjamin Kramer!

llvm-svn: 68443
2009-04-06 13:06:48 +00:00
Owen Anderson 98f912bf13 Reapply r68211, with the miscompilations it caused fixed.
llvm-svn: 68262
2009-04-01 23:53:49 +00:00
Dan Gohman c4971721ea Revert r68172. It caused regressions in
Applications/Burg/burg
  Applications/ClamAV/clamscan
and many other tests.

llvm-svn: 68211
2009-04-01 16:37:47 +00:00
Owen Anderson ff5961b46c Enhance GVN to propagate simple conditionals. This fixes PR3921.
llvm-svn: 68172
2009-04-01 01:20:45 +00:00
Chris Lattner f72ce6ea8b Make the key of ValueRankMap an AssertingVH, so that we die violently
if it dangles.

llvm-svn: 68150
2009-03-31 22:13:29 +00:00
Evan Cheng 826b6f0f7c Throttle back "fold select into operand" transformation. InstCombine should not generate selects of two constants unless they are selects of 0 and 1.
e.g.
define i32 @t1(i32 %c, i32 %x) nounwind {
       %t1 = icmp eq i32 %c, 0
       %t2 = lshr i32 %x, 18
       %t3 = select i1 %t1, i32 %t2, i32 %x
       ret i32 %t3
}

was turned into

define i32 @t2(i32 %c, i32 %x) nounwind {
       %t1 = icmp eq i32 %c, 0
       %t2 = select i1 %t1, i32 18, i32 0
       %t3 = lshr i32 %x, %t2
       ret i32 %t3
}

For most targets, that means materializing two constants and then a select. e.g. On x86-64

movl    %esi, %eax
shrl    $18, %eax
testl   %edi, %edi
cmovne  %esi, %eax
ret

=>

xorl    %eax, %eax
testl   %edi, %edi
movl    $18, %ecx
cmovne  %eax, %ecx
movl    %esi, %eax
shrl    %cl, %eax
ret

Also, the optimizer and codegen can reason about shl / and / add, etc. by a constant. This optimization will hinder optimizations using ComputeMaskedBits.

llvm-svn: 68142
2009-03-31 20:42:45 +00:00
Devang Patel 4ce6e69022 Update call graph after inlining invoke.
Patch by Jay Foad.

llvm-svn: 68120
2009-03-31 17:36:12 +00:00
Devang Patel 6e68bd007a Loop Index Split can eliminate a loop if it can determin if loop body is executed only once. There was a bug in determining IV based value of the iteration for which the loop body is executed. Fix it.
llvm-svn: 68071
2009-03-30 22:24:10 +00:00
Duncan Sands 3241b74f69 Revert r67798: it breaks llvm-gcc bootstrap on x86-64-linux, presumably due to
a miscompilation.

make[4]: Entering directory `gcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/include'
if [ ! -d "./x86_64-unknown-linux-gnu/bits/stdtr1c++.h.gch" ]; then \
          mkdir -p ./x86_64-unknown-linux-gnu/bits/stdtr1c++.h.gch; \
        fi; \
        gcc-4.2.llvm-objects/./gcc/xgcc -shared-libgcc -Bgcc-4.2.llvm-objects/./gcc -nostdinc++ 
-Lgcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/src -Lgcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs 
-B/usr/local/gnat-llvm/x86_64-unknown-linux-gnu/bin/ -B/usr/local/gnat-llvm/x86_64-unknown-linux-gnu/lib/ -isystem 
/usr/local/gnat-llvm/x86_64-unknown-linux-gnu/include -isystem /usr/local/gnat-llvm/x86_64-unknown-linux-gnu/sys-include -Winvalid-pch -Wno-deprecated -x 
c++-header -g -O2  -D_GNU_SOURCE -Igcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu 
-Igcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/include -Igcc-4.2.llvm/libstdc++-v3/libsupc++ -O2 -g 
gcc-4.2.llvm/libstdc++-v3/include/precompiled/stdtr1c++.h -o x86_64-unknown-linux-gnu/bits/stdtr1c++.h.gch/O2g.gch
In file included from gcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/include/tr1/repeat.h:247,
                 from gcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/include/tr1/functional:1098,
                 from gcc-4.2.llvm/libstdc++-v3/include/precompiled/stdtr1c++.h:53:
gcc-4.2.llvm-objects/x86_64-unknown-linux-gnu/libstdc++-v3/include/tr1/functional_iterate.h:417: internal compiler error: in ggc_recalculate_in_use_p, at 
ggc-page.c:1602
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://llvm.org/bugs/> for instructions.
make[4]: *** [x86_64-unknown-linux-gnu/bits/stdtr1c++.h.gch/O2g.gch] Error 1

llvm-svn: 67839
2009-03-27 14:56:47 +00:00
Dale Johannesen 4026b041ce One more place to skip debug info.
llvm-svn: 67811
2009-03-27 01:13:37 +00:00
Devang Patel fe7c0492a0 While hoisting an instruction, update alias info set tracker.
llvm-svn: 67798
2009-03-26 23:48:52 +00:00
Dale Johannesen db90560c1c Skip debug info one more place. (This one gets
called from llc, not opt, but it's an IR level
optimization nevertheless.)

llvm-svn: 67724
2009-03-26 01:15:07 +00:00
Devang Patel 4555618854 Before deleting a basic block, give other loop passes a chance cleanup analysis values, related to the instructions in the basic block.
llvm-svn: 67719
2009-03-25 23:57:48 +00:00
Chris Lattner c3b2111d97 Fix PR3874 by restoring a condition I removed, but making it more
precise than it used to be.

llvm-svn: 67662
2009-03-25 00:28:58 +00:00
Chris Lattner 9e94538005 oops, I intended to remove this, not comment it out. Thanks Duncan!
llvm-svn: 67657
2009-03-24 23:48:25 +00:00
Chris Lattner 306813cbbb canonicalize inttoptr and ptrtoint instructions which cast pointers
to/from integer types that are not intptr_t to convert to intptr_t
then do an integer conversion to the dest type.  This exposes the
cast to the optimizer.

llvm-svn: 67638
2009-03-24 18:35:40 +00:00
Chris Lattner d9eb41177a two changes:
1. Make instcombine always canonicalize trunc x to i1 into an icmp(x&1).  This 
   exposes the AND to other instcombine xforms and is more of what the code
   generator expects.
2. Rewrite the remaining trunc pattern match to use 'match', which 
   simplifies it a lot.
   

llvm-svn: 67635
2009-03-24 18:15:30 +00:00
Dale Johannesen 32dfb35281 Use a SmallPtrSet instead of std::set.
llvm-svn: 67578
2009-03-23 23:39:20 +00:00
Dan Gohman 4f2fea1a21 Now that errs() is properly non-buffered, there's no need to
explicitly flush it.

llvm-svn: 67526
2009-03-23 15:57:19 +00:00
Duncan Sands 1f15ca7c7a Factorize out a concept - no functionality change.
llvm-svn: 67454
2009-03-21 21:27:31 +00:00
Chris Lattner 0a981d1d36 Fix instcombine to not introduce undefined shifts when merging two
shifts together.  This fixes PR3851.

llvm-svn: 67411
2009-03-20 22:41:15 +00:00
Duncan Sands a09e0afe74 Don't load values out of global constants with weak
linkage: the value may be replaced with something
different at link time.  (Frontends that want to
allow values to be loaded out of weak constants can
give their constants weak_odr linkage).

llvm-svn: 67407
2009-03-20 21:53:29 +00:00
Dale Johannesen 2050968df9 Clear the cached cost when removing a function in
the inliner; prevents nondeterministic behavior
when the same address is reallocated.
Don't build call graph nodes for debug intrinsic calls;
they're useless, and there were typically a lot of them.

llvm-svn: 67311
2009-03-19 18:03:56 +00:00
Dale Johannesen e4f361212b Fix comment typo.
llvm-svn: 67307
2009-03-19 17:23:29 +00:00
Dale Johannesen 52bc2aac8a This pass keeps a map of Instructions to Rank numbers,
and was deleting Instructions without clearing the
corresponding map entry.  This led to nondeterministic
behavior if the same address got allocated to another
Instruction within a short time.

llvm-svn: 67306
2009-03-19 17:22:53 +00:00
Nick Lewycky bfd4ad67c7 Remove strange extra semicolons.
llvm-svn: 67287
2009-03-19 05:51:39 +00:00
Chris Lattner 514fc5b143 aha, DAE does have to think about PHI nodes. Many thanks to "Dr Evil" (aka Duncan)
for pointing this out :)

llvm-svn: 67212
2009-03-18 16:48:45 +00:00
Chris Lattner 595923ff75 Fix PR3826 - InstComb assert with vector shift, by not calling ComputeNumSignBits on a vector.
llvm-svn: 67211
2009-03-18 16:32:19 +00:00
Chris Lattner ab8022055a add an assertion to make it clear that PHI nodes are not allowed.
llvm-svn: 67210
2009-03-18 16:23:56 +00:00
Zhou Sheng 4e2af3cb55 Explicitly check for StoreInst, do not lose the chance to delete
unused loads or bitcasts.

llvm-svn: 67202
2009-03-18 12:48:48 +00:00
Zhou Sheng 05bea906c1 Revert my previous change on Local.cpp, instead, fix the bug on scalarrepl.
If the instruction has no users, it is also not only used by debug info 
and should not be deleted.

llvm-svn: 67194
2009-03-18 10:13:08 +00:00
Zhou Sheng 64a6a092b1 Fix a bug.
If I->use_empty(), this method should return false.

llvm-svn: 67180
2009-03-18 07:56:13 +00:00
Chris Lattner a15ce21135 Fix PR3807 by inserting 'insertelement' instructions in the normal dest of
an invoke instead of after the invoke (in its block), which is invalid.

llvm-svn: 67139
2009-03-18 00:31:45 +00:00
Chris Lattner 42e9ca42ce LSR shouldn't ever try to hack on integer IV's larger than 64-bits. Right now
it is not APInt clean, but even when it is it needs to be evaluated carefully
to determine whether it is actually profitable.

This fixes a crash on PR3806

llvm-svn: 67134
2009-03-17 23:58:30 +00:00
Chris Lattner e549493a55 Remove a condition which is always true.
llvm-svn: 67089
2009-03-17 17:55:15 +00:00
Dale Johannesen 87077356be Fix a debug info dependency in jump threading.
llvm-svn: 67064
2009-03-17 00:38:24 +00:00
Dale Johannesen a4ac735531 Fix -strip-debug-declare to work when there are
llvm.global.variable's but no llvm.declare's.

llvm-svn: 66977
2009-03-13 22:59:47 +00:00
Evan Cheng 94419d6fdd Fix PR3784: If the source of a phi comes from a bb ended with an invoke, make sure the copy is inserted before the try range (unless it's used as an input to the invoke, then insert it after the last use), not at the end of the bb.
Also re-apply r66140 which was disabled as a workaround.

llvm-svn: 66976
2009-03-13 22:59:14 +00:00
Bill Wendling 4bb96e9a50 Revert r66920. It was causing failures in the self-hosting buildbot (in release
mode).

Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/dg.exp ...
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/crash-narrowfunctiontest.ll
Failed with signal(SIGBUS) at line 1
while running: bugpoint /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/crash-narrowfunctiontest.ll -bugpoint-crashcalls -silence-passes > /dev/null
0   bugpoint          0x0035dd25 llvm::sys::SetInterruptFunction(void (*)()) + 85
1   bugpoint          0x0035e382 llvm::sys::RemoveFileOnSignal(llvm::sys::Path const&, std::string*) + 706
2   libSystem.B.dylib 0x92f112bb _sigtramp + 43
3   libSystem.B.dylib 0xffffffff _sigtramp + 1829694831
4   bugpoint          0x00021d1c main + 92
5   bugpoint          0x00002106 start + 54
6   bugpoint          0x00000004 start + 18446744073709543220
Stack dump:
0.    Program arguments: bugpoint /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/crash-narrowfunctiontest.ll -bugpoint-crashcalls -silence-passes 

FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/misopt-basictest.ll
Failed with signal(SIGBUS) at line 1
while running: bugpoint /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/misopt-basictest.ll -dce -bugpoint-deletecalls -simplifycfg -silence-passes
0   bugpoint          0x0035dd25 llvm::sys::SetInterruptFunction(void (*)()) + 85
1   bugpoint          0x0035e382 llvm::sys::RemoveFileOnSignal(llvm::sys::Path const&, std::string*) + 706
2   libSystem.B.dylib 0x92f112bb _sigtramp + 43
3   libSystem.B.dylib 0xffffffff _sigtramp + 1829694831
4   bugpoint          0x00021d1c main + 92
5   bugpoint          0x00002106 start + 54
6   bugpoint          0x00000006 start + 18446744073709543222
Stack dump:
0.    Program arguments: bugpoint /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/misopt-basictest.ll -dce -bugpoint-deletecalls -simplifycfg -silence-passes 

FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/remove_arguments_test.ll
Failed with signal(SIGBUS) at line 1
while running: bugpoint /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/remove_arguments_test.ll  -bugpoint-crashcalls -silence-passes
0   bugpoint          0x0035dd25 llvm::sys::SetInterruptFunction(void (*)()) + 85
1   bugpoint          0x0035e382 llvm::sys::RemoveFileOnSignal(llvm::sys::Path const&, std::string*) + 706
2   libSystem.B.dylib 0x92f112bb _sigtramp + 43
3   libSystem.B.dylib 0xffffffff _sigtramp + 1829694831
4   bugpoint          0x00021d1c main + 92
5   bugpoint          0x00002106 start + 54
Stack dump:
0.    Program arguments: bugpoint /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/BugPoint/remove_arguments_test.ll -bugpoint-crashcalls -silence-passes 

--- Reverse-merging (from foreign repository) r66920 into '.':
U    include/llvm/Support/CallSite.h
U    include/llvm/Instructions.h
U    lib/Analysis/IPA/GlobalsModRef.cpp
U    lib/Analysis/IPA/Andersens.cpp
U    lib/Bitcode/Writer/BitcodeWriter.cpp
U    lib/VMCore/Instructions.cpp
U    lib/VMCore/Verifier.cpp
U    lib/VMCore/AsmWriter.cpp
U    lib/Transforms/Utils/LowerInvoke.cpp
U    lib/Transforms/Scalar/SimplifyCFGPass.cpp
U    lib/Transforms/IPO/PruneEH.cpp
U    lib/Transforms/IPO/DeadArgumentElimination.cpp

llvm-svn: 66953
2009-03-13 21:15:59 +00:00
Dale Johannesen c65830519e One more place where debug info affects codegen.
llvm-svn: 66930
2009-03-13 19:23:20 +00:00
Gabor Greif 258232fb80 Second installment of "BasicBlock operands to the back"
changes.

For InvokeInst now all arguments begin at op_begin().
The Callee, Cont and Fail are now faster to get by
access relative to op_end().

This patch introduces some temporary uglyness in CallSite.
Next I'll bring CallInst up to a similar scheme and then
the uglyness will magically vanish.

This patch also exposes all the reliance of the libraries
on InvokeInst's operand ordering. I am thinking of taking
care of that too.

llvm-svn: 66920
2009-03-13 18:27:29 +00:00
Bill Wendling fa54bc2052 Oops...I committed too much.
llvm-svn: 66867
2009-03-13 04:39:26 +00:00
Bill Wendling b02eadf660 Temporarily XFAIL this test.
llvm-svn: 66866
2009-03-13 04:37:11 +00:00
Dale Johannesen cecfa6e08d Fix one more place where debug info affected
codegen (speculative execution).

llvm-svn: 66859
2009-03-13 01:05:24 +00:00
Dale Johannesen ed6f5a8253 Previous debug info fix to this code wasn't quite
right; did the wrong thing when there are exactly 11
non-debug instructions, followed by debug info.
Remove a FIXME since it's apparently been fixed along the way.

llvm-svn: 66840
2009-03-12 23:18:09 +00:00
Duncan Sands 1f853d6a2a Revert commit 66140 since it caused several failures
in the Ada testcase.  Reverting this only covers up
the real problem, which is a nasty conceptual difficulty
in the phi elimination pass: when eliminating phi nodes
in landing pads, the register copies need to come before
the invoke, not at the end of the basic block which is
too late...  See PR3784.

llvm-svn: 66826
2009-03-12 21:13:42 +00:00
Dale Johannesen 7f99d22f2f There already was a class to force deterministic
sorting of ConstantInt's; unreinvent wheel.

llvm-svn: 66824
2009-03-12 21:01:11 +00:00
Dale Johannesen 578d8bfc3c Another missing check for debug intrinsics.
llvm-svn: 66800
2009-03-12 17:42:45 +00:00
Dale Johannesen 9cdb9bb3e5 Allow for switch values bigger than 64 bits.
llvm-svn: 66751
2009-03-12 01:20:06 +00:00
Dale Johannesen 5a41b2def5 Fix some nondeterministic behavior when forwarding
from a switch table.  Multiple table entries that
branch to the same place were being sorted by the
pointer value of the ConstantInt*; changed to sort
by the actual value of the ConstantInt.

llvm-svn: 66749
2009-03-12 01:00:26 +00:00
Dale Johannesen 08ccba73a7 Skip interleaved debug info when fast-forwarding through
allocations.  Apparently the assumption is there is an
instruction (terminator?) following the allocation so I
am allowing the same assumption.

llvm-svn: 66716
2009-03-11 22:19:43 +00:00
Anton Korobeynikov 38961d5bd6 I should definitely read make docs someday :(
llvm-svn: 66699
2009-03-11 20:40:15 +00:00
Anton Korobeynikov 3b046d084e Unbreak the build. Dunno, why it did not fail on mingw :(
llvm-svn: 66692
2009-03-11 20:16:05 +00:00
Anton Korobeynikov a09ba46ee3 Disable plugins / shared stuff generation on windows targets.
This fixes fallout from recent PIC/delibtoolize changes and unbreaks
build on cygming.

llvm-svn: 66686
2009-03-11 19:49:42 +00:00
Dale Johannesen 900aaa3d1e Don't consider debug intrinsics when checking
whether a callee to be inlined is a leaf.

llvm-svn: 66588
2009-03-10 22:20:02 +00:00
Dale Johannesen 703703aacb Removing a dead debug intrinsic shouldn't trigger
another instcombine pass if we weren't going to make
one without debug info.

llvm-svn: 66576
2009-03-10 21:19:49 +00:00
Devang Patel 84fceff969 Ignore dbg info, while estimating size of jump through block.
llvm-svn: 66554
2009-03-10 18:00:05 +00:00
John Criswell 073e4d16c5 Do not attempt to do parial redundancy elimination on void values.
Also fixed a punctuation error in the header comment.
This fixes PR3775.

llvm-svn: 66542
2009-03-10 15:04:53 +00:00
Evan Cheng 1c94228de3 If a function is marked alwaysinline, it must be inlined (possibly for correctness). Do so even if the callee has dynamic alloca and the caller doesn't.
llvm-svn: 66539
2009-03-10 07:57:50 +00:00
Devang Patel 04852aa933 Ignore debug info while evaluating function.
llvm-svn: 66490
2009-03-09 23:04:12 +00:00
Dan Gohman f12436891e Don't record the increment instruction; just recompute it from the Phi
if needed. This simplifies the code a little, and is needed for an
upcoming refactoring.

llvm-svn: 66479
2009-03-09 22:04:01 +00:00
Devang Patel 4a1b0776b3 Remove llvm.dbg.global_variables also.
llvm-svn: 66471
2009-03-09 21:32:28 +00:00
Dan Gohman b855164751 Fix a few more places where induction variable types were used
where memory access types are needed.

llvm-svn: 66470
2009-03-09 21:22:12 +00:00
Dan Gohman 5a4e31666d Use ReplacedTy instead of recomputing the same value.
llvm-svn: 66469
2009-03-09 21:19:58 +00:00
Dan Gohman 34e52ddb7d Use LoopInfo's getLoopLatch() instead of doing what it does manualy.
llvm-svn: 66467
2009-03-09 21:14:16 +00:00
Dan Gohman 70cc9875d8 Don't use an induction variable type as a memory access type.
Use VoidTy instead, to be properly conservative.

llvm-svn: 66463
2009-03-09 21:04:19 +00:00
Dan Gohman 917ffe4592 Factor out the code that determines the memory access type
of an instruction into a helper function.

llvm-svn: 66460
2009-03-09 21:01:17 +00:00
Devang Patel 66f84e7a42 Add helper pass to remove llvm.dbg.declare intrinsics.
llvm-svn: 66454
2009-03-09 20:49:37 +00:00
Dan Gohman e201f8ff1d Move the sorting of the StrideOrder array earlier so that it doesn't
have to be done twice.

llvm-svn: 66449
2009-03-09 20:46:50 +00:00
Dan Gohman b5001909b0 Delete the isOnlyStride argument, which is unused.
llvm-svn: 66446
2009-03-09 20:41:15 +00:00
Dan Gohman 85875f7120 Tidy some LSR debug output: announce the loop it's about to process
before it does any processing.

llvm-svn: 66443
2009-03-09 20:34:59 +00:00
Duncan Sands 5cbd3d9c52 This debug info special case should no longer
be needed now that these intrinsics are marked
as not accessing memory.

llvm-svn: 66420
2009-03-09 11:57:08 +00:00
Chris Lattner 0eab5ecb71 reimplement AliasSetTracker in terms of DenseMap instead of hash_map,
hopefully no functionality change.

llvm-svn: 66398
2009-03-09 05:11:09 +00:00
Nick Lewycky dc9642feb1 Keep calling-convention and tail-call bit when creating new invoke or call.
llvm-svn: 66384
2009-03-08 19:02:17 +00:00
Nick Lewycky 9ec96d19e3 Fix comments, pointed out by Duncan Sands.
llvm-svn: 66381
2009-03-08 17:08:09 +00:00
Nick Lewycky fbed86a865 Mark function returns as noalias.
llvm-svn: 66369
2009-03-08 06:20:47 +00:00
Chris Lattner 21a84f3054 teach SROA to handle promoting vector allocas with a memset into them into
a vector type instead of into an integer type.

llvm-svn: 66368
2009-03-08 04:17:04 +00:00
Chris Lattner c009757761 Enhance SROA to "promote to scalar" allocas which are
memcpy/memmove'd into or out of.  This fixes a serious
perf issue that Nate ran into.

llvm-svn: 66366
2009-03-08 04:04:21 +00:00
Chris Lattner dc35e5b43a change the MemIntrinsic get/setAlignment method to take an unsigned
instead of a Constant*, which is what the clients of it really want.

llvm-svn: 66364
2009-03-08 03:59:00 +00:00
Chris Lattner fee0a55c84 use MemTransferInst.
llvm-svn: 66362
2009-03-08 03:37:35 +00:00
Chris Lattner 334268a211 Introduce a new MemTransferInst pseudo class, which is a common
parent between MemCpyInst and MemMoveInst, simplify some code to
use it.

llvm-svn: 66361
2009-03-08 03:37:16 +00:00
Chris Lattner e313283199 fix a serious pessimization that Tron on IRC pointed out where we would
"boolify" pointers, generating really awful code because getting the pointer
value requires a load itself.  Before:

_foo:
	movb	$1, _X.b
	ret
_get:
	xorl	%ecx, %ecx
	movb	_X.b, %al
	testb	%al, %al
	movl	$_Y, %eax
	cmove	%ecx, %eax
	ret

With the xform disabled:

_foo:
	movl	$_Y, _X
	ret
_get:
	movl	_X, %eax
	ret

llvm-svn: 66351
2009-03-07 23:32:02 +00:00
Duncan Sands 12da8ce3d2 Introduce new linkage types linkonce_odr, weak_odr, common_odr
and extern_weak_odr.  These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global.  In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time.   This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function.  If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body.  The
code generators on the other hand map weak and weak_odr linkage
to the same thing.

llvm-svn: 66339
2009-03-07 15:45:40 +00:00
Dale Johannesen 6e447e08ee Fix another case where debug info interferes with
an optimization.

llvm-svn: 66288
2009-03-06 21:08:33 +00:00
Chris Lattner e48f897ca7 add a bunch more passes to the C bindings (PR3734), patch by
Lennart Augustsson!

llvm-svn: 66272
2009-03-06 16:52:18 +00:00
Duncan Sands ed7228319a While thinking about the one-definition-rule and trying
to find a tiny mouse hole to squeeze through, it struck
me that globals without a name can be considered internal
since they can't be referenced from outside the current
module.  This patch makes GlobalOpt give them internal
linkage.  Also done for aliases even though they always
have names, since in my opinion anonymous aliases should
be allowed for consistency with global variables and
functions.  So if that happens one day, this code is ready!

llvm-svn: 66267
2009-03-06 10:21:56 +00:00
Devang Patel 25b625165f While converting an aggregate to scalare, ignore and remove aggregate's debug info.
llvm-svn: 66262
2009-03-06 07:03:54 +00:00
Devang Patel 5aed7765b8 While hoisting instruction to speculatively execute simple bb, ignore dbg intrinsics.
llvm-svn: 66255
2009-03-06 06:00:17 +00:00
Chris Lattner e6d1e8d0cc this wasn't intended to go in.
llvm-svn: 66252
2009-03-06 05:42:30 +00:00
Chris Lattner e3fc2d13be Change various llvm utilities to use PrettyStackTraceProgram in
their main routines.  This makes the tools print their argc/argv
commands if they crash.

llvm-svn: 66248
2009-03-06 05:34:10 +00:00
Devang Patel bab43b4c91 Do not count DbgInfoIntrinsic while estimating loop header size.
llvm-svn: 66245
2009-03-06 03:51:30 +00:00
Devang Patel e8c6d3102d Skip DbgInfoIntrinsic.
llvm-svn: 66244
2009-03-06 02:59:27 +00:00
Dale Johannesen fb1caf3e1f Don't assign rank numbers to debug intrinsic "calls".
This is needed so debug info doesn't change codegen.

llvm-svn: 66235
2009-03-06 01:41:59 +00:00
Devang Patel fc507a1f9c Revert 66224.
llvm-svn: 66233
2009-03-06 01:39:36 +00:00
Devang Patel d926aaa28f Revert rev. 66167.
We are still not out of woods yet.

llvm-svn: 66232
2009-03-06 01:37:41 +00:00
Evan Cheng 5fd4fc76bf SRThreshold is meant to be inclusive.
llvm-svn: 66227
2009-03-06 00:56:43 +00:00
Dale Johannesen 073ab5acab Tweak the check for promotable alloca's to handle
debug intrinsics correctly.

llvm-svn: 66225
2009-03-06 00:42:50 +00:00
Devang Patel ab16577ade Do not let debug info prevert globalopt from shriking a global vars to boolean.
llvm-svn: 66224
2009-03-06 00:21:00 +00:00
Devang Patel 0c970f94e9 Add "check/remove dbg var" helper routines.
llvm-svn: 66223
2009-03-06 00:19:37 +00:00
Devang Patel 709d6ac46d GlobalOpt only process non constant local GVs while optimizing global vars.
If non constant local GV named A is used by a constant local GV named B (e.g. llvm.dbg.variable) and B is not used by anyone else then eliminate A as well as B.

In other words, debug info should not interfere in removal of unused GV.
--This life, and those below, will be ignored--

M    test/Transforms/GlobalOpt/2009-03-03-dbg.ll
M    lib/Transforms/IPO/GlobalOpt.cpp

llvm-svn: 66167
2009-03-05 18:12:02 +00:00
Evan Cheng b7922dee15 Do not split edges to EH landing pads. It will cause code size explosion.
llvm-svn: 66140
2009-03-05 06:31:26 +00:00
Dale Johannesen 78ab338024 Fix another case where debug info was affecting
codegen.  I convinced myself it was OK to skip all
pointer bitcasts here too.

llvm-svn: 66122
2009-03-05 02:06:48 +00:00
Bill Wendling 0bf1ded7bd Add comment to emphasize that the while body is empty.
llvm-svn: 66115
2009-03-05 01:08:35 +00:00
Dale Johannesen ad6b47377f Fix another case where a dbg.declare meant something
had 2 uses instead of 1.

llvm-svn: 66112
2009-03-05 00:39:02 +00:00
Bill Wendling 803da0db79 Temporarily revert r65994. It was causing rdar://6646455.
llvm-svn: 66083
2009-03-04 22:02:09 +00:00
Dale Johannesen df4226c0e2 Re-commit 65975 and a fix for the problem that
was causing llvm-gcc to fail to build.  I've
verified it bootstraps now; good enough for me.

llvm-svn: 66073
2009-03-04 21:24:04 +00:00
Dan Gohman 66476b582d Fix this comment.
llvm-svn: 66065
2009-03-04 20:50:23 +00:00
Dan Gohman ae0035ee15 Add an assertion for a condition that's always true, and not
immediately obvious.

llvm-svn: 66062
2009-03-04 20:49:01 +00:00
Chris Lattner a41bb40458 complete comment.
llvm-svn: 66055
2009-03-04 19:23:25 +00:00
Chris Lattner b5b0c87be6 this wasn't intended to be committed.
llvm-svn: 66054
2009-03-04 19:22:30 +00:00
Chris Lattner 5c204c92a4 Fix PR3720 by properly propagating alignment information from memcpy/memmove
onto element accesses.

llvm-svn: 66053
2009-03-04 19:20:50 +00:00
Dale Johannesen 845e582cbe Revert unintended commmit.
llvm-svn: 66001
2009-03-04 02:09:48 +00:00
Dale Johannesen d71c20081c Skip ptr-to-ptr bitcasts when counting in another case.
llvm-svn: 66000
2009-03-04 02:06:53 +00:00
Dale Johannesen c8b5a6ef7d Always skip ptr-to-ptr bitcasts when counting,
per Chris' suggestion.  Slightly faster.

llvm-svn: 65999
2009-03-04 01:53:05 +00:00
Devang Patel 812459613b If a global constant is dead then global's debug info should not prevent the optimizer in deleting the global. And while deleting global, delete global's debug info also.
llvm-svn: 65994
2009-03-04 01:22:23 +00:00
Dale Johannesen 0365d3b8b5 Make my earlier patch to skip debug intrinsics
when counting work; it was only off by 1.

llvm-svn: 65993
2009-03-04 01:20:34 +00:00
Dale Johannesen 738c60f259 Marking debug info intrinsics as not touching memory
caused them to be considered trivially dead.  Fix this.

llvm-svn: 65979
2009-03-03 23:30:00 +00:00
Dale Johannesen 09c3e8ec00 Instruction counters must skip the bitcasts that
feed into llvm.dbg.declare nodes, as well as
the debug directives themselves.

llvm-svn: 65976
2009-03-03 22:36:47 +00:00
Devang Patel b833ce74d8 Recursively remove dead argument while removing llvm.dbg.declare intrinsic.
llvm-svn: 65971
2009-03-03 21:31:02 +00:00
Dale Johannesen 77456b7ab4 When removing a store to an alloca that has only one
use, check also for the case where it has two uses,
the other being a llvm.dbg.declare.  This is needed so
debug info doesn't affect codegen.

llvm-svn: 65970
2009-03-03 21:26:39 +00:00
Bill Wendling 7fcd6148f7 Remove accidental check-ins in r65960. :-(
llvm-svn: 65961
2009-03-03 19:25:16 +00:00
Bill Wendling a68fc7af63 Use > instead of >=. We want to promote aggregates of 128-bytes.
llvm-svn: 65960
2009-03-03 19:18:49 +00:00
Bill Wendling 3e44bf3c4b Reapply r65755, but reversing "<" to ">=".
llvm-svn: 65945
2009-03-03 12:12:58 +00:00
Dan Gohman 92b551bc2b Fix a bunch of Doxygen syntax issues. Escape special characters,
and put @file directives on their own comment line.

llvm-svn: 65920
2009-03-03 02:55:14 +00:00
Dale Johannesen 0192552340 Don't count DebugInfo instructions in another limit
(lest they affect codegen).

llvm-svn: 65915
2009-03-03 01:43:03 +00:00
Dale Johannesen e1bb2f86f9 When sinking an insn in InstCombine bring its debug
info with it.
Don't count debug info insns against the scan maximum
in FindAvailableLoadedValue (lest they affect codegen).

llvm-svn: 65910
2009-03-03 01:09:07 +00:00
Devang Patel cc40a61af7 Ignore debug info intrinsics.
llvm-svn: 65908
2009-03-03 00:28:44 +00:00
Devang Patel d50ebbdf3f If branch conditions' one successor is dominating another non-latch successor then this loop's iteration space can not be restricted. In this example block bb5 is always executed.
llvm-svn: 65902
2009-03-02 23:39:14 +00:00
Devang Patel 49d64927e1 Remove all dbg symobls, including those with circular references.
This is ugly, but I can't figure out a quick way out of this.

llvm-svn: 65889
2009-03-02 22:50:58 +00:00
Duncan Sands 5795a6091d Fix PR3694: add an instcombine micro-optimization that helps
clean up when using variable length arrays in llvm-gcc.

llvm-svn: 65832
2009-03-02 09:18:21 +00:00
Bill Wendling 38eae046cf Temporarily revert r65755. It was causing failures in the self-hosting
testsuite:

Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/dg.exp ...
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/nancvt.ll
Failed with exit(1) at line 2
while running: grep 2147027116 nancvt.ll.tmp | count 3
count: expected 3 lines and got        0.
child process exited abnormally
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/vec_ins_extract.ll
Failed with exit(1) at line 1
while running:  llvm-as < /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/vec_ins_extract.ll |  opt -scalarrepl -instcombine |   llc -march=x86 -mcpu=yonah | not /usr/bin/grep sub.*esp
      subl      $28, %esp
      subl      $28, %esp
child process exited abnormally

And more.

llvm-svn: 65758
2009-03-01 03:55:12 +00:00
Chris Lattner e2bb5e31c8 hoist the check for alloca size up so that it controls CanConvertToScalar
as well as isSafeAllocaToScalarRepl.

llvm-svn: 65755
2009-03-01 02:26:47 +00:00
Nick Lewycky 34709f84d8 Silence compiler warning about use of uninitialized variables (in reality these
are always set by reference on the path that uses them.) No functional change.

llvm-svn: 65621
2009-02-27 06:37:39 +00:00
Nick Lewycky d05f6870c3 Fix compiler warning about uninitialized variables. No functional change.
llvm-svn: 65620
2009-02-27 06:29:31 +00:00
Zhou Sheng 264e46e1e9 Ignore dbg info intrinsics when folding conditional branch to
conditional branch predecessors.

llvm-svn: 65509
2009-02-26 06:56:37 +00:00
Chris Lattner af618171f4 Fix PR3667
llvm-svn: 65464
2009-02-25 18:20:01 +00:00
Zhou Sheng 5d9cc1763b Don't block basic block with only SwitchInst to fold into predecessors.
llvm-svn: 65456
2009-02-25 15:34:27 +00:00
Dan Gohman 0bddac16a8 Rename ScalarEvolution's getIterationCount to getBackedgeTakenCount,
to more accurately describe what it does. Expand its doxygen comment
to describe what the backedge-taken count is and how it differs
from the actual iteration count of the loop. Adjust names and
comments in associated code accordingly.

llvm-svn: 65382
2009-02-24 18:55:53 +00:00
Dan Gohman 4f356bb9b0 Fix a ValueTracking rule: RHS means operand 1, not 0. Add a simple
ashr instcombine to help expose this code. And apply the fix to
SelectionDAG's copy of this code too.

llvm-svn: 65364
2009-02-24 02:00:40 +00:00
Dan Gohman 5d1f458f0f Generalize the ChangeCompareStride code, in preparation for
handling non-constant strides. No functionality change.

llvm-svn: 65363
2009-02-24 01:58:00 +00:00
Dan Gohman e669884749 Preserve the DominanceFrontier analysis in the LoopDeletion pass.
llvm-svn: 65359
2009-02-24 01:21:53 +00:00
Devang Patel e288082644 While folding unconditional return move DbgRegionEndInst into the predecessor, instead of removing it. This fixes following tests from llvmgcc42 testsuite.
gcc.c-torture/execute/20000605-3.c
gcc.c-torture/execute/20020619-1.c
gcc.c-torture/execute/20030920-1.c
gcc.c-torture/execute/loop-ivopts-1.c

llvm-svn: 65353
2009-02-24 00:05:16 +00:00
Dan Gohman f6e8c77e1c Back out the change in 64918 that used sign-extensions when promoting
trip counts that use signed comparisons. It's not obviously the best
approach for preserving trip count information, and at any rate there
isn't anything in the tree right now that makes use of that, so for
now always using zero-extensions is preferable.

llvm-svn: 65347
2009-02-23 23:20:35 +00:00
Dan Gohman e591411fd6 LoopDeletion needs to inform ScalarEvolution when a loop is deleted,
so that ScalarEvolution doesn't hang onto a dangling Loop*, which
could be a problem if another Loop happens to get allocated at the
same address.

llvm-svn: 65323
2009-02-23 17:10:29 +00:00
Dan Gohman 42987f528a IndVarSimplify preserves ScalarEvolution. In the
-std-compile-opts sequence, this avoids the need for ScalarEvolution to
be rerun before LoopDeletion.

llvm-svn: 65318
2009-02-23 16:29:41 +00:00
Zhou Sheng 3a86bcf134 Should reset DBI_Prev if DBI_Next == 0.
llvm-svn: 65314
2009-02-23 10:14:11 +00:00
Mon P Wang dccfa0b26c Changed option name from inline-threshold to basic-inline-threshold because
inline-threshold option is used by the inliner.

llvm-svn: 65309
2009-02-23 07:07:56 +00:00
Chris Lattner d5420f0957 fix some typos that Duncan noticed
llvm-svn: 65306
2009-02-23 05:56:17 +00:00
Dan Gohman 648c5e9c99 Revert the part of 64623 that attempted to align the source in a
memcpy to match the alignment of the destination. It isn't necessary
for making loads and stores handled like the SSE loadu/storeu
intrinsics, and it was causing a performance regression in
MultiSource/Applications/JM/lencod.

The problem appears to have been a memcpy that copies from some
highly aligned array into an alloca; the alloca was then being
assigned a large alignment, which required codegen to perform
dynamic stack-pointer re-alignment, which forced the enclosing
function to have a frame pointer, which led to increased spilling.

llvm-svn: 65289
2009-02-22 18:06:32 +00:00
Dan Gohman f394e58af5 Properly parenthesize this expression, fixing a real bug in the new
-full-lsr code, as well as a GCC warning.

llvm-svn: 65288
2009-02-22 16:40:52 +00:00
Evan Cheng 69decbf0b2 Only try to sink immediate when TLI is not null. It needs to check if immediate would fit in target addressing field.
llvm-svn: 65268
2009-02-22 07:31:19 +00:00
Nick Lewycky d44e80d7fc Don't sign extend the char when expanding char -> int during
load(bitcast(char[4] to i32*)) evaluation.

llvm-svn: 65246
2009-02-21 20:50:42 +00:00
Evan Cheng 1173ec7a2e Add AddrModeMatcher.cpp
llvm-svn: 65228
2009-02-21 07:05:11 +00:00
Evan Cheng 107b06c4b9 Teach LSR sink to sink the immediate portion of the common expression back into uses if they fit in address modes of all the uses.
llvm-svn: 65215
2009-02-21 02:06:47 +00:00
Chris Lattner bef6b2098e rename a function to indicate that it checks for profitability as well
as legality.  Make load sinking and gep sinking more careful: we only
do it when it won't pessimize loads from the stack.  This has the added
benefit of not producing code that is unanalyzable to SROA.

llvm-svn: 65209
2009-02-21 00:46:50 +00:00
Evan Cheng 8a9481d50d Fix strange logic in CollectIVUsers used to determine whether all uses are
addresses, part 1. This fixes an obvious logic bug. Previously if the only
in-loop use is a PHI, it would return AllUsesAreAddresses as true.

llvm-svn: 65178
2009-02-20 22:16:49 +00:00
Dan Gohman 5e309a5bbb Simplify code and reduce indentation. No functionality change.
llvm-svn: 65167
2009-02-20 21:27:23 +00:00
Dan Gohman 2c8cb5b4ec Fix 80-column violations.
llvm-svn: 65159
2009-02-20 21:06:57 +00:00
Dan Gohman addc50b4ee It's not necessary to check if Base is null here.
llvm-svn: 65157
2009-02-20 21:05:23 +00:00
Dan Gohman 1608df5319 Add a comment about how Imm can be used for loop-variant values.
llvm-svn: 65147
2009-02-20 20:29:04 +00:00
Evan Cheng c380864d2c Factor address mode matcher out of codegen prepare to make it available to other passes, e.g. loop strength reduction.
llvm-svn: 65134
2009-02-20 18:24:38 +00:00
Zhou Sheng 053737e1ae Just roll back the previous change to -mem2reg.
Will re-think about this according to Chris's comments.

llvm-svn: 65126
2009-02-20 17:49:33 +00:00
Zhou Sheng 6a0634d423 patch to update the line number information in pass -mem2reg.
Currently this pass will delete the variable declaration info, 
and keep the line number info. But the kept line number info is not updated, 
and some is redundant or not correct, this patch just updates those info.

llvm-svn: 65123
2009-02-20 16:31:35 +00:00
Dan Gohman 2a12ae7d1f Implement "superhero" strength reduction, or full strength
reduction of address calculations down to basic pointer arithmetic.
This is currently off by default, as it needs a few other features
before it becomes generally useful. And even when enabled, full
strength reduction is only performed when it doesn't increase
register pressure, and when several other conditions are true.

This also factors out a bunch of exisiting LSR code out of
StrengthReduceStridedIVUsers into separate functions, and tidies
up IV insertion. This actually decreases register pressure even
in non-superhero mode. The change in iv-users-in-other-loops.ll
is an example of this; there are two more adds because there are
two fewer leas, and there is less spilling.

llvm-svn: 65108
2009-02-20 04:17:46 +00:00
Dan Gohman a34d7adefb Use DEBUG() instead of passing *DOUT to WriteAsOperand,
since the latter just passes a null reference when
debugging is not enabled.

llvm-svn: 65060
2009-02-19 19:32:06 +00:00
Dan Gohman 30a2959367 Make the debug output of LSR less cryptic and more informative.
llvm-svn: 65057
2009-02-19 19:23:27 +00:00
Duncan Sands 7a1db33e77 In theory the aliasee may have dead constant users
here.  Since we only do the transform if there is
one use, strip off any such users in the hope of
making the transform fire more often.

llvm-svn: 64926
2009-02-18 17:55:38 +00:00
Dan Gohman 8078b8bddc Use a sign-extend instead of a zero-extend when promoting a
trip count value when the original loop iteration condition is
signed and the canonical induction variable won't undergo signed
overflow. This isn't required for correctness; it just preserves
more information about original loop iteration values.

Add a getTruncateOrSignExtend method to ScalarEvolution,
following getTruncateOrZeroExtend.

llvm-svn: 64918
2009-02-18 17:22:41 +00:00
Dan Gohman aa0f01929b Simplify by using dyn_cast instead of isa and cast.
llvm-svn: 64917
2009-02-18 16:54:33 +00:00
Dan Gohman 8cab4c44bb Add explicit keywords.
llvm-svn: 64915
2009-02-18 16:37:45 +00:00
Dan Gohman 38a9631d5f Eliminate several more unnecessary intptr_t casts.
llvm-svn: 64888
2009-02-18 05:09:16 +00:00
Dan Gohman 8212ebb5cf Fix a corner case in the new indvars promotion logic: if there
are multiple IV's in a loop, some of them may under go signed
or unsigned wrapping even if the IV that's used in the loop
exit condition doesn't. Restrict sign-extension-elimination
and zero-extension-elimination to only those that operate on
the original loop-controlling IV.

llvm-svn: 64866
2009-02-18 00:52:00 +00:00
Dan Gohman d0b1fbd983 Fix a typo in a comment.
llvm-svn: 64859
2009-02-18 00:08:39 +00:00
Duncan Sands bf3ba5a1e9 If an alias is dead and so is its aliasee, then globaldce would
crash because the alias would still be using the aliasee when the
aliasee was deleted.

llvm-svn: 64844
2009-02-17 23:05:26 +00:00
Dan Gohman d90415555e LoopIndexSplit doesn't actually use ScalarEvolution.
llvm-svn: 64811
2009-02-17 20:50:11 +00:00
Dan Gohman 4330034160 Add a method to ScalarEvolution for telling it when a loop has been
modified in a way that may effect the trip count calculation. Change
IndVars to use this method when it rewrites pointer or floating-point
induction variables instead of using a doInitialization method to
sneak these changes in before ScalarEvolution has a chance to see
the loop. This eliminates the need for LoopPass to depend on
ScalarEvolution.

llvm-svn: 64810
2009-02-17 20:49:49 +00:00
Chris Lattner 24f31a0e59 commit a tweaked version of Daniel's patch for PR3599. We now
eliminate all the extensions and all but the one required truncate
from the testcase, but the or/and/shift stuff still isn't zapped.

llvm-svn: 64809
2009-02-17 20:47:23 +00:00
Dan Gohman f84d42f282 Delete trailing whitespace.
llvm-svn: 64784
2009-02-17 19:13:57 +00:00
Duncan Sands f974c5703c This transform also applies to private linkage.
llvm-svn: 64773
2009-02-17 17:50:04 +00:00
Dan Gohman efe65e547b Fix 80-column violation.
llvm-svn: 64766
2009-02-17 15:57:39 +00:00
Evan Cheng 161861deb0 Strengthen the "non-constant stride must dominate loop preheader" check.
llvm-svn: 64703
2009-02-17 00:13:06 +00:00
Dan Gohman 2cd8982002 Simplify; fix some 80-column violations.
llvm-svn: 64702
2009-02-17 00:10:53 +00:00
Dan Gohman f68d29edd5 Fix EnforceKnownAlignment so that it doesn't ever reduce the alignment
of an alloca or global variable.

llvm-svn: 64693
2009-02-16 23:02:21 +00:00
Nick Lewycky 0f269cfdee Fix typo caused by too much surfing, dudes...
llvm-svn: 64626
2009-02-16 04:26:53 +00:00
Dan Gohman 136aa1fb96 Delete this long-commented-out code. The situation it seems to have
been written for is no longer relevant with the elimination of
signed and unsigned types.

llvm-svn: 64625
2009-02-16 02:57:42 +00:00
Dan Gohman 9cdfd44521 Change these tests to use regular loads instead of llvm.x86.sse2.loadu.dq.
Enhance instcombine to use the preferred field of
GetOrEnforceKnownAlignment in more cases, so that regular IR operations are
optimized in the same way that the intrinsics currently are.

llvm-svn: 64623
2009-02-16 00:44:23 +00:00
Nick Lewycky 8f4a097f15 Update the list of function annotations for nocapture. All of these came up
when I was looking at functions used by python.

Highlights include, better largefile support (64-bit file sizes on 32-bit
systems), fputs string is nocapture, popen/pclose added (popen being noalias
return), modf and frexp and friends. Also added some missing 'break' statements
and combined identical sections.

llvm-svn: 64615
2009-02-15 22:47:25 +00:00
Duncan Sands 46196aef82 Make this more useful for cleaning up after the
one-definition-rule llvm-gcc changes (coming soon
to a tree near you!).

llvm-svn: 64588
2009-02-15 11:54:49 +00:00
Duncan Sands b3f27881a9 If the target of an alias has internal linkage, then the
alias can be morphed into the target.  Implement this
transform, and fix a crash in the existing transform at
the same time.

llvm-svn: 64583
2009-02-15 09:56:08 +00:00
Evan Cheng e79841adbb Fix pr3571: If stride is a value defined by an instruction, make sure it dominates the loop preheader. When IV users are strength reduced, the stride is inserted into the preheader. It could create a use before def situation.
llvm-svn: 64579
2009-02-15 06:06:15 +00:00
Evan Cheng fe151ba135 ifdef out unneeded if statement.
llvm-svn: 64575
2009-02-15 03:20:37 +00:00
Dan Gohman 671f2c085f Extend the IndVarSimplify support for promoting induction variables:
- Test for signed and unsigned wrapping conditions, instead of just
   testing for non-negative induction ranges. 
 - Handle loops with GT comparisons, in addition to LT comparisons.
 - Support more cases of induction variables that don't start at 0.

llvm-svn: 64532
2009-02-14 02:31:09 +00:00
Dan Gohman 47ff6aad23 Clarify debug output.
llvm-svn: 64531
2009-02-14 02:26:50 +00:00
Dan Gohman 4bfa1d4c63 Simplify some code. hasComputableLoopEvolution is overkill in this case.
No functionality change.

llvm-svn: 64530
2009-02-14 02:25:19 +00:00
Dan Gohman 55ea72179c In CodeGenPrepare's debug output, use WriteAsOperand instead of
printing getName(), so that unnamed values are printed correctly.

llvm-svn: 64468
2009-02-13 17:45:12 +00:00
Dan Gohman a2730abaaa Complete the sentance in this comment. I have reservations
about the code it describes, but at least now the comment
is right.

llvm-svn: 64465
2009-02-13 17:36:42 +00:00
Nick Lewycky d234a845f9 Mark strto* as readonly when the endptr is null.
llvm-svn: 64460
2009-02-13 17:08:33 +00:00
Nick Lewycky a0e83a0952 On strtod and friends, mark 'endptr' nocapture in the function prototype, and
mark the first argument nocapture if endptr=NULL for each particular call.

llvm-svn: 64453
2009-02-13 15:31:46 +00:00
Dan Gohman f71a473720 Fix the code that checked if a SCEVAddRecExpr Start contains an
addrec in a different loop to check the value being added to
the accumulated Start value, not the Start value before it has
the new value added to it. This prevents LSR from going crazy
on the included testcase. Dale, please review.

llvm-svn: 64440
2009-02-13 03:58:31 +00:00
Dan Gohman ba83228cdb Fix LSR's IV sorting function to explicitly sort by bitwidth
after sorting by stride value. This prevents it from missing
IV reuse opportunities in a host-sensitive manner.

llvm-svn: 64415
2009-02-13 00:26:43 +00:00
Dan Gohman eb6be650ce Teach IndVarSimplify to optimize code using the C "int" type for
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.

Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.

This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.

Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.

llvm-svn: 64407
2009-02-12 22:19:27 +00:00
Dan Gohman 656b097b8a Add a utility function to LoopInfo to return the exit block
when the loop has exactly one exit, and make use of it in
LoopIndexSplit.

llvm-svn: 64388
2009-02-12 18:08:24 +00:00
Dan Gohman e0d32c490a This code doesn't actually use the ExitingBlocks list.
llvm-svn: 64376
2009-02-12 16:36:26 +00:00
Chris Lattner feb129e813 Fix a nasty bug (PR3550) where the inline pass could incorrectly mark
calls with the tail marker when inlining them through an invoke.  Patch,
testcase, and perfect analysis by Jay Foad!

llvm-svn: 64364
2009-02-12 07:06:42 +00:00
Chris Lattner 096f44de61 improve naming of values in GVN, patch by Jay Foad!
llvm-svn: 64363
2009-02-12 07:00:35 +00:00
Chris Lattner 5297c63565 fix PR3537: if resetting bbi back to the start of a block, we need to
forget about already inserted expressions.

llvm-svn: 64362
2009-02-12 06:56:08 +00:00
Nick Lewycky b92c4d72a7 Don't mark all args to strtod and friends as nocapture.
llvm-svn: 64352
2009-02-12 03:18:34 +00:00
Nate Begeman 318aea93bf the two non-mask arguments to a shufflevector must be the same width, but they do not have to be the same
width as the result value.

llvm-svn: 64335
2009-02-11 22:36:25 +00:00
Devang Patel 316705027b If llvm.dbg.region.end is disappearing then remove corresponding llvm.dbg.func.start also.
llvm-svn: 64278
2009-02-11 01:29:06 +00:00
Devang Patel 654e47f366 Ignore dbg intrinsic while folding unconditional branch.
llvm-svn: 64242
2009-02-10 22:14:17 +00:00
Devang Patel da1a632a87 Use early exits. Reduce indentation.
llvm-svn: 64226
2009-02-10 19:28:07 +00:00
Devang Patel 4bed3565f3 Do not clone llvm.dbg.func.start and corresponding llvm.dbg.region.end during inlining.
llvm-svn: 64209
2009-02-10 07:48:18 +00:00
Devang Patel caf4485781 Enable scalar replacement of AllocaInst whose one of the user is dbg info.
llvm-svn: 64207
2009-02-10 07:00:59 +00:00
Dale Johannesen cd19967754 Fix PR 3471, and some cleanups.
llvm-svn: 64177
2009-02-09 22:14:15 +00:00
Bill Wendling 415515077b Mistakenly turned this on.
llvm-svn: 64065
2009-02-08 01:32:00 +00:00
Bill Wendling 5469ec1072 Revert r63999. It was breaking self-hosting builds.
llvm-svn: 64062
2009-02-08 00:58:05 +00:00
Mon P Wang 21eb52a74f Instrcombine should not change load(cast p) to cast(load p) if the cast
changes the address space of the pointer.

llvm-svn: 64035
2009-02-07 22:19:29 +00:00
Mike Stump f009a51794 Insert space to avoid warning and make code more readable.
llvm-svn: 64003
2009-02-07 03:36:02 +00:00
Devang Patel 7cb8df4ce7 Ignore DbgInfoIntrinsics.
llvm-svn: 63923
2009-02-06 06:19:06 +00:00
Chris Lattner bbbb74372b fix PR3489, use bits instead of bytes.
llvm-svn: 63916
2009-02-06 04:34:07 +00:00
Devang Patel 409b794cfe Ignore dbg intrinsics while propagating conditional expression info. Take 2.
llvm-svn: 63898
2009-02-05 23:32:52 +00:00
Devang Patel 02f58e1e8d Revert rev. 63876. It is causing llvm-gcc bootstrap failure.
llvm-svn: 63888
2009-02-05 21:46:41 +00:00
Devang Patel 58cb603d2a Remove dead blocks in the end.
llvm-svn: 63880
2009-02-05 19:59:42 +00:00
Devang Patel 5922e26d1a Ignore dbg intrinsics while propagating conditional expression info.
llvm-svn: 63876
2009-02-05 19:15:39 +00:00
Devang Patel 086b212277 Ignore dbg intrinsics while folding switch instruction.
llvm-svn: 63802
2009-02-05 00:30:42 +00:00
Devang Patel 916fdce16d Ignore dbg intrinsics.
llvm-svn: 63781
2009-02-04 21:39:48 +00:00
Devang Patel fd9f635103 While folding vallue comparison terminators ignore dbg intrinsics.
llvm-svn: 63700
2009-02-04 01:06:11 +00:00
Devang Patel f10e287c65 Ignore dbg intrinsics while hoisting common code in the two blocks up into the branch block.
llvm-svn: 63687
2009-02-04 00:03:08 +00:00
Devang Patel 2032cadd0f Do not let dbg intrinsic block folding of two entry phi node.
llvm-svn: 63671
2009-02-03 22:12:02 +00:00
Devang Patel 43a1161379 If "optimize for size" attribute is set then block non-trivial loop unswitches but allow trivial loop unswitches.
llvm-svn: 63670
2009-02-03 22:04:27 +00:00
Chris Lattner ef37dc8511 teach "convert from scalar" to handle loads of fca's.
llvm-svn: 63659
2009-02-03 21:08:45 +00:00
Chris Lattner f5df53cb46 refactor the interface to ConvertUsesOfLoadToScalar,
renaming it to ConvertScalar_ExtractValue

llvm-svn: 63658
2009-02-03 21:01:03 +00:00
Chris Lattner 576baa4adf convert ConvertUsesOfLoadToScalar to use IRBuilder,
no functionality change.

llvm-svn: 63652
2009-02-03 19:45:44 +00:00
Chris Lattner c1fb96d347 switch ConvertScalar_InsertValue to use an IRBuilder, no
functionality change.

llvm-svn: 63651
2009-02-03 19:41:50 +00:00
Chris Lattner 18f56c295c make scalar conversion handle stores of first class
aggregate values.  loads are not yet handled (coming
soon to an sroa near you).

llvm-svn: 63649
2009-02-03 19:30:11 +00:00
Chris Lattner 73eff2e6e8 Make SROA produce a vector only when the alloca is actually
accessed at least once as a vector.  This prevents it from
compiling the example in not-a-vector into:

define double @test(double %A, double %B) {
	%tmp4 = insertelement <7 x double> undef, double %A, i32 0
	%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
	%tmp2 = extractelement <7 x double> %tmp, i32 4
	ret double %tmp2
}

instead, producing the integer code.  Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.

llvm-svn: 63638
2009-02-03 18:15:05 +00:00
Evan Cheng 8542caa3f7 APInt'fy SimplifyDemandedVectorElts so it can analyze vectors with more than 64 elements.
llvm-svn: 63631
2009-02-03 10:05:09 +00:00
Chris Lattner 80810b4c2d add another case of undefined behavior without crashing, PR3466.
llvm-svn: 63620
2009-02-03 07:08:57 +00:00
Nick Lewycky 05daea5d32 Revert r63600. It didn't fix the bug, it just moved it a bit.
llvm-svn: 63618
2009-02-03 06:30:37 +00:00
Nick Lewycky 12a130bd06 Update the callgraph when replacing InvokeInst with CallInst when inlining.
llvm-svn: 63600
2009-02-03 04:34:40 +00:00
Chris Lattner 6aa6b1f263 Teach ConvertUsesToScalar to handle memset, allowing it to handle
crazy cases like:

struct f {  int A, B, C, D, E, F; };
short test4() {
  struct f A;
  A.A = 1;
  memset(&A.B, 2, 12);
  return A.C;
}

llvm-svn: 63596
2009-02-03 02:01:43 +00:00
Chris Lattner 09b65ab288 rearrange how SRoA handles promotion of allocas to vectors.
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type 
itself.  This allows us to un-xfail 
test/CodeGen/X86/vec_ins_extract.ll.

llvm-svn: 63590
2009-02-03 01:30:09 +00:00
Chris Lattner 43cecd7c26 inline SROA::ConvertToScalar, no functionality change.
llvm-svn: 63544
2009-02-02 20:44:45 +00:00
Chris Lattner 18eba4f211 Fix a bug which caused us to miscompile a couple of Ada
tests.  Thanks for the beautiful reduced testcase Duncan!

llvm-svn: 63529
2009-02-02 18:02:59 +00:00
Duncan Sands 6f361ff345 Fix a comment (bytes -> bits), reformat a comment
and remove trailing whitespace.  No functionality
change.

llvm-svn: 63511
2009-02-02 10:06:20 +00:00
Duncan Sands 33d6e97e33 Fix an obvious thinko.
llvm-svn: 63510
2009-02-02 09:53:14 +00:00
Chris Lattner 1aafe4cece reduce indentation, (~XorCST->getValue()).isSignBit() -> isMaxSignedValue()
llvm-svn: 63500
2009-02-02 07:15:30 +00:00
Nick Lewycky f23908151a Reinstate this optimization to fold icmp of xor when possible. Don't try to
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.

llvm-svn: 63487
2009-01-31 21:30:05 +00:00
Chris Lattner 9e2b9f3234 Fix PR3452 (an infinite loop bootstrapping) by disabling the recent
improvements to the EvaluateInDifferentType code.  This code works 
by just inserted a bunch of new code and then seeing if it is 
useful.  Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point.  Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.

llvm-svn: 63483
2009-01-31 19:05:27 +00:00
Chris Lattner 76a63ed099 now that all the pieces are in place, teach instcombine's
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it.  This allows
it to simplify the code in multi-use-or.ll into a single 'add 
double'.

This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch.  When working on fixing those bugs,
this should be disabled.

llvm-svn: 63481
2009-01-31 08:40:03 +00:00
Chris Lattner 3e2cb66c56 simplify/clarify control flow and improve comments, no functionality change.
llvm-svn: 63480
2009-01-31 08:24:16 +00:00
Chris Lattner 83c6a141b8 make some fairly meaty internal changes to how SimplifyDemandedBits works.
Now, if it detects that "V" is the same as some other value, 
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
   uses and can be simplified in one context, but not all.

#2 isn't implemented yet, this patch should have no functionality change.

llvm-svn: 63479
2009-01-31 08:15:18 +00:00
Chris Lattner 585cfb2ce7 minor cleanups
llvm-svn: 63477
2009-01-31 07:26:06 +00:00
Chris Lattner 94cfb281c3 make sure to set Changed=true when instcombine hacks on the code,
not doing so prevents it from properly iterating and prevents it
from deleting the entire body of dce-iterate.ll

llvm-svn: 63476
2009-01-31 07:04:22 +00:00
Chris Lattner ec99c46d44 Simplify and generalize the SROA "convert to scalar" transformation to
be able to handle *ANY* alloca that is poked by loads and stores of 
bitcasts and GEPs with constant offsets.  Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc.  This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.

One case that is pretty great is that we compile 
2006-11-07-InvalidArrayPromote.ll into:

define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
	%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
	%tmp105 = bitcast <4 x i32> %tmp10 to i128
	%tmp1056 = zext i128 %tmp105 to i256	
	%tmp.upgrd.43 = lshr i256 %tmp1056, 96
	%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32	
	ret i32 %tmp.upgrd.44
}

which turns into:

_func:
	subl	$28, %esp
	cvttps2dq	%xmm1, %xmm0
	movaps	%xmm0, (%esp)
	movl	12(%esp), %eax
	addl	$28, %esp
	ret

Which is pretty good code all things considering :).

One effect of this is that SROA will start generating arbitrary bitwidth 
integers that are a multiple of 8 bits.  In the case above, we got a 
256 bit integer, but the codegen guys assure me that it can handle the 
simple and/or/shift/zext stuff that we're doing on these operations.

This addresses rdar://6532315

llvm-svn: 63469
2009-01-31 02:28:54 +00:00
Gabor Greif 97f1720621 use precise getters
llvm-svn: 63402
2009-01-30 18:21:13 +00:00
Chris Lattner df17987c19 Fix some issues with volatility, move "CanConvertToScalar" check
after the others.

llvm-svn: 63227
2009-01-28 20:16:43 +00:00
Duncan Sands 5a913d61e3 Rename getAnalysisToUpdate to getAnalysisIfAvailable.
llvm-svn: 63198
2009-01-28 13:14:17 +00:00
Mon P Wang 3537a62704 Fixed optimization of combining two shuffles where the first shuffle inputs
has a different number of elements than the output.

llvm-svn: 62998
2009-01-26 04:39:00 +00:00
Chris Lattner 9449991c4f Handle single-entry phi nodes gracefully in condprop.
llvm-svn: 62985
2009-01-26 02:18:20 +00:00
Chris Lattner 7b6647c178 Fix PR3408 by making a non-obvious assumption very obvious, and
handling the flaw inherent in that assumption.  :)

llvm-svn: 62984
2009-01-26 02:11:30 +00:00
Chris Lattner 57cb472b56 More cleanups and simplifications, no functionality change.
llvm-svn: 62983
2009-01-26 01:57:01 +00:00
Chris Lattner d67aaa6560 tidy asserts
llvm-svn: 62982
2009-01-26 01:38:24 +00:00
Nick Lewycky 5647c5d1a4 The function that does nothing but call malloc is noalias return.
llvm-svn: 62956
2009-01-25 07:59:57 +00:00
Dale Johannesen 2b3389a626 Revert previous change; even this mild and clearly
more accurate change loses more than it gains on
benchmarks.

llvm-svn: 62938
2009-01-24 21:49:34 +00:00
Torok Edwin f4395ea97a testcase for PR3381.
Also it was an empty struct, not a void after all.

llvm-svn: 62920
2009-01-24 17:16:04 +00:00
Torok Edwin 73ff92272f void* is represented as pointer to empty struct {}.
Thus we need to check whether the struct is empty before trying to index into
it. This fixes PR3381.

llvm-svn: 62918
2009-01-24 11:30:49 +00:00
Dale Johannesen 899ecdbbba Improve the inlining cost function a bit.
Little practical effect.

llvm-svn: 62908
2009-01-24 01:27:33 +00:00
Chris Lattner 72cd68fe64 Make InstCombineStoreToCast handle aggregates more aggressively,
handling the case in Transforms/InstCombine/cast-store-gep.ll, which
is a heavily reduced testcase from Clang on x86-64.

llvm-svn: 62904
2009-01-24 01:00:13 +00:00
Gabor Greif 59c431347f use CallSite::isCalle instead of slow getOperandNo
llvm-svn: 62877
2009-01-23 21:17:04 +00:00
Gabor Greif eb61fcf2a1 Simplify the logic of getting hold of a PHI predecessor block.
There is now a direct way from value-use-iterator to incoming block in PHINode's API.
This way we avoid the iterator->index->iterator trip, and especially the costly
getOperandNo() invocation. Additionally there is now an assertion that the iterator
really refers to one of the PHI's Uses.

llvm-svn: 62869
2009-01-23 19:40:15 +00:00
Gabor Greif f4013373cd introduce a useful abstraction to find out if a Use is in the call position of an instruction
llvm-svn: 62788
2009-01-22 21:35:57 +00:00
Chris Lattner 77527f5812 Remove uses of uint32_t in favor of 'unsigned' for better
compatibility with cygwin.  Patch by Jay Foad!

llvm-svn: 62695
2009-01-21 18:09:24 +00:00
Dale Johannesen b5721632ee Make special cases (0 inf nan) work for frem.
Besides APFloat, this involved removing code
from two places that thought they knew the
result of frem(0., x) but were wrong.

llvm-svn: 62645
2009-01-21 00:35:19 +00:00
Chris Lattner c59945b4bd another fix for PR3354
llvm-svn: 62561
2009-01-20 01:15:41 +00:00
Bill Wendling caf1d22243 Doxygen-ify comments.
llvm-svn: 62546
2009-01-19 23:43:56 +00:00
Chris Lattner ea9f1d3c47 Fix a problem exposed by PR3354: simplifycfg was making a potentially
trapping instruction be executed unconditionally.

llvm-svn: 62541
2009-01-19 23:03:13 +00:00
Chris Lattner 73d7fe5a34 improve compatibility with cygwin, patch by Jay Foad!
llvm-svn: 62535
2009-01-19 22:00:18 +00:00
Chris Lattner 6f34e317e9 Fix PR3353, infinitely jump threading an infinite loop make from switches.
llvm-svn: 62529
2009-01-19 21:20:34 +00:00
Bill Wendling 534d2e0bae Temporarily revert r62487. It's causing this error during a release bootstrap of
llvm-gcc. Most likely, it's miscompiling one of the "gen*" programs:

/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./prev-gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./prev-gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.6.0/bin/ -c -g -O2 -mdynamic-no-pic -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -mdynamic-no-pic -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/build -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include  -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include  -D_DEBUG  -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS   -o build/gencondmd.o build/gencondmd.c
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected '}' before ')' token
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: warning: excess elements in struct initializer
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: warning: (near initialization for 'insn_conditions[4]')
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected '}' before ')' token
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected ',' or ';' before ')' token
../../llvm-gcc.src/gcc/config/i386/mmx.md:927: error: expected identifier or '(' before ',' token
../../llvm-gcc.src/gcc/config/i386/sse.md:3458: error: expected identifier or '(' before ',' token
...

llvm-svn: 62506
2009-01-19 08:46:20 +00:00
Chris Lattner f2bb4ea39c Fix PR3016, a bug which can occur do to an invalid assumption:
we assumed a CFG structure that would be valid when all code in 
the function is reachable, but not all code is necessarily 
reachable.  Do a simple, but horrible, CFG walk to check for this
case.

llvm-svn: 62487
2009-01-19 02:46:28 +00:00
Chris Lattner e381d7026f reduce indentation by using 'continue', no functionality change.
llvm-svn: 62477
2009-01-19 02:07:32 +00:00
Chris Lattner 54f0c61d71 Fix some problems in SpeculativelyExecuteBB. Basically,
because of dead code, a phi could use the speculated instruction
that was not in "BB2".  Make this check explicit and tighten up 
some other corners.  This fixes PR3292.  No testcase becauase this
depends entirely on visitation order of blocks and requires a 
sequence of 8 passes to repro.

llvm-svn: 62476
2009-01-19 00:36:37 +00:00
Chris Lattner e1c01e4e2b Make this a bit more explicit about which cases need the
check.  No functionality change.

llvm-svn: 62474
2009-01-18 23:22:07 +00:00
Chris Lattner 64b7bd7f9e Fix rdar://6505632, an llc crash on 483.xalancbmk
llvm-svn: 62470
2009-01-18 20:35:00 +00:00
Duncan Sands e0aa0d677d BasicAliasAnalysis and FunctionAttrs were both
doing very similar pointer capture analysis.
Factor out the common logic.  The new version
is from FunctionAttrs since it does a better
job than the version in BasicAliasAnalysis

llvm-svn: 62461
2009-01-18 12:19:30 +00:00
Nick Lewycky 3ced0dfa69 Fix copy and pasted typos that prevented strtok_r, realloc, getenv, ungetc,
putc, puts, perror, vscanf and vsscanf from getting annotations.

Add annotations for eight printf functions, memalign, pread and pwrite.

On Linux, llvm-gcc sometimes renames strdup, getc, putc, strtok_r, scanf and
sscanf. Match the alternate function names.

Fix a crash annotating opendir.

Don't mark fsetpos's second parameter as nocapture. It's supposed to be
captured.

Do mark fopen's path and mode strings as nocapture. Mark ferror as readonly,
but not fileno which may set errno.

llvm-svn: 62456
2009-01-18 04:34:36 +00:00
Gabor Greif f1abfdccdc introduce typedef for complicated vector, and use it too
llvm-svn: 62384
2009-01-17 00:09:08 +00:00
Gabor Greif 8c573f7e49 typo
llvm-svn: 62377
2009-01-16 23:08:50 +00:00
Chris Lattner db2d9613d2 Fix PR3335 by not turning a store to one address space into a store to another.
llvm-svn: 62351
2009-01-16 20:12:52 +00:00
Chris Lattner 733256fe31 reduce indentation by using early exits, no functionality change.
llvm-svn: 62350
2009-01-16 20:08:59 +00:00
Evan Cheng beac6f8b0c Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type.
llvm-svn: 62297
2009-01-16 02:11:43 +00:00
Rafael Espindola 6de96a1b5d Add the private linkage.
llvm-svn: 62279
2009-01-15 20:18:42 +00:00
Gabor Greif 5aa1922614 avoid using iterators when they get invalidated potentially
this fixes PR3332

llvm-svn: 62271
2009-01-15 18:40:09 +00:00
Evan Cheng ff716cb342 Eliminate a redundant check.
llvm-svn: 62264
2009-01-15 17:09:07 +00:00
Evan Cheng 60e19a46f2 - Teach CanEvaluateInDifferentType of this xform: sext (zext ty1), ty2 -> zext ty2
- Looking at the number of sign bits of the a sext instruction to determine  whether new trunc + sext pair should be added when its source is being evaluated in a different type.

llvm-svn: 62263
2009-01-15 17:01:23 +00:00
Chris Lattner 8fb9480ed2 Fix PR3325, a miscompilation of invokes by IPSCCP. Patch by Jay Foad!
llvm-svn: 62244
2009-01-14 21:01:16 +00:00
Dale Johannesen 1f0e0e7c9c Fix the time regression I introduced in 464.h264ref with
my earlier patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that.  We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.

More testcases are coming.

llvm-svn: 62212
2009-01-14 02:35:31 +00:00
Chris Lattner 2538eb664c rewrite OptimizeAwayTrappingUsesOfLoads to 1) avoid a temporary
vector and extraneous loop over it, 2) not delete globals used by
phis/selects etc which could actually be useful.  This fixes PR3321.
Many thanks to Duncan for narrowing this down.

llvm-svn: 62201
2009-01-14 00:12:58 +00:00
Dale Johannesen 0aeabdff57 Fix testsuite regressions from recursive inlining.
llvm-svn: 62189
2009-01-13 22:43:37 +00:00
Dan Gohman 59af77376c Make instcombine ensure that all allocas are explicitly aligned at at
least their preferred alignment.

llvm-svn: 62176
2009-01-13 20:18:38 +00:00
Duncan Sands 944ccc5d6a Correct a comment.
llvm-svn: 62165
2009-01-13 13:48:44 +00:00
Dale Johannesen 433a9086c0 Enable recursive inlining. Reduce inlining threshold
back to 200; 400 seems to be too high, loses more than
it gains.

llvm-svn: 62107
2009-01-12 22:11:50 +00:00
Duncan Sands dc020f9c3c Rename getABITypeSize to getTypePaddedSize, as
suggested by Chris.

llvm-svn: 62099
2009-01-12 20:38:59 +00:00
Dale Johannesen f84685290a Increase default inlining aggressiveness in partial
compensation for turning off gcc's inliner.  This gets
us closer to the amount of inlining we were getting before.
It is not a win on everything, of course, but seems to
gain overall.

llvm-svn: 62058
2009-01-11 23:11:00 +00:00
Chris Lattner bd3c7c8b52 Duncan is nervous about undefinedness of % with negatives. I'm
not thrilled about 64-bit % in general, so rewrite to use * instead.

llvm-svn: 62047
2009-01-11 20:41:36 +00:00
Chris Lattner b19151686f do not generated GEPs into vectors where they don't already exist.
We should treat vectors as atomic types, not like arrays.

llvm-svn: 62046
2009-01-11 20:23:52 +00:00
Chris Lattner 171d2d474f Make a couple of cleanups to the instcombine bitcast/gep
canonicalization transform based on duncan's comments:

1) improve the comment about %.
2) within our index loop make sure the offset stays 
   within the *type size*, instead of within the *abi size*.
   This allows us to reason explicitly about landing in tail
   padding and means that issues like non-zero offsets into
   [0 x foo] types don't occur anymore.

llvm-svn: 62045
2009-01-11 20:15:20 +00:00
Chris Lattner 5f54d50917 fix typo Duncan noticed.
llvm-svn: 61997
2009-01-09 18:31:39 +00:00
Chris Lattner ae0e857b98 Fix PR3304
llvm-svn: 61995
2009-01-09 18:18:43 +00:00
Misha Brukman 5cbf223916 Removed trailing whitespace from Makefiles.
llvm-svn: 61991
2009-01-09 16:44:42 +00:00
Chris Lattner f50aa6ae5c Implement rdar://6480391, extending of equality icmp's to avoid a truncation.
I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patters like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985
2009-01-09 07:47:06 +00:00
Chris Lattner 0f7cf1d7e1 Remove some old code that looks like a remanant from signed-types days.
llvm-svn: 61984
2009-01-09 07:10:58 +00:00
Chris Lattner 482eb70a10 Fix PR3298, a crash in Jump Threading. Apparently even
jump threading can have bugs, who knew? ;-)

llvm-svn: 61983
2009-01-09 06:08:12 +00:00
Chris Lattner fef138b140 Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible.
llvm-svn: 61980
2009-01-09 05:44:56 +00:00
Chris Lattner a784a2ce01 move some code, check to see if the input to the GEP is a bitcast
(which is constant time and cheap) before checking hasAllZeroIndices.

llvm-svn: 61976
2009-01-09 04:53:57 +00:00
Dale Johannesen 4755d9df78 Adjustments to last patch based on review.
llvm-svn: 61969
2009-01-09 01:30:11 +00:00
Dale Johannesen b48fc71fc6 Do not inline functions with (dynamic) alloca into
functions that don't already have a (dynamic) alloca.
Dynamic allocas cause inefficient codegen and we shouldn't
propagate this (behavior follows gcc).  Two existing tests
assumed such inlining would be done; they are hacked by
adding an alloca in the caller, preserving the point of
the tests.

llvm-svn: 61946
2009-01-08 21:45:23 +00:00
Chris Lattner c518dfd11b This implements the second half of the fix for PR3290, handling
loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915
2009-01-08 05:42:05 +00:00
Duncan Sands 0bcf085845 Whitespace - correct formatting.
llvm-svn: 61879
2009-01-07 20:01:06 +00:00
Duncan Sands 289f59f233 Remove alloca tracking from nocapture analysis. Not only
was it not very helpful, it was also wrong!  The problem
is shown in the testcase: the alloca might be passed to
a nocapture callee which dereferences it and returns the
original pointer.  But because it was a nocapture call we
think we don't need to track its uses, but we do.

llvm-svn: 61876
2009-01-07 19:39:06 +00:00
Duncan Sands 94bcbbab74 Reorder these.
llvm-svn: 61873
2009-01-07 19:17:02 +00:00
Duncan Sands 02599850b4 Use a switch rather than a sequence of "isa" tests.
llvm-svn: 61872
2009-01-07 19:10:21 +00:00
Duncan Sands 187c5716b6 The verifier checks that the aliasee is not null.
llvm-svn: 61870
2009-01-07 18:45:53 +00:00
Chris Lattner f2b8c82ad1 Implement the first half of PR3290: if there is a store of an
integer to a (transitive) bitcast the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853
2009-01-07 08:11:13 +00:00
Chris Lattner 9a2de65fd6 Factor a bunch of code out into a helper method.
llvm-svn: 61852
2009-01-07 07:18:45 +00:00
Chris Lattner db561146aa use continue to simplify code and reduce nesting, no functionality
change.

llvm-svn: 61851
2009-01-07 06:39:58 +00:00
Chris Lattner 938b54f383 Get TargetData once up front and cache as an ivar instead of
requerying it all over the place.

llvm-svn: 61850
2009-01-07 06:34:28 +00:00
Chris Lattner a63dba9e6c Use the hasAllZeroIndices predicate to simplify some
code, no functionality change.

llvm-svn: 61849
2009-01-07 06:25:07 +00:00
Chris Lattner 2fdcc59bb6 Change m_ConstantInt and m_SelectCst to take their constant integers
as template arguments instead of as instance variables, exposing more
optimization opportunities to the compiler earlier.

llvm-svn: 61776
2009-01-05 23:53:12 +00:00
Duncan Sands 582c53d147 Teach the internalize pass to also internalize
global aliases.

llvm-svn: 61754
2009-01-05 21:24:45 +00:00
Evan Cheng 8804293fe9 Find loop back edges only after empty blocks are eliminated.
llvm-svn: 61752
2009-01-05 21:17:27 +00:00
Duncan Sands 52e5deece5 Not having an aliasee is a theoretical possibility.
llvm-svn: 61745
2009-01-05 20:47:56 +00:00
Duncan Sands 821d13cf78 Format more neatly.
llvm-svn: 61744
2009-01-05 20:39:50 +00:00
Duncan Sands d24b93f339 Remove trailing spaces.
llvm-svn: 61743
2009-01-05 20:38:27 +00:00
Duncan Sands f5dbbae4f4 Delete unused global aliases with internal linkage.
In fact this also deletes those with linkonce linkage,
however this is currently dead because for the moment
aliases aren't allowed to have this linkage type.

llvm-svn: 61742
2009-01-05 20:37:33 +00:00
Dan Gohman 906152a20f Tidy up #includes, deleting a bunch of unnecessary #includes.
llvm-svn: 61715
2009-01-05 17:59:02 +00:00
Nick Lewycky e4e5532e05 Move the libcall annotating part from doFinalization to doInitialization.
Finalization occurs after all the FunctionPasses in the group have run, which
is clearly not what we want.

This also means that we have to make sure that we apply the right param 
attributes when creating a new function.

Also, add a missed optimization: strdup and strndup. NoCapture and 
NoAlias return!

llvm-svn: 61658
2009-01-05 00:07:50 +00:00
Nick Lewycky 959af7ba30 Run a post-pass that marks known function declarations by name.
llvm-svn: 61632
2009-01-04 20:27:34 +00:00
Bill Wendling 0c04f9fdc3 Revert this transform. It was causing some dramatic slowdowns in a few tests. See PR3266.
llvm-svn: 61623
2009-01-04 06:19:11 +00:00
Nick Lewycky 1d805c62c4 Any void readonly functions are provably dead, don't waste time adding
nocapture attributes to them.

llvm-svn: 61610
2009-01-03 17:05:32 +00:00
Duncan Sands c7affb0a8f Load tracking means that the value analyzed may
not have pointer type.  In particular, it may
be the condition argument for a select or a GEP
index.  While I was unable to construct a testcase
for which some bits of the original pointer are
captured due to one of these, it's very very close
to being possible - so play safe and exclude these
possibilities.

llvm-svn: 61580
2009-01-02 15:16:38 +00:00
Duncan Sands b193a37cd3 When calculating 'nocapture' argument attributes, allow
the argument to be stored to an alloca by tracking uses
of the alloca.  This occurs 4 times (out of 7121, 0.05%)
in MultiSource/Applications, so may not be worth it.  On
the other hand, it is easy to do and fairly cheap.  The
functions it helps are: W_addcom and W_addlit in spiff;
process_args (argv) in d (make_dparser); ercPixConcealIMB
in JM/ldecod.

llvm-svn: 61570
2009-01-02 11:54:37 +00:00
Duncan Sands cefc8604aa Improve comments and reorganize a bit - no functionality
change.

llvm-svn: 61569
2009-01-02 11:46:24 +00:00
Nick Lewycky 7e82055e88 Make adding nocapture a bit stronger. FreeInst is nocapture. Also,
functions that don't write can't leak a pointer except through 
the return value, so a void readonly function is implicitly nocapture.

Test these, and add a test that verifies that f1 calling f2 with an 
otherwise dead pointer gets both of them marked nocapture.

llvm-svn: 61552
2009-01-02 03:46:56 +00:00
Duncan Sands 1f11d2bbc1 Mention that this pass does escape analysis in the
leading comments.

llvm-svn: 61548
2009-01-01 20:45:19 +00:00
Bill Wendling 0fcff2c203 Fix comment.
llvm-svn: 61538
2009-01-01 01:19:59 +00:00
Bill Wendling aedb54a947 Add transformation:
xor (or (icmp, icmp), true) -> and(icmp, icmp)

This is possible because of De Morgan's law.

llvm-svn: 61537
2009-01-01 01:18:23 +00:00
Duncan Sands 163848021b Look through phi nodes and select instructions when
calculating nocapture attributes.

llvm-svn: 61535
2008-12-31 20:21:34 +00:00
Duncan Sands df128eb477 Don't analyze arguments already marked 'nocapture'.
llvm-svn: 61532
2008-12-31 18:08:59 +00:00
Duncan Sands 44c8cd97a5 Rename AddReadAttrs to FunctionAttrs, and teach it how
to work out (in a very simplistic way) which function
arguments (pointer arguments only) are only dereferenced
and so do not escape.  Mark such arguments 'nocapture'.

llvm-svn: 61525
2008-12-31 16:14:43 +00:00
Duncan Sands f6069577fa Experiments show that looking through phi nodes
and select instructions doesn't buy anything here
except extra complexity: the only difference in
the entire testsuite was that a readonly function
became readnone in MiBench/consumer-typeset.  Add
a comment about this.

llvm-svn: 61478
2008-12-29 20:51:17 +00:00
Duncan Sands c125d6a3d3 Allow readnone functions to read (and write!) global
constants, since doing so is irrelevant for aliasing
purposes.  While this doesn't increase the total number
of functions marked readonly or readnone in MultiSource/
Applications (3089), it does result in 12 functions being
marked readnone rather than readonly.
Before:
  readnone: 820
  readonly: 2269
After:
  readnone: 832
  readonly: 2257

llvm-svn: 61469
2008-12-29 11:34:09 +00:00
Dale Johannesen 656237beca Revert 61362 and 61402 until SPEC breakage is fixed.
llvm-svn: 61403
2008-12-23 23:21:35 +00:00
Dale Johannesen f8b161bcd1 This fixes the bug in 175.vpr. It doesn't fix the
other SPEC breakage.  I'll be reverting all recent
changes shortly, this checking is mostly so this
change doesn't get lost.

llvm-svn: 61402
2008-12-23 23:05:26 +00:00
Dale Johannesen 93b9aa8799 Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

I owe some testcases for this, want to get it in for nightly runs.

llvm-svn: 61362
2008-12-23 02:12:52 +00:00
Owen Anderson 164274eeb1 Don't forget to remove phi nodes from the value numbering table after we collapse them.
llvm-svn: 61358
2008-12-23 00:49:51 +00:00
Bill Wendling 456e885382 Comment clean-ups. No functionality change.
llvm-svn: 61354
2008-12-22 22:32:22 +00:00
Bill Wendling e7f08e7250 Check that the instruction isn't in the value numbering scope.
llvm-svn: 61353
2008-12-22 22:28:56 +00:00
Bill Wendling 86f01cb9f6 Simplification: Negate the operator== method instead of implementing a full operator!= method.
llvm-svn: 61352
2008-12-22 22:16:31 +00:00
Bill Wendling 3c793441cb Add verification that deleted instruction isn't hiding in the PHI map.
llvm-svn: 61350
2008-12-22 22:14:07 +00:00
Bill Wendling ebb6a543fa Verify removed in a few more places.
llvm-svn: 61349
2008-12-22 21:57:30 +00:00
Bill Wendling 6b18a3994b Add verification functions to GVN which check to see that an instruction was
truely deleted. These will be expanded with further checks of all of the data
structures.

llvm-svn: 61347
2008-12-22 21:36:08 +00:00
Nick Lewycky 10eb8e533f Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
llvm-svn: 61297
2008-12-21 00:19:21 +00:00
Nick Lewycky 4bc10c9e77 Remove redundant test for vector-nature. Scan the vector first to see whether
our optz'n will apply to it, then build the replacement vector only if needed.

llvm-svn: 61279
2008-12-20 16:48:00 +00:00
Evan Cheng 3b3de7c228 - CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges.
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.

llvm-svn: 61248
2008-12-19 18:03:11 +00:00
Bill Wendling 070de29fcf Didn't mean to commit this.
llvm-svn: 61222
2008-12-18 22:19:50 +00:00
Bill Wendling 4c13e77d49 Re-XFAIL this test until debug stuff settles down.
llvm-svn: 61219
2008-12-18 22:13:31 +00:00
Nick Lewycky c3a70ade66 Oops! Left out a line.
Simplifying the sdiv might allow further simplifications for our users.

llvm-svn: 61196
2008-12-18 06:42:28 +00:00
Nick Lewycky 0f0e63fe73 Make all the vector elements positive in an srem of constant vector.
llvm-svn: 61195
2008-12-18 06:31:11 +00:00
Chris Lattner 4caf5eb70c Fix PR2929 by making bugpoint/code extract propagate the nothrow
bit from the original function to the cloned one.

llvm-svn: 61194
2008-12-18 05:52:56 +00:00
Dale Johannesen 3e5843b992 Revert previous patch, appears to break bootstrap.
llvm-svn: 61181
2008-12-18 01:23:41 +00:00
Dale Johannesen 12d031b716 Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  (This patch does not handle 
all the cases where this can happen.)  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Everything above is exercised in
CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
the same IR).

llvm-svn: 61178
2008-12-18 00:57:22 +00:00
Chris Lattner b6372933b5 reapply this hunk from Bill's reversion in r61169, it is conservative
and safe and orthogonal from turning off load pre.

llvm-svn: 61177
2008-12-18 00:51:32 +00:00
Chris Lattner c1c6404bba make instnamer name unnamed blocks as well as instructions and args.
llvm-svn: 61175
2008-12-18 00:33:11 +00:00
Bill Wendling be4fb8a25f Temporarily revert r61027. It was causing a bootstrap failure in "release" mode
with everyone's favorite error messages:

Comparing stages 2 and 3
warning: ./cc1-checksum.o differs
warning: ./cc1plus-checksum.o differs
Bootstrap comparison failure!
./c-decl.o differs
./cp/decl.o differs
./df-core.o differs
./gcc.o differs
./i386.o differs
./stor-layout.o differs
./tree-pretty-print.o differs
./tree.o differs
make[2]: *** [compare] Error 1
make[1]: *** [stage3-bubble] Error 2

See PR3227.

llvm-svn: 61169
2008-12-17 23:31:20 +00:00
Chris Lattner 0cdf52310a insert some sequence points and preincrement an iterator to avoid
iterator invalidation problems.

llvm-svn: 61124
2008-12-17 05:42:08 +00:00
Chris Lattner 222ef4c489 Enhance heap sra to be substantially more aggressive w.r.t PHI
nodes.  This allows it to do fairly general phi insertion if a 
load from a pointer global wants to be SRAd but the load is used
by (recursive) phi nodes.  This fixes a pessimization on ppc
introduced by Load PRE.

llvm-svn: 61123
2008-12-17 05:28:49 +00:00
Dale Johannesen 904ce8120d Clarify that the scale factor from CheckForIVReuse
can be negative.  Keep track of whether all uses of
an IV are outside the loop.  Some cosmetics; no
functional change.

llvm-svn: 61109
2008-12-16 22:16:28 +00:00
Chris Lattner 56b55387fc Fix another crash found by inspection. If we have a PHI node merging
the load multiple times, make sure the check the uses of the PHI to 
ensure they are transformable.

llvm-svn: 61102
2008-12-16 21:24:51 +00:00
Chris Lattner 06a456b3f4 fix a crash found by inspection.
llvm-svn: 61101
2008-12-16 21:04:51 +00:00
Eli Friedman cb61afb546 Add a helper to remove a branch and DCE the condition, and use it
consistently for deleting branches.  In addition to being slightly 
more readable, this makes SimplifyCFG a bit better 
about cleaning up after itself when it makes conditions unused.

llvm-svn: 61100
2008-12-16 20:54:32 +00:00
Chris Lattner 6ddde53783 switch some std::set/std::map to SmallPtrSet/DenseMap.
llvm-svn: 61081
2008-12-16 07:34:30 +00:00
Chris Lattner 49e3bdc165 enhance heap-sra to apply to fixed sized array allocations, not just
variable sized array allocations.

llvm-svn: 61051
2008-12-15 21:44:34 +00:00
Chris Lattner 1c731fa86f Use stripPointerCasts.
llvm-svn: 61047
2008-12-15 21:20:32 +00:00
Chris Lattner f0eb568021 minor tweaks for formatting, allow bitcast in ValueIsOnlyUsedLocallyOrStoredToOneGlobal.
llvm-svn: 61046
2008-12-15 21:08:54 +00:00
Chris Lattner c4274a71d5 refactor some code into a new TryToOptimizeStoreOfMallocToGlobal function.
Use GetElementPtrInst::hasAllZeroIndices where possible.

llvm-svn: 61045
2008-12-15 21:02:25 +00:00
Chris Lattner 0c68ae0603 Enable Load PRE. This teaches GVN to push partially redundant loads up the
CFG when there is exactly one predecessor where the load is not available.
This is designed to not increase code size but still eliminate partially
redundant loads.  This fires 1765 times on 403.gcc even though it doesn't
do critical edge splitting yet (the most common reason for it to fail).

llvm-svn: 61027
2008-12-15 05:28:29 +00:00
Owen Anderson 03aacbae90 Ifdef out some code that I didn't mean to enable by default yet.
llvm-svn: 61024
2008-12-15 03:52:17 +00:00
Chris Lattner 69131fd872 make GVN try to rename inputs to the resultant replaced values, which
cleans up the generated code a bit.  This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)

llvm-svn: 61023
2008-12-15 03:46:38 +00:00
Owen Anderson bfe133e4ac Add support for slow-path GVN with full phi construction for scalars. This is disabled for now, as it actually pessimizes code in the abscence
of phi translation for load elimination.  This slow down GVN a bit, by about 2% on 403.gcc.

llvm-svn: 61021
2008-12-15 02:03:00 +00:00
Chris Lattner f5eef9f6db eliminate warning when asserts disabled.
llvm-svn: 61012
2008-12-14 21:36:23 +00:00
Owen Anderson e34c2399de Generalize GVN's phi construciton routine to work for things other than loads.
llvm-svn: 61009
2008-12-14 19:10:35 +00:00
Bill Wendling 293b9181e5 Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM:
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
  "llvm::APFloat::IEEEsingle", referenced from:
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
  "llvm::APFloat::IEEEdouble", referenced from:
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found

This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.

llvm-svn: 60977
2008-12-13 09:28:44 +00:00
Chris Lattner 1e29f7c97d make RLE preserve the name of the load that it replaces. This is just
a pretification of the IR.

llvm-svn: 60973
2008-12-13 07:22:47 +00:00
Misha Brukman 234b44add2 Fix spelling.
llvm-svn: 60971
2008-12-13 05:21:37 +00:00
Chris Lattner fa9f99aa12 Teach GVN to invalidate some memdep information when it does an RAUW
of a pointer.  This allows is to catch more equivalencies.  For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them.  This change allows
memdep/gvn to catch all 18 when run just once on the function (as is 
typical :) instead of just 3.

On all of 403.gcc, this bumps up the # reundandancies found from:

     63 gvn    - Number of instructions PRE'd
 153991 gvn    - Number of instructions deleted
  50069 gvn    - Number of loads deleted
to:
     63 gvn    - Number of instructions PRE'd
 154137 gvn    - Number of instructions deleted
  50185 gvn    - Number of loads deleted

+120 loads deleted isn't bad.

llvm-svn: 60799
2008-12-09 22:06:23 +00:00
Chris Lattner 254314e6bc rename getNonLocalDependency -> getNonLocalCallDependency, and remove
pointer stuff from it, simplifying the code a bit.

llvm-svn: 60783
2008-12-09 19:38:05 +00:00
Chris Lattner b6fc4b8d92 Switch GVN::processNonLocalLoad to using the new
MemDep::getNonLocalPointerDependency method.  There are
some open issues with this (missed optimizations) and
plenty of future work, but this does allow GVN to eliminate
*slightly* more loads (49246 vs 49033).

Switching over now allows simplification of the other code
path in memdep.

llvm-svn: 60780
2008-12-09 19:25:07 +00:00
Chris Lattner 0a5a8d54a9 random cleanups, no functionality change.
llvm-svn: 60779
2008-12-09 19:21:47 +00:00
Chris Lattner 56b20ffc5f Fix a really subtle off-by-one bug that Duncan noticed with valgrind
on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad.

llvm-svn: 60739
2008-12-09 04:47:21 +00:00
Chris Lattner e598370ae9 remove DebugIterations option. Despite the accusations,
jump threading has been shown to only expose problems not
have bugs itself.  I'm sure it's completely bug free! ;-)

llvm-svn: 60725
2008-12-08 22:44:07 +00:00
Devang Patel 2bb8a2f80f Fix spelling.
Thanks Duncan!

llvm-svn: 60702
2008-12-08 17:07:24 +00:00
Devang Patel 1c469d36b0 Undo previous patch.
llvm-svn: 60701
2008-12-08 17:02:37 +00:00
Chris Lattner f50d7f76c6 fix a bug I introduced in simplifycfg handling single entry phi
nodes. FoldSingleEntryPHINodes deletes the PHI, so there is no
need to delete it afterward.

llvm-svn: 60653
2008-12-07 07:22:45 +00:00
Chris Lattner 5df5b4cc2e don't bother touching volatile stores, they will just return clobber on
everything interesting anyway.

llvm-svn: 60640
2008-12-07 00:25:15 +00:00
Chris Lattner 57e91eaf61 Reimplement the inner loop of DSE. It now uniformly uses getDependence(),
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase).  This eliminates the last external client 
of MemDep::getDependenceFrom().

llvm-svn: 60619
2008-12-06 00:53:22 +00:00
Dale Johannesen 9efd2ce55b Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Chris Lattner 0e3d6337c6 Make a few major changes to memdep and its clients:
1. Merge the 'None' result into 'Normal', making loads
   and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
   distinguish between the cases when memdep knows the value is
   produced from when we just know if may be changed.
3. Move some of the logic for determining whether readonly calls
   are CSEs into memdep instead of it being in GVN.  This still
   leaves verification that the arguments are hte same to GVN to
   let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use 
   getModRefInfo(CallSite,CallSite) instead of doing something 
   very weak.  This only really matters for things like DSA, but
   someday maybe we'll have some other decent context sensitive
   analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
   a) readonly call CSE is slightly simpler
   b) I eliminated the "getDependencyFrom" chaining for load 
      elimination and load CSE doesn't have to worry about 
      volatile (they are always clobbers) anymore.
   c) GVN no longer does any 'lastLoad' caching, leaving it to 
      memdep.
7. The logic in DSE is simplified a bit and sped up.  A potentially
   unsafe case was eliminated.

llvm-svn: 60607
2008-12-05 21:04:20 +00:00
Anton Korobeynikov 24600bf05a Revert invalid r60393. It causes llvm-gcc bootstrap fails in release builds.
See PR3160 for details

llvm-svn: 60604
2008-12-05 19:38:49 +00:00
Chris Lattner c100828026 Fix test/Transforms/GVN/pre-load.ll
llvm-svn: 60594
2008-12-05 17:04:12 +00:00
Chris Lattner d2a653af0c Make IsValueFullyAvailableInBlock safe.
llvm-svn: 60588
2008-12-05 07:49:08 +00:00
Devang Patel c56423b500 Rewrite code that 1) filters loops and 2) calculates new loop bounds.
This fixes many bugs. I will add more test cases in a separate check-in.

Some day, the code that manipulates CFG and updates dom. info could use refactoring help.

llvm-svn: 60554
2008-12-04 21:38:42 +00:00
Chris Lattner 8f723670ce Start simplifying a switch that has a successor that is a switch.
llvm-svn: 60534
2008-12-04 06:31:07 +00:00
Chris Lattner 75c2661d24 add a debugging option to help track down j-t problems.
llvm-svn: 60514
2008-12-04 00:07:59 +00:00
Dale Johannesen 4e9e6ea604 Remove an unused field.
llvm-svn: 60508
2008-12-03 22:43:56 +00:00
Dale Johannesen f7a588b909 Fix a misspelled function name.
llvm-svn: 60506
2008-12-03 20:56:12 +00:00
Chris Lattner dc3f6f2c12 Factor some code into a new FoldSingleEntryPHINodes method.
llvm-svn: 60501
2008-12-03 19:44:02 +00:00
Dale Johannesen d49ceff6ba Fix a really wrong comment.
llvm-svn: 60494
2008-12-03 19:25:46 +00:00
Chris Lattner 595c7279bd Teach jump threading some more simple tricks:
1) have it fold "br undef", which does occur with
   surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks.  This removes the successor
   edges, reducing the in-edges of other blocks, allowing 
   recursive simplification.
3) Fold things like:
     br COND, BBX, BBY
  BBX:
     br COND, BBZ, BBW

   which also happens because jump threading iterates.

llvm-svn: 60470
2008-12-03 07:48:08 +00:00
Chris Lattner 37e0136fef third time is the charm.
llvm-svn: 60469
2008-12-03 07:45:15 +00:00
Chris Lattner c04a1ffa9a fix assertion.
llvm-svn: 60468
2008-12-03 07:43:05 +00:00
Chris Lattner 7eb270ed03 Rename DeleteBlockIfDead to DeleteDeadBlock and make it
unconditionally delete the block.  All likely clients will
do the checking anyway.

llvm-svn: 60464
2008-12-03 06:40:52 +00:00
Chris Lattner bcc904a67c Factor some code out of SimplifyCFG, forming a new
DeleteBlockIfDead method.

llvm-svn: 60463
2008-12-03 06:37:44 +00:00
Dale Johannesen 4d2ecb8f68 Minor rewrite per review feedback.
llvm-svn: 60442
2008-12-02 21:17:11 +00:00
Dale Johannesen 70060013d2 Make the code do what the comment says it does.
llvm-svn: 60431
2008-12-02 18:40:09 +00:00
Chris Lattner 1db9bbe802 Implement PRE of loads in the GVN pass with a pretty cheap and
straight-forward implementation.  This does not require any extra
alias analysis queries beyond what we already do for non-local loads.

Some programs really really like load PRE.  For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.

The biggest limitation to the implementation is that it does not split
critical edges.  This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.

The implementation of this should incidentally speed up rejection of 
non-local loads because it avoids creating the repl densemap in cases 
when it won't be used for fully redundant loads.

This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact.  This is pretty close to ready though.

llvm-svn: 60408
2008-12-02 08:16:11 +00:00
Bill Wendling 87beb9b909 Remove some errors that crept in. No functionality change.
llvm-svn: 60403
2008-12-02 06:24:20 +00:00
Bill Wendling 790b4bf9a9 Merge two if-statements into one.
llvm-svn: 60402
2008-12-02 06:22:04 +00:00
Bill Wendling 5635295266 More styalistic changes. No functionality change.
llvm-svn: 60401
2008-12-02 06:18:11 +00:00
Bill Wendling 85de4b35ca - Remove the buggy -X/C -> X/-C transform. This isn't valid when X isn't a
constant. If X is a constant, then this is folded elsewhere.

- Added a note to Target/README.txt to indicate that we'd like to implement
  this when we're able.

llvm-svn: 60399
2008-12-02 05:12:47 +00:00
Bill Wendling 5369db5917 Improve comment.
llvm-svn: 60398
2008-12-02 05:09:00 +00:00
Bill Wendling 21716dff5e - Reduce nesting.
- No need to do a swap on a canonicalized pattern.

No functionality change.

llvm-svn: 60397
2008-12-02 05:06:43 +00:00
Chris Lattner ead1a61b47 some random comment improvements.
llvm-svn: 60395
2008-12-02 04:52:26 +00:00
Owen Anderson d930420ccf Fix an issue that Chris noticed, where local PRE was not properly instantiating
a new value numbering set after splitting a critical edge.  This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.

llvm-svn: 60393
2008-12-02 04:09:22 +00:00
Dale Johannesen 069a4eee55 Consider only references to an IV within the loop when
figuring out the base of the IV.  This produces better
code in the example.  (Addresses use (IV) instead of 
(BASE,IV) - a significant improvement on low-register
machines like x86).

llvm-svn: 60374
2008-12-01 22:00:01 +00:00
Bill Wendling 6f71bce4cf Don't rebuild RHSNeg. Just use the one that's already there.
llvm-svn: 60370
2008-12-01 21:06:30 +00:00
Bill Wendling 84f6f2539f Document what this check is doing. Also, no need to cast to ConstantInt.
llvm-svn: 60369
2008-12-01 21:03:43 +00:00
Bill Wendling e6c87a4952 Use a simple comparison. Overflow on integer negation can only occur when the
integer is "minint".

llvm-svn: 60366
2008-12-01 19:46:27 +00:00
Bill Wendling 47f733e4ea Generalize the FoldOrWithConstant method to fold for any two constants which
don't have overlapping bits.

llvm-svn: 60344
2008-12-01 08:32:40 +00:00
Bill Wendling 22e761b302 Reduce copy-and-paste code by splitting out the code into its own function.
llvm-svn: 60343
2008-12-01 08:23:25 +00:00
Bill Wendling 582fe6b0ca Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
Bill Wendling 4eecfb655b Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to.
llvm-svn: 60340
2008-12-01 07:47:02 +00:00
Chris Lattner 6f5bf6a718 Rename some variables, only increment BI once at the start of the loop instead of throughout it.
llvm-svn: 60339
2008-12-01 07:35:54 +00:00
Chris Lattner f00aae4968 pull the predMap densemap out of the inner loop of performPRE, so
that it isn't reallocated all the time.  This is a tiny speedup for
GVN: 3.90->3.88s

llvm-svn: 60338
2008-12-01 07:29:03 +00:00
Chris Lattner 2b07d3ccde switch a couple more calls to use array_pod_sort.
llvm-svn: 60337
2008-12-01 06:52:57 +00:00
Chris Lattner 2c2dd15a85 Introduce a new array_pod_sort function and switch LSR to use it
instead of std::sort.  This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.

We should start using array_pod_sort where possible.

llvm-svn: 60335
2008-12-01 06:49:59 +00:00
Chris Lattner 2aebea5735 Eliminate use of setvector for the DeadInsts set, just use a smallvector.
This is a lot cheaper and conceptually simpler.

llvm-svn: 60332
2008-12-01 06:27:41 +00:00
Chris Lattner 4da78e3774 DeleteTriviallyDeadInstructions is always passed the
DeadInsts ivar, just use it directly.

llvm-svn: 60330
2008-12-01 06:14:28 +00:00
Chris Lattner a68a5a4784 simplify DeleteTriviallyDeadInstructions again, unlike my previous
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then 
notifying.

llvm-svn: 60329
2008-12-01 06:11:32 +00:00
Chris Lattner 9e6b243428 simplify these patterns using m_Specific. No need to grep for
xor in testcase (or is a substring).

llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner 88a1f0213d Teach jump threading to clean up after itself, DCE and constfolding the
new instructions it simplifies.  Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block.  Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.

llvm-svn: 60327
2008-12-01 04:48:07 +00:00
Chris Lattner 084b3a47d3 Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs
instead of using FoldPHIArgBinOpIntoPHI.  In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices.  This prevented instcombine
from factoring big chunks of code in 403.gcc.  For example:

 insn_cuid.exit:                
-       %tmp336 = load i32** @uid_cuid, align 4      
-       %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3    
-       %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*               
-       %tmp339 = load i32* %tmp338, align 4           
-       %tmp340 = getelementptr i32* %tmp336, i32 %tmp339     
        br label %bb62
 
 bb61:       
-       %tmp341 = load i32** @uid_cuid, align 4     
-       %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3        
-       %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*           
-       %tmp344 = load i32* %tmp343, align 4        
-       %tmp345 = getelementptr i32* %tmp341, i32 %tmp344          
        br label %bb62
 
 bb62:      
-       %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]         
+       %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]         
+       %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3     
+       %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*  
+       %tmp341.pn = load i32** @uid_cuid     
+       %tmp344.pn = load i32* %tmp344.pn.in 
+       %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn   
        %iftmp.62.0 = load i32* %iftmp.62.0.in     

llvm-svn: 60325
2008-12-01 03:42:51 +00:00
Chris Lattner 9d02a70a7d Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Chris Lattner 9ce8995d24 Make GVN be more intelligent about redundant load
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal.  This makes load elimination
more aggressive.  For example, on 403.gcc, we had:

<     68 gvn    - Number of instructions PRE'd
< 152718 gvn    - Number of instructions deleted
<  49699 gvn    - Number of loads deleted
<   6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses

now we have:

>     64 gvn    - Number of instructions PRE'd
> 153623 gvn    - Number of instructions deleted
>  49856 gvn    - Number of loads deleted
>   5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses

That's an extra 157 loads deleted and extra 905 other instructions nuked.

This slows down GVN very slightly, from 3.91 to 3.96s.

llvm-svn: 60314
2008-12-01 01:31:36 +00:00
Chris Lattner 7e61dafc95 Reimplement the non-local dependency data structure in terms of a sorted
vector instead of a densemap.  This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster.  This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc

This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case.  Here's the stats
for 403.gcc:

  6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses

yay for caching :)

llvm-svn: 60313
2008-12-01 01:15:42 +00:00
Bill Wendling 5b902c5b1e Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.

llvm-svn: 60312
2008-12-01 01:07:11 +00:00
Chris Lattner 8541edec44 Cache analyses in ivars and add some useful DEBUG output.
This speeds up GVN from 4.0386s to 3.9376s.

llvm-svn: 60310
2008-12-01 00:40:32 +00:00
Chris Lattner 80c7d81e81 improve indentation, do cheap checks before expensive ones,
remove some fixme's.  This speeds up GVN very slightly on 403.gcc 
(4.06->4.03s)

llvm-svn: 60309
2008-11-30 23:39:23 +00:00
Eli Friedman 11c15a5de7 Minor cleanup: use getTrue and getFalse where appropriate. No
functional change.

llvm-svn: 60307
2008-11-30 22:48:49 +00:00
Eli Friedman 55e4becba9 Some minor cleanups to instcombine; no functionality change.
Note that the FoldOpIntoPhi call is dead because it's impossible for the 
first operand of a subtraction to be both a ConstantInt and a PHINode.

llvm-svn: 60306
2008-11-30 21:09:11 +00:00
Bill Wendling de89bc275c Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling 9eef421e12 Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.

llvm-svn: 60290
2008-11-30 13:08:13 +00:00
Bill Wendling 2fe3229824 Forgot one remaining call to getSExtValue().
llvm-svn: 60289
2008-11-30 12:41:09 +00:00
Bill Wendling 2d2e7861b5 getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
2008-11-30 12:38:24 +00:00
Eli Friedman 09bc610945 Optimize memmove and memset into the LLVM builtins. Note that these
only show up in code from front-ends besides llvm-gcc, like clang.

llvm-svn: 60287
2008-11-30 08:32:11 +00:00
Bill Wendling 7abf352f44 Don't make TwoToExp signed by default.
llvm-svn: 60279
2008-11-30 05:29:33 +00:00
Bill Wendling af200e9237 From Hacker's Delight:
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."

In this case, x == -1.

llvm-svn: 60278
2008-11-30 05:01:05 +00:00
Bill Wendling 70635adea3 Instcombine was illegally transforming -X/C into X/-C when either X or C
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
2008-11-30 03:42:12 +00:00