Commit Graph

19671 Commits

Author SHA1 Message Date
Owen Anderson 68c6732d2c Clean up a bunch of caching stuff in memdep. This reduces the time to run GVN
on 403.gcc from ~15s to ~10s.

llvm-svn: 40884
2007-08-07 00:33:45 +00:00
Devang Patel c70106cb30 Begin loop index split pass.
llvm-svn: 40883
2007-08-07 00:25:56 +00:00
Owen Anderson 4898513d96 Improve the accuracy of memdep for determining the dependencies of loads.
This brings GVN to parity with GCSE+LoadVN.

llvm-svn: 40882
2007-08-06 23:26:03 +00:00
Dale Johannesen a010822b45 Replace 4-line function with 10-line version per review comment.
llvm-svn: 40881
2007-08-06 22:10:35 +00:00
Dale Johannesen d1822ea7d1 Move lengthy conditional down 1 level per review comment.
llvm-svn: 40878
2007-08-06 21:48:35 +00:00
Dale Johannesen 75169a82d6 Get X86 long double calling convention to work
(on Darwin, anyway).  Fix some table omissions for
LD arithmetic.

llvm-svn: 40877
2007-08-06 21:31:06 +00:00
Chris Lattner f72a2db072 regenerate
llvm-svn: 40875
2007-08-06 21:00:46 +00:00
Chris Lattner 6a5a2620ba Fix PR1577, a crash on invalid bug.
llvm-svn: 40874
2007-08-06 21:00:37 +00:00
Chandler Carruth bebc3bb2e3 This resolves a regression of BasicAA which failed to find any memory information for overloaded intrinsics (PR1600). This resolves that issue, and improves the matching scheme to use a BitVector rather than a binary search.
llvm-svn: 40872
2007-08-06 20:57:16 +00:00
Nick Lewycky 8052019a20 It's safe to fold not of fcmp.
llvm-svn: 40870
2007-08-06 20:04:16 +00:00
Dale Johannesen e279fd6ce8 Make 80-bit store maintain simulated FP stack correctly.
llvm-svn: 40868
2007-08-06 19:50:32 +00:00
Nick Lewycky 96606cec20 Let scalar-evolution analyze loops with an unsigned comparison for the exit
condition. Fixes 1597.

llvm-svn: 40867
2007-08-06 19:21:00 +00:00
Nick Lewycky b9819f3a8b Don't assume it's safe to transform a loop just because it's dominated by any
comparison. Fixes bug 1598.

llvm-svn: 40866
2007-08-06 18:33:46 +00:00
Chris Lattner 079ebcfae5 Fix a regression compiling 2005-05-11-Popcount-ffs-fls with the CBE,
introduced by chandler's patch.

llvm-svn: 40864
2007-08-06 16:36:18 +00:00
Christopher Lamb 2e5fb9f71e Implement review feedback. No functionality change.
llvm-svn: 40863
2007-08-06 16:33:56 +00:00
David Greene 77b2accbca Make this code more efficient.
llvm-svn: 40861
2007-08-06 15:09:17 +00:00
Chris Lattner c7ba225705 remove some dead lines
llvm-svn: 40859
2007-08-06 06:21:06 +00:00
Chris Lattner 67f1c3335c 1. Random tidiness cleanups
2. Make domtree printing print dfin/dfout #'s
3. Fix the Transforms/LoopSimplify/2004-04-13-LoopSimplifyUpdateDomFrontier.ll failure from last night (in DominanceFrontier::splitBlock).

w.r.t. #3, my patches last night happened to expose the bug, but this 
has been broken since Owen's r35839 patch to LoopSimplify.  The code
was subsequently moved over from LoopSimplify into Dominators, carrying
the latent bug.  Fun stuff.

llvm-svn: 40858
2007-08-06 06:19:47 +00:00
Reid Spencer 446282ae3d Fix minor doxygen nits.
llvm-svn: 40854
2007-08-05 20:06:04 +00:00
Reid Spencer d959cfc882 Silence some warnings from doxygen about @param argument name not matching the
actual argument name of the documented function.

llvm-svn: 40851
2007-08-05 19:35:22 +00:00
Reid Spencer f13bcdc4a4 Escape some escapes that confuse doxygen.
llvm-svn: 40850
2007-08-05 19:33:11 +00:00
Reid Spencer 05d55b396c Fix a doxygen directive.
llvm-svn: 40849
2007-08-05 19:27:01 +00:00
Dale Johannesen b1888e73ad Long double patch 4 of N: initial x87 implementation.
Lots of problems yet but some simple things work.

llvm-svn: 40847
2007-08-05 18:49:15 +00:00
Chris Lattner 6299a45277 shorten this name
llvm-svn: 40843
2007-08-05 18:45:33 +00:00
Chris Lattner f0da7975ea at the end of instcombine, explicitly clear WorklistMap.
This shrinks it down to something small.  On the testcase
from PR1432, this speeds up instcombine from 0.7959s to 0.5000s,
(59%)

llvm-svn: 40840
2007-08-05 08:47:58 +00:00
Chris Lattner 6493fc79fe Upgrade BasicAliasAnalysis::getModRefBehavior to not call Value::getName,
which dynamically allocates the string result.  This speeds up dse on the
testcase from PR1432 from 0.3781s to 0.1804s (2.1x).

llvm-svn: 40838
2007-08-05 07:50:06 +00:00
Chris Lattner 44f7d3aa0b When clearing a SmallPtrSet, if the set had a huge capacity, but the
contents of the set were small, deallocate and shrink the set.  This
avoids having us to memset as much data, significantly speeding up
some pathological cases.  For example, this speeds up the verifier
from 0.3899s to 0.0763 (5.1x) on the testcase from PR1432 in a 
release build.

llvm-svn: 40837
2007-08-05 07:32:14 +00:00
Chris Lattner d2eb0c9653 Fix an iterator invalidation bug I induced.
llvm-svn: 40830
2007-08-05 00:24:30 +00:00
Chris Lattner 0e8f85f87e Switch some std::sets to SmallPtrSet. This speeds up
domtree by 10% and postdomtree by 17%

llvm-svn: 40829
2007-08-05 00:15:57 +00:00
Chris Lattner 5f5585c432 Switch DomTreeNode::assignDFSNumber from using a std::set to using
a smallptrset.  This speeds up domtree by about 15% and postdomtree by 20%.

llvm-svn: 40828
2007-08-05 00:10:08 +00:00
Chris Lattner 77e05fe20d Switch the internal "Info" map from an std::map to a DenseMap. This
speeds up idom by about 45% and postidom by about 33%.

Some extra precautions must be taken not to invalidate densemap iterators.

llvm-svn: 40827
2007-08-05 00:02:00 +00:00
Chris Lattner bd0fe01daf switch the DomTreeNodes and IDoms maps in idom/postidom to a
DenseMap instead of an std::map.  This speeds up postdomtree
by about 25% and domtree by about 23%.  It also speeds up clients,
for example, domfrontier by 11%, mem2reg by 4% and ADCE by 6%.

llvm-svn: 40826
2007-08-04 23:48:07 +00:00
Chris Lattner edce70d2fe rewrite the code used to construct pruned SSA form with the IDF method.
In the old way, we computed and inserted phi nodes for the whole IDF of 
the definitions of the alloca, then computed which ones were dead and
removed them.

In the new method, we first compute the region where the value is live,
and use that information to only insert phi nodes that are live.  This
eliminates the need to compute liveness later, and stops the algorithm
from inserting a bunch of phis which it then later removes.

This speeds up the testcase in PR1432 from 2.00s to 0.15s (14x) in a
release build and 6.84s->0.50s (14x) in a debug build.

llvm-svn: 40825
2007-08-04 22:50:14 +00:00
Chris Lattner d91576b01e Factor out a whole bunch of code into it's own method.
llvm-svn: 40824
2007-08-04 21:14:29 +00:00
Chris Lattner 4e1b4140eb Use getNumPreds(BB) instead of computing them manually. This is a very small but
measurable speedup.

llvm-svn: 40823
2007-08-04 21:06:15 +00:00
Chris Lattner b6a4ba808b Change the rename pass to be "tail recursive", only adding N-1 successors
to the worklist, and handling the last one with a 'tail call'.  This speeds
up PR1432 from 2.0578s to 2.0012s (2.8%)

llvm-svn: 40822
2007-08-04 20:40:27 +00:00
Chris Lattner 840259c8d3 cache computation of #preds for a BB. This speeds up
mem2reg from 2.0742->2.0522s on PR1432.

llvm-svn: 40821
2007-08-04 20:24:50 +00:00
Chris Lattner 050bac4bed reserve operand space for phi nodes when we insert them.
llvm-svn: 40820
2007-08-04 20:14:34 +00:00
Chris Lattner 9318785df5 use continue to avoid nesting, no functionality change.
llvm-svn: 40819
2007-08-04 20:07:06 +00:00
Chris Lattner 6b04ecbaf9 Promoting allocas with the 'single store' fastpath is
faster than with the 'local to a block' fastpath.  This speeds
up PR1432 from 2.1232 to 2.0686s (2.6%)

llvm-svn: 40818
2007-08-04 20:03:23 +00:00
Chris Lattner 4a930f9444 When PromoteLocallyUsedAllocas promoted allocas, it didn't remember
to increment NumLocalPromoted, and didn't actually delete the
dead alloca, leading to an extra iteration of mem2reg.

llvm-svn: 40817
2007-08-04 20:01:43 +00:00
Chris Lattner 63c039780c std::map -> DenseMap
llvm-svn: 40816
2007-08-04 19:52:20 +00:00
Nick Lewycky 20f0811fc0 Clean up comments, fix up some confusing code logic.
Predsimplify fails llvm-gcc bootstrap.

llvm-svn: 40815
2007-08-04 18:45:32 +00:00
Chris Lattner 7d382f7680 fix a logic bug where we wouldn't promote single store allocas if the
stored value was a non-instruction value.  Doh.

This increase the # single store allocas from 8982 to 9026, and
speeds up mem2reg on the testcase in PR1432 from 2.17 to 2.13s.

llvm-svn: 40813
2007-08-04 02:45:02 +00:00
Chris Lattner 1b215f0661 When we do the single-store optimization, delete both the store
and the alloca so they don't get reprocessed.

This speeds up PR1432 from 2.20s to 2.17s.

llvm-svn: 40812
2007-08-04 02:38:38 +00:00
Chris Lattner 862f125457 Three improvements:
1. Check for revisiting a block before checking domination, which is faster.
  2. If the stored value isn't an instruction, we don't have to check for domination.
  3. If we have a value used in the same block more than once, make sure to remove the
     block from the UsingBlocks vector.  Not doing so forces us to go through the slow
     path for the alloca.

The combination of these improvements increases the number of allocas on the fastpath
from 8935 to 8982 on PR1432.  This speeds it up from 2.90s to 2.20s (31%)

llvm-svn: 40811
2007-08-04 02:32:22 +00:00
Chris Lattner ae1e00eb36 switch from using a std::set to using a SmallPtrSet. This speeds up the
testcase in PR1432 from 6.33s to 2.90s (2.22x)

llvm-svn: 40810
2007-08-04 02:21:22 +00:00
Chris Lattner 9181801bb7 In mem2reg, when handling the single-store case, make sure to remove
a using block from the list if we handle it.  Not doing this caused us
to not be able to promote (with the fast path) allocas which have uses (whoops).

This increases the # allocas hitting this fastpath from 4042 to 8935 on the
testcase in PR1432, speeding up mem2reg by 2.6x

llvm-svn: 40809
2007-08-04 02:15:24 +00:00
Chandler Carruth 450f95c857 Regenerating.
llvm-svn: 40808
2007-08-04 01:56:21 +00:00
Chandler Carruth 7132e00de7 This is the patch to provide clean intrinsic function overloading support in LLVM. It cleans up the intrinsic definitions and generally smooths the process for more complicated intrinsic writing. It will be used by the upcoming atomic intrinsics as well as vector and float intrinsics in the future.
This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select, and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM reader and the bitcode reader. The test cases have been updated, with special tests added to ensure the automatic upgrading is supported.

llvm-svn: 40807
2007-08-04 01:51:18 +00:00