Commit Graph

13660 Commits

Author SHA1 Message Date
Evan Cheng 65b8ccf6ac Revert r124518. It broke Linux self-host.
llvm-svn: 124522
2011-01-29 02:43:04 +00:00
Evan Cheng d4eff31476 Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand.
llvm-svn: 124518
2011-01-29 01:29:26 +00:00
Andrew Trick 24f5ff0f23 Implementation of path profiling.
Modified patch by Adam Preuss.

This builds on the existing framework for block tracing, edge profiling and optimal edge profiling.
See -help-hidden for new flags.
For documentation, see the technical report "Implementation of Path Profiling..." in llvm.org/pubs.

llvm-svn: 124515
2011-01-29 01:09:53 +00:00
Duncan Sands 771e82a863 My auto-simplifier noticed that ((X/Y)*Y)/Y occurs several times in SPEC
benchmarks, and that it can be simplified to X/Y.  (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case).  This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too.  This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too.  It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.

llvm-svn: 124487
2011-01-28 16:51:11 +00:00
Nick Lewycky 6593e6eaf6 Add missing include for ptrdiff_t. Patch by Joerg Sonnenberger!
llvm-svn: 124470
2011-01-28 03:52:25 +00:00
Roman Divacky 36b1b47c5a Introduce virtual ParseRegister method in TargetAsmParser.
Create override of this method in X86/ARM/MBlaze.

llvm-svn: 124378
2011-01-27 17:14:22 +00:00
Nick Lewycky 0c572ebc46 Add DenseSet::resize for API parity with DenseMap::resize.
llvm-svn: 124370
2011-01-27 09:10:42 +00:00
Jay Foad b0c5e35929 Simplify User::operator delete().
llvm-svn: 124330
2011-01-26 21:56:10 +00:00
Eric Christopher cd55a46c31 Temporarily revert 124275 to see if it brings the dragonegg buildbot back.
llvm-svn: 124312
2011-01-26 19:40:31 +00:00
David Greene bab5e6ed0e [AVX] Add INSERT_SUBVECTOR and support it on x86. This provides a
default implementation for x86, going through the stack in a similr
fashion to how the codegen implements BUILD_VECTOR.  Eventually this
will get matched to VINSERTF128 if AVX is available.

llvm-svn: 124307
2011-01-26 19:13:22 +00:00
David Greene b6f1611928 [AVX] Support EXTRACT_SUBVECTOR on x86. This provides a default
implementation of EXTRACT_SUBVECTOR for x86, going through the stack
in a similr fashion to how the codegen implements BUILD_VECTOR.
Eventually this will get matched to VEXTRACTF128 if AVX is available.

llvm-svn: 124292
2011-01-26 15:38:49 +00:00
Eric Christopher 078159e310 Separate out the constant bonus from the size reduction metrics. Rework
a few loops accordingly. Should be no functional change.

This is a step for more accurate cost/benefit analysis of devirt/inlining
bonuses.

llvm-svn: 124275
2011-01-26 02:58:39 +00:00
Devang Patel efc6b16e4b Provide an interface to transfer SDDbgValue from one SDNode to another.
llvm-svn: 124245
2011-01-25 23:27:42 +00:00
David Greene 78221d4679 [AVX] Fix a typo in the extract subvector type constraints to specify
the correct number of operands.

llvm-svn: 124234
2011-01-25 22:05:41 +00:00
David Greene 672a4a297e [AVX] Add TableGen classes for vector/subvector type constraints.
This will be used to check patterns referencing a forthcoming
INSERT_SUBVECTOR SDNode and will also be used to check
EXTRACT_SUBVECTOR nodes.

llvm-svn: 124191
2011-01-25 16:16:32 +00:00
Jay Foad 0c125f9f02 Avoid compiler errors when this header file is included first, by adding
a forward declaration of simplify_type<>.

llvm-svn: 124187
2011-01-25 14:33:33 +00:00
Duncan Sands d395108394 According to my auto-simplifier the most common missed simplifications in
optimized code are:
  (non-negative number)+(power-of-two) != 0 -> true
and
  (x | 1) != 0 -> true
Instcombine knows about the second one of course, but only does it if X|1
has only one use.  These fire thousands of times in the testsuite.

llvm-svn: 124183
2011-01-25 09:38:29 +00:00
Eric Christopher 6e959a5687 Perhaps a bit too much vertical whitespace.
llvm-svn: 124148
2011-01-24 22:19:14 +00:00
Dan Gohman 0f124e1987 Give GetUnderlyingObject a TargetData, to keep it in sync
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.

Also, update several off GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.

llvm-svn: 124134
2011-01-24 18:53:32 +00:00
Rafael Espindola b3eca9bb71 Add support for the --noexecstack option.
llvm-svn: 124077
2011-01-23 17:55:27 +00:00
Cameron Zwarich b5ee5168a3 Convert a std::vector to a SmallVector for another 5.4% speedup on domtree.
llvm-svn: 124065
2011-01-23 06:54:22 +00:00
Cameron Zwarich a9339c6cd0 In the simpler version of the link-eval data structure that we use in dominator
computation, the Ancestor field is always set to the Parent, so we can remove
the explicit link entirely and merge the Parent and Ancestor fields. Instead of
checking for whether an ancestor exists for a node or not, we simply check
whether the node has already been processed. This is simpler if Compress is
inlined into Eval, so I did that as well.

This is about a 3% speedup running -domtree on test-suite + SPEC2000 & SPEC2006,
but it also opens up some opportunities for further improvement.

llvm-svn: 124061
2011-01-23 06:16:06 +00:00
Rafael Espindola 4b7b7fba38 Delay the creation of eh_frame so that the user can change the defaults.
Add support for SHT_X86_64_UNWIND.

llvm-svn: 124059
2011-01-23 05:43:40 +00:00
Cameron Zwarich ac5a95b68b Remove useless struct fields.
llvm-svn: 124058
2011-01-23 05:11:18 +00:00
Cameron Zwarich 652696fc13 Remove friend declaration for removed function.
llvm-svn: 124057
2011-01-23 04:54:34 +00:00
Rafael Espindola 0e7e34e476 Remove more duplicated code.
llvm-svn: 124056
2011-01-23 04:43:11 +00:00
Cameron Zwarich cfeb0eee06 Convert a std::vector to a SmallVector.
llvm-svn: 124055
2011-01-23 04:30:59 +00:00
Rafael Espindola aea4958ea6 Remove duplicated code.
llvm-svn: 124054
2011-01-23 04:28:49 +00:00
Cameron Zwarich a914bed800 Simplify some code now that we've removed the more optimal (but slower) version
of the link-eval data structure from dominator computation.

llvm-svn: 124053
2011-01-23 04:13:53 +00:00
Benjamin Kramer 54cd042125 Remove dead ivar.
llvm-svn: 124028
2011-01-22 12:13:28 +00:00
Chris Lattner d5407dd2cd add DW_TAG for rvalue refs.
llvm-svn: 124019
2011-01-22 01:47:25 +00:00
Renato Golin 83758d5cd7 Clang was not parsing target triples involving EABI and was generating wrong IR (wrong PCS) and passing the wrong information down llc via the target-triple printed in IR. I've fixed this by adding the parsing of EABI into LLVM's Triple class and using it to choose the correct PCS in Clang's Tools. A Clang patch is on its way to use this infrastructure.
llvm-svn: 123990
2011-01-21 18:25:47 +00:00
Oscar Fuentes 6495595382 Handles libffi on the CMake build.
Patch by arrowdodger!

llvm-svn: 123976
2011-01-21 15:42:54 +00:00
Andrew Trick 47ff14b091 Convert -enable-sched-cycles and -enable-sched-hazard to -disable
flags. They are still not enable in this revision.

Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.

Generalized unit tests to work with sched-cycles.

llvm-svn: 123969
2011-01-21 05:51:33 +00:00
Michael J. Spencer 0324b67267 Object: Fix type punned pointer issues by making DataRefImpl a union and using intptr_t.
llvm-svn: 123962
2011-01-21 02:27:02 +00:00
Evan Cheng b8b0ad80a8 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.

llvm-svn: 123905
2011-01-20 08:34:58 +00:00
Michael J. Spencer b60a18dea8 Object: Add ELF support.
llvm-svn: 123896
2011-01-20 06:38:47 +00:00
Cameron Zwarich a9797804da Remove an unnecessary #include.
llvm-svn: 123877
2011-01-20 03:56:35 +00:00
Cameron Zwarich ce25e88218 There is no point in verifying an analysis that is never updated.
llvm-svn: 123743
2011-01-18 05:44:04 +00:00
Cameron Zwarich 66f3c66b9d Remove some now-unused DominanceFrontier methods.
llvm-svn: 123726
2011-01-18 04:21:57 +00:00
Cameron Zwarich 4694e69540 Remove outdated references to dominance frontiers.
llvm-svn: 123724
2011-01-18 03:53:26 +00:00
Jim Grosbach 834c373e3f Trailing whitespace.
llvm-svn: 123665
2011-01-17 18:34:03 +00:00
Cameron Zwarich b410858a5f Roll r123609 back in with two changes that fix test failures with expensive
checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.

llvm-svn: 123662
2011-01-17 17:38:41 +00:00
Devang Patel ea49cb04a5 Revert rr123550. It causes clang build failure on darwin9.
llvm-svn: 123661
2011-01-17 17:34:43 +00:00
Oscar Fuentes e65ed1c794 Add some platform checks. Also fix a typo on a Makefile.
Patch by arrowdodger!

llvm-svn: 123659
2011-01-17 16:35:14 +00:00
Jay Foad fe87364215 Remove useless Tag enumeration.
llvm-svn: 123623
2011-01-17 15:18:06 +00:00
Cameron Zwarich 67431d7943 Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot.
llvm-svn: 123618
2011-01-17 07:26:51 +00:00
Cameron Zwarich 814cd9233e Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.

llvm-svn: 123609
2011-01-17 01:08:59 +00:00
Michael J. Spencer 5ce56081c7 UnRevert "Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.""
llvm-svn: 123605
2011-01-16 23:39:59 +00:00
Jay Foad bbb91f2b22 Simplify the construction and destruction of Uses. Simplify
User::dropHungOffUses().

llvm-svn: 123580
2011-01-16 15:30:52 +00:00
Jay Foad 5ded9df82a Remove unnecessary specialization OperandTraits<User>.
llvm-svn: 123577
2011-01-16 08:23:16 +00:00
Jay Foad 59809c7a62 Move the implementation of the User class into a new source file,
User.cpp.

llvm-svn: 123575
2011-01-16 08:10:57 +00:00
Michael J. Spencer 2ff30b84f8 Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1."
llvm-svn: 123557
2011-01-16 01:43:22 +00:00
Chris Lattner 1e209b87ad remove the partial specialization pass. It is unmaintained and has bugs.
llvm-svn: 123554
2011-01-16 00:27:10 +00:00
Michael J. Spencer 53dcdc7420 Archive: Fix spelling.
llvm-svn: 123552
2011-01-15 21:43:45 +00:00
Michael J. Spencer a0ce763290 Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.
llvm-svn: 123551
2011-01-15 21:43:37 +00:00
Michael J. Spencer 8685f387eb Support/GraphWriter: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.
llvm-svn: 123550
2011-01-15 21:43:25 +00:00
Michael J. Spencer 94b2ab3556 Support/PathV2: Add identify_magic.
llvm-svn: 123548
2011-01-15 20:39:36 +00:00
Michael J. Spencer ee1699c362 Support/PathV2: Implement get_magic.
llvm-svn: 123544
2011-01-15 18:52:33 +00:00
Oscar Fuentes 25ac830e72 Make config.h.cmake similar to config.h.in
Patch by arrowdodger!

llvm-svn: 123539
2011-01-15 13:35:37 +00:00
Nick Lewycky 367f98f000 Teach LazyValueInfo that allocas aren't NULL. Over all of llvm-test, this saves
half a million non-local queries, each of which would otherwise have triggered a
linear scan over a basic block.

Also fix a fixme for memory intrinsics which dereference pointers. With this,
we prove that a pointer is non-null because it was dereferenced by an intrinsic
112 times in llvm-test.

llvm-svn: 123533
2011-01-15 09:16:12 +00:00
Chris Lattner c23ca1f217 fix typo
llvm-svn: 123519
2011-01-15 06:27:35 +00:00
Chris Lattner 76580f0ec3 Fix m_Not and m_Neg to not match random ConstantInt's. Before
these would try hard to match constants by inverting the bits
and recursively matching.  There are two problems with this:
1) some patterns would match when we didn't want them to (theoretical)
2) this is insanely expensive to do, and most often pointless.

This was apparently useful in just 2 instcombine cases, which I
added code to handle explicitly.  This change speeds up 'opt'
time on 176.gcc by 1% and produces bitwise identical code.

llvm-svn: 123518
2011-01-15 05:52:27 +00:00
Chris Lattner b68ec5c339 Generalize LoadAndStorePromoter a bit and switch LICM
to use it.

llvm-svn: 123501
2011-01-15 00:12:35 +00:00
Anton Korobeynikov 9be547cfd3 Add a possibility to switch between CFI directives- and table-based frame description emission. Currently all the backends use table-based stuff.
llvm-svn: 123476
2011-01-14 21:58:08 +00:00
Anton Korobeynikov b46ef57de5 Add CFI directives-based frame information emission. Not hooked yet.
llvm-svn: 123474
2011-01-14 21:57:53 +00:00
Chris Lattner 95294b8796 Add a new LoadAndStorePromoter class, which implements the general
"promote a bunch of load and stores" logic, allowing the code to
be shared and reused.

llvm-svn: 123456
2011-01-14 19:36:13 +00:00
Jay Foad cbe1505617 OperandTraits<>::Layout isn't used for anything. Remove it.
llvm-svn: 123452
2011-01-14 18:41:56 +00:00
Oscar Fuentes 959d253476 Reorder macros on config.h.cmake to easily compare it against
config.h.in.

Patch by arrowdodger!

llvm-svn: 123445
2011-01-14 16:41:03 +00:00
Chris Lattner 8d7716a220 switch the second scalarrepl pass to use SSAUpdater. We run two scalarrepl passes: one
early in the cleanup code and one late interlaced with the inliner.  The second one is
important because inlining and other scalar optzns can unpin allocas, allowing them to 
be split up and promoted.  While important for performance, this is also relatively
rare, and we would previously force a (non-lazy) computation of DomFrontiers, which 
happened even if nothing became unpinned.

With this patch, the first pass of scalarrepl still promotes the vast bulk of allocas
in programs, but hte second pass has changed to use SSAUpdater, which is more "sparse"
and lazy.  This speeds up opt -O3 time on kimwitu++ (a c++ app) by about 1%.  The
numbers are interesting: the first pass promotes ~17500 allocas.  The second pass
promotes about 1600.  For non-C++ codes, the compile time win should be greater, 
because the second pass of scalarrepl does less.

llvm-svn: 123437
2011-01-14 08:21:08 +00:00
Chris Lattner 9987a6f49b split SROA into two passes: one that uses DomFrontiers (-scalarrepl)
and one that uses SSAUpdater (-scalarrepl-ssa)

llvm-svn: 123436
2011-01-14 08:13:00 +00:00
Jay Foad 1d4a8fe156 Remove casts between Value** and Constant**, which won't work if a
static_cast from Constant* to Value* has to adjust the "this" pointer.
This is groundwork for PR889.

llvm-svn: 123435
2011-01-14 08:07:43 +00:00
Evan Cheng d4a5c05c97 Completed :lower16: / :upper16: support for movw / movt pairs on Darwin.
- Fixed :upper16: fix up routine. It should be shifting down the top 16 bits first.
- Added support for Thumb2 :lower16: and :upper16: fix up.
- Added :upper16: and :lower16: relocation support to mach-o object writer.

llvm-svn: 123424
2011-01-14 02:38:49 +00:00
Owen Anderson e3ed20ce9c Rather than doing early instcombine, try doing early CSE instead. This should still handle
most important simplifications, as well as resolving phase ordering issues where instcombine
would inhibit important CSE'ing opportunities, for instance on BitBench/drop3.

llvm-svn: 123418
2011-01-14 00:41:11 +00:00
Duncan Sands 7f60dc1eb0 Move some shift transforms out of instcombine and into InstructionSimplify.
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result can
not be anything.  I fixed this in the constant folder as well.  Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero.  This is in accordance with the LangRef, but I must
admit that it is fairly aggressive.  Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.

llvm-svn: 123417
2011-01-14 00:37:45 +00:00
Owen Anderson ae6ce377c2 Don't bother conditionalizing the use of SROA in -O1 mode. We're already running it unconditionally
later in the pipeline.

llvm-svn: 123416
2011-01-14 00:36:40 +00:00
Tobias Grosser b1d11c19da Add single entry / single exit accessors.
Add methods for accessing the (single) entry / exit edge of a region. If no such
edge exists, null is returned.  Both accessors return the start block of the
corresponding edge. The edge can finally be formed by utilizing
Region::getEntry() or Region::getExit();

Contributed by: Andreas Simbuerger <simbuerg@fim.uni-passau.de>

llvm-svn: 123410
2011-01-13 23:18:04 +00:00
Jakob Stoklund Olesen 4bc5e38960 Teach frame lowering to ignore debug values after the terminators.
llvm-svn: 123399
2011-01-13 21:28:52 +00:00
Oscar Fuentes 0a1b6b8ff2 Add some platform tests.
Patch by arrowdodger!

llvm-svn: 123388
2011-01-13 19:17:28 +00:00
Oscar Fuentes f8e26b123c Platform tests for argz_* functions.
Patch by arrowdodger!

llvm-svn: 123376
2011-01-13 15:06:32 +00:00
Evan Cheng 965b3c7323 Model :upper16: and :lower16: as ARM specific MCTargetExpr. This is a step
in the right direction. It eliminated some hacks and will unblock codegen
work. But it's far from being done. It doesn't reject illegal expressions,
e.g. (FOO - :lower16:BAR). It also doesn't work in Thumb2 mode at all.

llvm-svn: 123369
2011-01-13 07:58:56 +00:00
Michael J. Spencer d9960c69b5 Support/Path: Deprecate PathV1::IsSymlink and replace all uses with PathV2::is_symlink.
llvm-svn: 123345
2011-01-12 23:55:06 +00:00
Jakob Stoklund Olesen 71a3853332 Annotate VirtRegRewriter debug output with slot indexes.
llvm-svn: 123333
2011-01-12 22:28:48 +00:00
Jakob Stoklund Olesen 62e01eefde Assert if anybody tries to put a slot index on a DBG_VALUE instruction.
llvm-svn: 123323
2011-01-12 21:27:45 +00:00
Jakob Stoklund Olesen 819eb4ed0b Put the Dominator improvements back in. They were not the cause of bootstrap miscomparisons.
llvm-svn: 123273
2011-01-11 21:23:09 +00:00
Jakob Stoklund Olesen 32bd3a1e9a Speculatively revert the recent improvements to Dominators.h in an attempt to track down the gcc bootstrap miscompare.
llvm-svn: 123254
2011-01-11 19:26:30 +00:00
Chris Lattner d30de95520 some comment improvements.
llvm-svn: 123243
2011-01-11 17:11:59 +00:00
Jay Foad c8adf5f458 FixedNumOperandTraits and VariadicOperandTraits assumed that, given a
"this" pointer for any subclass of User, you could static_cast it to
User* and then reinterpret_cast that to Use* to get the end of the
operand list. This isn't a safe assumption in general, because the
static_cast might adjust the "this" pointer. Fixed by having these
OperandTraits classes take an extra template parameter, which is the
subclass of User. This is groundwork for PR889.

llvm-svn: 123235
2011-01-11 15:07:38 +00:00
Oscar Fuentes c8cb58e130 Add to the CMake build some options and platform tests supported by
the traditional build.

Patch by arrowdodger!

llvm-svn: 123233
2011-01-11 12:31:54 +00:00
Chris Lattner f6ae904e34 Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes
phi nodes.  It is called from MergeBlockIntoPredecessor which is 
called from GVN, which claims to preserve these.

I'm skeptical that this is the actual problem behind PR8954, but
this is a stab in the right direction.

llvm-svn: 123222
2011-01-11 08:13:40 +00:00
Michael J. Spencer 0d771edeee Support/Path: Deprecate PathV1::isDirectory and replace all uses with PathV2::is_directory.
llvm-svn: 123209
2011-01-11 01:21:55 +00:00
Anton Korobeynikov 2f93128109 Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Jakob Stoklund Olesen 2fb5b31578 Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic.
These functions not longer assert when passed 0, but simply return false instead.

No functional change intended.

llvm-svn: 123155
2011-01-10 02:58:51 +00:00
Michael J. Spencer 58df2e00b2 Support/Path: Deprecate PathV1::exists and replace all uses with PathV2::fs::exists.
llvm-svn: 123151
2011-01-10 02:34:23 +00:00
Jakob Stoklund Olesen 8786bda222 Remove TargetRegisterInfo::NoRegister.
Fix the TargetRegisterInfo::NoRegister places where someone preferred
typing 'TargetRegisterInfo::NoRegister' instead of typing '0'.

Note that TableGen is already emitting xx::NoRegister in xxGenRegisterNames.inc.

llvm-svn: 123140
2011-01-09 23:20:48 +00:00
Jakob Stoklund Olesen 8828e482b6 Change virtual register numbering to make more space for physical registers.
The numbering plan is now:

0           NoRegister.
[1;2^30)    Physical registers.
[2^30;2^31) Stack slots.
[2^31;2^32) Virtual registers. (With -1u and -2u used by DenseMapInfo.)

Each segment is filled from the left, so any mistaken interpretation should
quickly cause crashes.

FirstVirtualRegister has been removed. TargetRegisterInfo provides predicates
conversion functions that should be used instead of interpreting register
numbers manually.

It is now legal to pass NoRegister to isPhysicalRegister() and
isVirtualRegister(). The result is false in both cases.

It is quite rare to represent stack slots in this way, so isPhysicalRegister()
and isVirtualRegister() require that isStackSlot() be checked first if it can
possibly return true. This allows a very fast implementation of the common
predicates.

llvm-svn: 123137
2011-01-09 22:42:48 +00:00
Chris Lattner fc87752d55 Step #2 to improve trip count analysis for loops like this:
void f(int* begin, int* end) { std::fill(begin, end, 0); }

which turns into a != exit expression where one pointer is
strided and (thanks to step #1) known to not overflow, and 
the other is loop invariant.

The observation here is that, though the IV is strided by
4 in this case, that the IV *has* to become equal to the
end value.  It cannot "miss" the end value by stepping over
it, because if it did, the strided IV expression would
eventually wrap around.

Handle this by turning A != B into "A-B != 0" where the A-B
part is known to be NUW.

llvm-svn: 123131
2011-01-09 22:26:35 +00:00
Jakob Stoklund Olesen d82ac37594 Remove MachineRegisterInfo::getLastVirtReg(), it was giving wrong results
when no virtual registers have been allocated.

It was only used to resize IndexedMaps, so provide an IndexedMap::resize()
method such that

 Map.grow(MRI.getLastVirtReg());

can be replaced with the simpler

 Map.resize(MRI.getNumVirtRegs());

This works correctly when no virtuals are allocated, and it bypasses the to/from
index conversions.

llvm-svn: 123130
2011-01-09 21:58:20 +00:00
Jakob Stoklund Olesen b83a6b23dc Teach TargetRegisterInfo how to cram stack slot indexes in with the virtual and
physical register numbers.

This makes the hack used in LiveInterval official, and lets LiveInterval be
oblivious of stack slots.

The isPhysicalRegister() and isVirtualRegister() predicates don't know about
this, so when a variable may contain a stack slot, isStackSlot() should always
be tested first.

llvm-svn: 123128
2011-01-09 21:17:37 +00:00
Jakob Stoklund Olesen 8145f85633 Fix comment.
llvm-svn: 123125
2011-01-09 19:45:45 +00:00
Tobias Grosser bc453f6e18 DominatorTree->print() now prints the status of the DFSNumbers correctly
llvm-svn: 123120
2011-01-09 16:00:09 +00:00
Oscar Fuentes edfc184222 Rewrite handling of LLVM_ENABLE_PIC. It was being processed after
config.h was generated, so it had no effect on it.

Thanks to arrowdodger for pointing out this and a tentative patch.

llvm-svn: 123119
2011-01-09 14:34:39 +00:00
Jakob Stoklund Olesen 9adf5e09cd Simplify LiveDebugVariables by storing MachineOperand copies locations instead
of using a Location class with the same information.

When making a copy of a MachineOperand that was already stored in a
MachineInstr, it is necessary to clear the parent pointer on the copy. Otherwise
the register use-def lists become inconsistent.

Add MachineOperand::clearParent() to do that. An alternative would be a custom
MachineOperand copy constructor that cleared ParentMI. I didn't want to do that
because of the performance impact.

llvm-svn: 123109
2011-01-09 05:33:21 +00:00
Jakob Stoklund Olesen 1331a15b0c Replace TargetRegisterInfo::printReg with a PrintReg class that also works without a TRI instance.
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.

llvm-svn: 123107
2011-01-09 03:05:53 +00:00
Jakob Stoklund Olesen 7f93d8d62c Use IndexedMap for MachineRegisterInfo as well. No functional change.
llvm-svn: 123106
2011-01-09 03:05:46 +00:00
Jakob Stoklund Olesen cf4d5ced0f Fix VirtRegMap to use TRI::index2VirtReg and TRI::virtReg2Index instead of
depending on TRI::FirstVirtualRegister.

Also use TRI::printReg instead of printing virtual registers directly.

llvm-svn: 123101
2011-01-08 23:11:07 +00:00
Jakob Stoklund Olesen 28d76692b6 Use an IndexedMap for LiveVariables::VirtRegInfo.
Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow
iteration over virtual registers without depending on the representation of
virtual register numbers.

llvm-svn: 123098
2011-01-08 23:10:57 +00:00
Jakob Stoklund Olesen a1e03cfba7 Do not talk about TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123097
2011-01-08 23:10:53 +00:00
Jakob Stoklund Olesen 793d7b7626 Use an IndexedMap for LiveOutRegInfo to hide its dependence on TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123096
2011-01-08 23:10:50 +00:00
Chris Lattner 2f2c3351e1 fit in 80 cols
llvm-svn: 123085
2011-01-08 20:53:41 +00:00
Rafael Espindola 45e6c195d7 First step in fixing PR8927:
Add a unnamed_addr bit to global variables and functions. This will be used
to indicate that the address is not significant and therefore the constant
or function can be merged with others.

If an optimization pass can show that an address is not used, it can set this.

Examples of things that can have this set by the FE are globals created to
hold string literals and C++ constructors.

Adding unnamed_addr to a non-const global should have no effect unless
an optimization can transform that global into a constant.

Aliases are not allowed to have unnamed_addr since I couldn't figure
out any use for it.

llvm-svn: 123063
2011-01-08 16:42:36 +00:00
Chris Lattner 75c82cb594 make this file properly self contained.
llvm-svn: 123059
2011-01-08 08:19:49 +00:00
Chris Lattner 43f8d16482 Revamp the ValueMapper interfaces in a couple ways:
1. Take a flags argument instead of a bool.  This makes
   it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
   map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
   more efficient.  For lookup failures, don't drop null values
   into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
   and LoopUnswitch, kill it.

No functionality change.

llvm-svn: 123058
2011-01-08 08:15:20 +00:00
Evan Cheng 078b0b095e Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call.
llvm-svn: 123048
2011-01-08 01:24:27 +00:00
Evan Cheng 6eb516dbea Do not model all INLINEASM instructions as having unmodelled side effects.
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.

This allows memory instructions to be moved around INLINEASM instructions.

llvm-svn: 123044
2011-01-07 23:50:32 +00:00
Devang Patel 39bd6cebcd Do not include DataTypes.h in llvm-c/lto.h.
This means avoid using uint32_t. This patch reverts r112200 and fixes original  problem by fixing argument type in lto.cpp.

llvm-svn: 123038
2011-01-07 22:26:25 +00:00
Evan Cheng 36f0a06593 Fix comment. INLINEASM node operand #3 is IsAlignStack bit.
llvm-svn: 123036
2011-01-07 21:38:59 +00:00
Evan Cheng 0638c20e7c DBG_VALUE does not have any side effects; it also makes no sense to mark it cheap as a copy.
llvm-svn: 123031
2011-01-07 21:08:26 +00:00
Jay Foad d81f3c9659 Simplify the allocation and freeing of Users' operand lists, now that
every BranchInst has a fixed number of operands.

llvm-svn: 123027
2011-01-07 20:29:02 +00:00
Jay Foad 814f1bb8e3 Remove the "ugly" method BranchInst::setUnconditionalDest().
llvm-svn: 123026
2011-01-07 20:26:51 +00:00
Bob Wilson 8265d56638 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995
2011-01-07 04:59:04 +00:00
Bob Wilson f291cb268f Change EXTRACT_SUBVECTOR to require a constant index.
We were never generating any of these nodes with variable indices, and there
was one legalizer function asserting on a non-constant index.  If we ever have
a need to support variable indices, we can add this back again.

llvm-svn: 122993
2011-01-07 04:58:56 +00:00
Evan Cheng 3ae2b79aa3 Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy
etc. takes an option OptSize. If OptSize is true, it would return
the inline limit for functions with attribute OptSize.

llvm-svn: 122952
2011-01-06 06:52:41 +00:00
Jakob Stoklund Olesen 8e236eac74 Add the SpillPlacement analysis pass.
This pass precomputes CFG block frequency information that can be used by the
register allocator to find optimal spill code placement.

Given an interference pattern, placeSpills() will compute which basic blocks
should have the current variable enter or exit in a register, and which blocks
prefer the stack.

The algorithm is ready to consume block frequencies from profiling data, but for
now it gets by with the static estimates used for spill weights.

This is a work in progress and still not hooked up to RegAllocGreedy.

llvm-svn: 122938
2011-01-06 01:21:53 +00:00
Bob Wilson 7bdd440035 Revert svn 122743, removing the instcombine pass that was replaced by earlycse.
My i386 llvm-gcc nightly tester found a regression for
SingleSource/Benchmarks/McGill/chomp that a bisect blamed on 122743.
That seems strange but apparently the combination of earlycse and instcombine
did something bad.  Chris says he intended to remove the instcombine pass, so
let's go ahead and try that.  We'll see if there are any performance losses.

llvm-svn: 122907
2011-01-05 21:16:50 +00:00
Wesley Peck 164e67705a Fix small bug in setDebugInfoAvailability.
llvm-svn: 122886
2011-01-05 17:01:57 +00:00
Chris Lattner d7fe9d665c Fix PR8906: -fno-builtin should disable loop-idiom recognition.
It forms memset and memcpy's, and will someday form popcount and
other stuff.  All of this is bad when compiling the implementation
of memset, memcpy, popcount, etc.

llvm-svn: 122854
2011-01-05 01:03:32 +00:00
Jakob Stoklund Olesen 01d4d86585 Use the EdgeBundles analysis in X86FloatingPoint instead of recomputing CFG
bundles in the pass.

llvm-svn: 122833
2011-01-04 21:10:11 +00:00
Jakob Stoklund Olesen f96ae684c4 Turn the EdgeBundles class into a stand-alone machine CFG analysis pass.
The analysis will be needed by both the greedy register allocator and the
X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't
change.

This pass is very fast, usually showing up as 0.0% wall time.

llvm-svn: 122832
2011-01-04 21:10:05 +00:00
Owen Anderson e8d7bd0323 Give MachineFunctionAnalysis a getPassName() implementation to make timing reports prettier.
llvm-svn: 122816
2011-01-04 18:21:18 +00:00
Duncan Sands 95c4eccbe9 These methods should be "const"; make them so.
llvm-svn: 122809
2011-01-04 12:52:29 +00:00
Owen Anderson b6e4ff0d85 Stub out a new updating interface to AliasAnalysis, allowing stateful analyses to be informed when
a pointer value has potentially become escaping.  Implementations can choose to either fall back to
conservative responses for that value, or may recompute their analysis to accomodate the change.

llvm-svn: 122777
2011-01-03 21:38:41 +00:00
Evan Cheng 150d2124b2 Undo what looks like accidental removal of an instcombine pass in r122740.
llvm-svn: 122743
2011-01-03 07:53:18 +00:00
Chris Lattner b5d1579c72 Turn on earlycse by default. This seems to be a small performance
improvement in the generated code, and speeds up 'opt -std-compile-opts'
compile time on 176.gcc from 24.84s to 23.2s (about 7%).

This also resolves a specific code quality issue in rdar://7352081 which
was generating poor code for:

int t(int a, int b) {
  if (a & b & 1)
    return a & b;
  return 3;
}

llvm-svn: 122740
2011-01-03 06:19:09 +00:00
Nick Lewycky 0f87ca7733 Add spliceFunction to the CallGraph interface. This allows users to efficiently
update a callGraph when performing the common operation of splicing the body to
a new function and updating all callers (such as via RAUW).

No users yet, though this is intended for DeadArgumentElimination as part of
PR8887.

llvm-svn: 122728
2011-01-03 03:19:35 +00:00
Chris Lattner effec5f9df add a handy typedef.
llvm-svn: 122726
2011-01-03 03:16:20 +00:00
Chris Lattner 2f1c34f15e really get this working with a custom allocator.
llvm-svn: 122722
2011-01-03 01:38:29 +00:00
Chris Lattner d11fd779f2 Enhance ScopedHashTable to allow it to take an allocator argument.
llvm-svn: 122721
2011-01-03 01:29:37 +00:00
Cameron Zwarich cab9a0abab Add a new loop-instsimplify pass, with the intention of replacing the instance
of instcombine that is currently in the middle of the loop pass pipeline. This
commit only checks in the pass; it will hopefully be enabled by default later.

llvm-svn: 122719
2011-01-03 00:25:16 +00:00
Chris Lattner bf0aa927cc split dom frontier handling stuff out to its own DominanceFrontier header,
so that Dominators.h is *just* domtree.  Also prune #includes a bit.

llvm-svn: 122714
2011-01-02 22:09:33 +00:00
Chris Lattner 704541bb23 sketch out a new early cse pass. No functionality yet.
llvm-svn: 122713
2011-01-02 21:47:05 +00:00
Cameron Zwarich 7e23b72221 Remove an unused member function.
llvm-svn: 122693
2011-01-02 12:37:22 +00:00
Cameron Zwarich 7f59c65e6e Fix a typo in a variable name.
llvm-svn: 122691
2011-01-02 12:17:10 +00:00
Cameron Zwarich 42b1c9dd45 Move a load into the only branch where it is used and eliminate a temporary.
llvm-svn: 122690
2011-01-02 10:50:14 +00:00
Cameron Zwarich 718d32cca8 Add the explanatory comment from r122680's commit message to the code itself.
llvm-svn: 122689
2011-01-02 10:40:14 +00:00
Cameron Zwarich 69b67c0149 Tidy up indentation.
llvm-svn: 122688
2011-01-02 10:10:02 +00:00
Cameron Zwarich d7f02cc5dd Fix a typo, which should also fix the failure on llvm-x86_64-linux-checks.
llvm-svn: 122687
2011-01-02 10:06:44 +00:00
Cameron Zwarich 528511b155 Remove the #ifdef'd code for balancing the eval-link data structure. It doesn't
compile, and everyone's tests have shown it to be slower in practice, even for
quite large graphs.

I also hope to do an optimization that is only correct with the simpler data
structure, which would break this even further.

llvm-svn: 122684
2011-01-02 07:53:49 +00:00
Cameron Zwarich a0800337c9 Speed up dominator computation some more by optimizing bucket processing. When
naively implemented, the Lengauer-Tarjan algorithm requires a separate bucket
for each vertex. However, this is unnecessary, because each vertex is only
placed into a single bucket (that of its semidominator), and each vertex's
bucket is processed before it is added to any bucket itself.

Instead of using a bucket per vertex, we use a single array Buckets that has two
purposes. Before the vertex V with DFS number i is processed, Buckets[i] stores
the index of the first element in V's bucket. After V's bucket is processed,
Buckets[i] stores the index of the next element in the bucket to which V now
belongs, if any.

Reading from the buckets can also be optimized. Instead of processing the bucket
of V's parent at the end of processing V, we process the bucket of V itself at
the beginning of processing V. This means that the case of the root vertex can
be simplified somewhat. It also means that we don't need to look up the DFS
number of the semidominator of every node in the bucket we are processing,
since we know it is the current index being processed.

This is a 6.5% speedup running -domtree on test-suite + SPEC2000/2006, with
larger speedups of around 12% on the larger benchmarks like GCC.

llvm-svn: 122680
2011-01-02 07:03:00 +00:00
Chris Lattner 423bef8f4f turn on memset idiom recognition by default. Though there are still lots of
limitations, this kicks in dozens of times in the 4 specfp2000 benchmarks, 
and hundreds of times in the int part.  It also kicks in hundreds of times 
in multisource.

This kicks in right before loop deletion, which has the pleasant effect of
deleting loops that *just* do a memset.

llvm-svn: 122664
2011-01-01 20:39:18 +00:00