Commit Graph

4929 Commits

Author SHA1 Message Date
Chandler Carruth 0b623baeb3 [LCG] Switch the Callee sets to be DenseMaps pointing to the index into
the Callee list. This is going to be quite important to prevent removal
from going quadratic. No functionality changed at this point, this is
one of the refactoring patches I've broken out of my initial work toward
mutation updates of the call graph.

llvm-svn: 206938
2014-04-23 04:00:17 +00:00
Duncan P. N. Exon Smith b3380ea60a blockfreq: Skip irreducible backedges inside functions
The branch that skips irreducible backedges was only active when
propagating mass at the top-level.  In particular, when propagating mass
through a loop recognized by `LoopInfo` with irreducible control flow
inside, irreducible backedges would not be skipped.

Not sure where that idea came from, but the result was that mass was
lost until after loop exit.  Added a testcase that covers this case.

llvm-svn: 206860
2014-04-22 03:31:53 +00:00
Duncan P. N. Exon Smith d1aec79d7a blockfreq: Rename PackagedLoops => Loops
llvm-svn: 206859
2014-04-22 03:31:50 +00:00
Duncan P. N. Exon Smith 2984a64bae blockfreq: Use a pointer for ContainingLoop too
llvm-svn: 206858
2014-04-22 03:31:44 +00:00
Duncan P. N. Exon Smith e1423639bb blockfreq: Use pointers to loops instead of an index
Store pointers directly to loops inside the nodes.  This could have been
done without changing the type stored in `std::vector<>`.  However,
rather than computing the number of loops before constructing them
(which `LoopInfo` doesn't provide directly), I've switched to a
`vector<unique_ptr<LoopData>>`.

This adds some heap overhead, but the number of loops is typically
small.

llvm-svn: 206857
2014-04-22 03:31:37 +00:00
Duncan P. N. Exon Smith dc2d66e7b3 blockfreq: Implement clear() explicitly
This was implicitly with copy assignment before, which fails to actually
clear `std::vector<>`'s heap storage.  Move assignment would work, but
since MSVC can't imply those anyway, explicitly `clear()`-ing members
makes more sense.

llvm-svn: 206856
2014-04-22 03:31:34 +00:00
Duncan P. N. Exon Smith cc88ebfa5f blockfreq: Rename PackagedLoopData => LoopData
No functionality change.

llvm-svn: 206855
2014-04-22 03:31:31 +00:00
Chandler Carruth f1221bd01b [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all the header #include lines, lib/Analysis/...
edition.

This one has a bit extra as there were *other* #define's before #include
lines in addition to DEBUG_TYPE. I've sunk all of them as a block.

llvm-svn: 206843
2014-04-22 02:48:03 +00:00
Chandler Carruth 1b9dde087e [Modules] Remove potential ODR violations by sinking the DEBUG_TYPE
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.

Other sub-trees will follow.

llvm-svn: 206837
2014-04-22 02:02:50 +00:00
Chandler Carruth e96dd8975f [Modules] Make Support/Debug.h modular. This requires it to not change
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.

This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:

- Header files that need to provide a DEBUG_TYPE for some inline code
  can do so by defining the macro before their inline code and undef-ing
  it afterward so the macro does not escape.

- We no longer have rampant ODR violations due to including headers with
  different DEBUG_TYPE definitions. This may be mostly an academic
  violation today, but with modules these types of violations are easy
  to check for and potentially very relevant.

Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.

The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.

llvm-svn: 206822
2014-04-21 22:55:11 +00:00
Duncan P. N. Exon Smith 254689fcf9 blockfreq: Some cleanup of UnsignedFloat
Change `PositiveFloat` to `UnsignedFloat`, and fix some of the comments
to indicate that it's disappearing eventually.

llvm-svn: 206771
2014-04-21 18:31:58 +00:00
Duncan P. N. Exon Smith 10be9a8868 Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206707, reapplying r206704.  The preceding commit
to CalcSpillWeights should have sorted out the failing buildbots.

<rdar://problem/14292693>

llvm-svn: 206766
2014-04-21 17:57:07 +00:00
Chandler Carruth 572e3407c3 [PM] Add a new-PM-style CGSCC pass manager using the newly added
LazyCallGraph analysis framework. Wire it up all the way through the opt
driver and add some very basic testing that we can build pass pipelines
including these components. Still a lot more to do in terms of testing
that all of this works, but the basic pieces are here.

There is a *lot* of boiler plate here. It's something I'm going to
actively look at reducing, but I don't have any immediate ideas that
don't end up making the code terribly complex in order to fold away the
boilerplate. Until I figure out something to minimize the boilerplate,
almost all of this is based on the code for the existing pass managers,
copied and heavily adjusted to suit the needs of the CGSCC pass
management layer.

The actual CG management still has a bunch of FIXMEs in it. Notably, we
don't do *any* updating of the CG as it is potentially invalidated.
I wanted to get this in place to motivate the new analysis, and add
update APIs to the analysis and the pass management layers in concert to
make sure that the *right* APIs are present.

llvm-svn: 206745
2014-04-21 11:12:00 +00:00
Chandler Carruth 99b756db04 [LCG] Add some basic debug output to the LCG pass.
llvm-svn: 206730
2014-04-21 05:04:24 +00:00
Duncan P. N. Exon Smith e63327e967 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206704, as expected.

llvm-svn: 206707
2014-04-19 22:46:00 +00:00
Duncan P. N. Exon Smith 875ddfac75 Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206677, reapplying my BlockFrequencyInfo rewrite.

I've done a careful audit, added some asserts, and fixed a couple of
bugs (unfortunately, they were in unlikely code paths).  There's a small
chance that this will appease the failing bots [1][2].  (If so, great!)

If not, I have a follow-up commit ready that will temporarily add
-debug-only=block-freq to the two failing tests, allowing me to compare
the code path between what the failing bots and what my machines (and
the rest of the bots) are doing.  Once I've triggered those builds, I'll
revert both commits so the bots go green again.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445

<rdar://problem/14292693>

llvm-svn: 206704
2014-04-19 22:34:26 +00:00
Duncan P. N. Exon Smith 76b813619a Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2)
This reverts commit r206666, as planned.

Still stumped on why the bots are failing.  Sanitizer bots haven't
turned anything up.  If anyone can help me debug either of the failures
(referenced in r206666) I'll owe them a beer.  (In the meantime, I'll be
auditing my patch for undefined behaviour.)

llvm-svn: 206677
2014-04-19 00:42:46 +00:00
Duncan P. N. Exon Smith b3caf3646f Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2)
This reverts commit r206628, reapplying r206622 (and r206626).

Two tests are failing only on buildbots [1][2]: i.e., I can't reproduce
on Darwin, and Chandler can't reproduce on Linux.  Asan and valgrind
don't tell us anything, but we're hoping the msan bot will catch it.

So, I'm applying this again to get more feedback from the bots.  I'll
leave it in long enough to trigger builds in at least the sanitizer
buildbots (it was failing for reasons unrelated to my commit last time
it was in), and hopefully a few others.... and then I expect to revert a
third time.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445

llvm-svn: 206666
2014-04-18 22:30:03 +00:00
Chandler Carruth 2174f44f61 [LCG] Fix the bugs that Ben pointed out in code review (and the MSan bot
caught). Sad that we don't have warnings for these things, but bleh, no
idea how to fix that.

llvm-svn: 206646
2014-04-18 20:44:16 +00:00
Benjamin Kramer 147644d400 Remove a couple of redundant copies of SmallVector::operator==.
No functionality change.

llvm-svn: 206635
2014-04-18 19:48:03 +00:00
Duncan P. N. Exon Smith 0842ff36a6 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2)
This reverts commit r206622 and the MSVC fixup in r206626.

Apparently the remotely failing tests are still failing, despite my
attempt to fix the nondeterminism in r206621.

llvm-svn: 206628
2014-04-18 17:56:08 +00:00
Duncan P. N. Exon Smith 38fe464df0 Fixing MSVC after r206622?
llvm-svn: 206626
2014-04-18 17:38:01 +00:00
Duncan P. N. Exon Smith f8361d127a Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206556, effectively reapplying commit r206548 and
its fixups in r206549 and r206550.

In an intervening commit I've added target triples to the tests that
were failing remotely [1] (but passing locally).  I'm hoping the mystery
is solved?  I'll revert this again if the tests are still failing
remotely.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816

llvm-svn: 206622
2014-04-18 17:22:25 +00:00
Chandler Carruth d8d865e266 [LCG] Remove all of the complexity stemming from supporting copying.
Reality is that we're never going to copy one of these. Supporting this
was becoming a nightmare because nothing even causes it to compile most
of the time. Lots of subtle errors built up that wouldn't have been
caught by any "normal" testing.

Also, make the move assignment actually work rather than the bogus swap
implementation that would just infloop if used. As part of that, factor
out the graph pointer updates into a helper to share between move
construction and move assignment.

llvm-svn: 206583
2014-04-18 11:02:33 +00:00
Chandler Carruth 18eadd9260 [LCG] Add support for building persistent and connected SCCs to the
LazyCallGraph. This is the start of the whole point of this different
abstraction, but it is just the initial bits. Here is a run-down of
what's going on here. I'm planning to incorporate some (or all) of this
into comments going forward, hopefully with better editing and wording.
=]

The crux of the problem with the traditional way of building SCCs is
that they are ephemeral. The new pass manager however really needs the
ability to associate analysis passes and results of analysis passes with
SCCs in order to expose these analysis passes to the SCC passes. Making
this work is kind-of the whole point of the new pass manager. =]

So, when we're building SCCs for the call graph, we actually want to
build persistent nodes that stick around and can be reasoned about
later. We'd also like the ability to walk the SCC graph in more complex
ways than just the traditional postorder traversal of the current CGSCC
walk. That means that in addition to being persistent, the SCCs need to
be connected into a useful graph structure.

However, we still want the SCCs to be formed lazily where possible.

These constraints are quite hard to satisfy with the SCC iterator. Also,
using that would bypass our ability to actually add data to the nodes of
the call graph to facilite implementing the Tarjan walk. So I've
re-implemented things in a more direct and embedded way. This
immediately makes it easy to get the persistence and connectivity
correct, and it also allows leveraging the existing nodes to simplify
the algorithm. I've worked somewhat to make this implementation more
closely follow the traditional paper's nomenclature and strategy,
although it is still a bit obtuse because it isn't recursive, using
an explicit stack and a tail call instead, and it is interruptable,
resuming each time we need another SCC.

The other tricky bit here, and what actually took almost all the time
and trials and errors I spent building this, is exactly *what* graph
structure to build for the SCCs. The naive thing to build is the call
graph in its newly acyclic form. I wrote about 4 versions of this which
did precisely this. Inevitably, when I experimented with them across
various use cases, they became incredibly awkward. It was all
implementable, but it felt like a complete wrong fit. Square peg, round
hole. There were two overriding aspects that pushed me in a different
direction:

1) We want to discover the SCC graph in a postorder fashion. That means
   the root node will be the *last* node we find. Using the call-SCC DAG
   as the graph structure of the SCCs results in an orphaned graph until
   we discover a root.

2) We will eventually want to walk the SCC graph in parallel, exploring
   distinct sub-graphs independently, and synchronizing at merge points.
   This again is not helped by the call-SCC DAG structure.

The structure which, quite surprisingly, ended up being completely
natural to use is the *inverse* of the call-SCC DAG. We add the leaf
SCCs to the graph as "roots", and have edges to the caller SCCs. Once
I switched to building this structure, everything just fell into place
elegantly.

Aside from general cleanups (there are FIXMEs and too few comments
overall) that are still needed, the other missing piece of this is
support for iterating across levels of the SCC graph. These will become
useful for implementing #2, but they aren't an immediate priority.

Once SCCs are in good shape, I'll be working on adding mutation support
for incremental updates and adding the pass manager that this analysis
enables.

llvm-svn: 206581
2014-04-18 10:50:32 +00:00
Duncan P. N. Exon Smith e576167df8 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commits r206548, r206549 and r206549.

There are some unit tests failing that aren't failing locally [1], so
reverting until I have time to investigate.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816

llvm-svn: 206556
2014-04-18 02:17:43 +00:00
Duncan P. N. Exon Smith 878cf2b804 blockfreq: Really fix r206548 (and r206549)
Turns out this code is dead.

llvm-svn: 206554
2014-04-18 02:10:09 +00:00
Duncan P. N. Exon Smith c7abca54cf blockfreq: Fixing MSVC after r206548?
llvm-svn: 206549
2014-04-18 02:06:24 +00:00
Duncan P. N. Exon Smith 12e68e1733 blockfreq: Rewrite BlockFrequencyInfoImpl
Rewrite the shared implementation of BlockFrequencyInfo and
MachineBlockFrequencyInfo entirely.

The old implementation had a fundamental flaw:  precision losses from
nested loops (or very wide branches) compounded past loop exits (and
convergence points).

The @nested_loops testcase at the end of
test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating.  This
function has three nested loops, with branch weights in the loop headers
of 1:4000 (exit:continue).  The old analysis gives non-sensical results:

    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    ---- Block Freqs ----
     entry = 1.0
     for.cond1.preheader = 1.00103
     for.cond4.preheader = 5.5222
     for.body6 = 18095.19995
     for.inc8 = 4.52264
     for.inc11 = 0.00109
     for.end13 = 0.0

The new analysis gives correct results:

    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    block-frequency-info: nested_loops
     - entry: float = 1.0, int = 8
     - for.cond1.preheader: float = 4001.0, int = 32007
     - for.cond4.preheader: float = 16008001.0, int = 128064007
     - for.body6: float = 64048012001.0, int = 512384096007
     - for.inc8: float = 16008001.0, int = 128064007
     - for.inc11: float = 4001.0, int = 32007
     - for.end13: float = 1.0, int = 8

Most importantly, the frequency leaving each loop matches the frequency
entering it.

The new algorithm leverages BlockMass and PositiveFloat to maintain
precision, separates "probability mass distribution" from "loop
scaling", and uses dithering to eliminate probability mass loss.  I have
unit tests for these types out of tree, but it was decided in the review
to make the classes private to BlockFrequencyInfoImpl, and try to shrink
them (or remove them entirely) in follow-up commits.

The new algorithm should generally have a complexity advantage over the
old.  The previous algorithm was quadratic in the worst case.  The new
algorithm is still worst-case quadratic in the presence of irreducible
control flow, but it's linear without it.

The key difference between the old algorithm and the new is that control
flow within a loop is evaluated separately from control flow outside,
limiting propagation of precision problems and allowing loop scale to be
calculated independently of mass distribution.  Loops are visited
bottom-up, their loop scales are calculated, and they are replaced by
pseudo-nodes.  Mass is then distributed through the function, which is
now a DAG.  Finally, loops are revisited top-down to multiply through
the loop scales and the masses distributed to pseudo nodes.

There are some remaining flaws.

  - Irreducible control flow isn't modelled correctly.  LoopInfo and
    MachineLoopInfo ignore irreducible edges, so this algorithm will
    fail to scale accordingly.  There's a note in the class
    documentation about how to get closer.  See also the comments in
    test/Analysis/BlockFrequencyInfo/irreducible.ll.

  - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting
    the 64-bit integer precision used downstream.

  - The "bias" calculation proposed on llvmdev is *not* incorporated
    here.  This will be added in a follow-up commit, once comments from
    this review have been handled.

llvm-svn: 206548
2014-04-18 01:57:45 +00:00
Nuno Lopes 9ced19abe8 remove some dead code
lib/Analysis/IPA/InlineCost.cpp         |   18 ------------------
 lib/Analysis/RegionPass.cpp             |    1 -
 lib/Analysis/TypeBasedAliasAnalysis.cpp |    1 -
 lib/Transforms/Scalar/LoopUnswitch.cpp  |   21 ---------------------
 lib/Transforms/Utils/LCSSA.cpp          |    2 --
 lib/Transforms/Utils/LoopSimplify.cpp   |    6 ------
 utils/TableGen/AsmWriterEmitter.cpp     |   13 -------------
 utils/TableGen/DFAPacketizerEmitter.cpp |    7 -------
 utils/TableGen/IntrinsicEmitter.cpp     |    2 --
 9 files changed, 71 deletions(-)

llvm-svn: 206506
2014-04-17 22:26:44 +00:00
Gerolf Hoflehner ecebc3730e Reverse 206485.
After some discussions the preferred semantics of
the always_inline attribute is
inline always when the compiler can determine
that it it safe to do so.

llvm-svn: 206487
2014-04-17 19:14:06 +00:00
Chandler Carruth b60cb315bc [LCG] Just move the allocator (now that we can) when moving a call
graph. This simplifies the custom move constructor operation to one of
walking the graph and updating the 'up' pointers to point to the new
location of the graph. Switch the nodes from a reference to a pointer
for the 'up' edge to facilitate this.

llvm-svn: 206450
2014-04-17 07:25:59 +00:00
Chandler Carruth 81f497d176 [LCG] Remove the Module reference member which we weren't using for
anything and doesn't make sense if assigning.

llvm-svn: 206449
2014-04-17 07:22:19 +00:00
Gerolf Hoflehner 5f6268a40e Inline a function when the always_inline attribute
is set even when it contains a indirect branch.
The attribute overrules correctness concerns
like the escape of a local block address.

This is for rdar://16501761

llvm-svn: 206429
2014-04-17 00:21:52 +00:00
Tobias Grosser 8d941ef30d RegionInfo: Do not access a value that was just moved away
This fixes a regression introduced in r206310.

llvm-svn: 206328
2014-04-15 22:09:36 +00:00
David Blaikie ec649acb82 Use unique_ptr to manage ownership of child Regions within llvm::Region
llvm-svn: 206310
2014-04-15 18:32:43 +00:00
Craig Topper 9f008867c0 [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206243
2014-04-15 04:59:12 +00:00
Akira Hatanaka 5638b89944 Fix a bug in which BranchProbabilityInfo wasn't setting branch weights of basic blocks inside loops correctly.
Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights.

This fixes PR18705 and <rdar://problem/15991090>.

Differential Revision: http://reviews.llvm.org/D3363

llvm-svn: 206194
2014-04-14 16:56:19 +00:00
Duncan P. N. Exon Smith 689a50736e blockfreq: Rename BlockFrequencyImpl to BlockFrequencyInfoImpl
This is a shared implementation class for BlockFrequencyInfo and
MachineBlockFrequencyInfo, not for BlockFrequency, a related (but
distinct) class.

No functionality change.

<rdar://problem/14292693>

llvm-svn: 206083
2014-04-11 23:20:58 +00:00
Duncan P. N. Exon Smith 37bd529964 blockfreq: Use getSuccessorIndex()
No functionality change.

<rdar://problem/14292693>

llvm-svn: 206082
2014-04-11 23:20:52 +00:00
Tobias Grosser c3d9db2336 Delinearize: Extend informationin -analyze output
llvm-svn: 205838
2014-04-09 07:53:49 +00:00
Sebastian Pop b2fdacf3f2 divide by the result of the gcd
used to fail with 'Step should divide Start with no remainder.'

llvm-svn: 205802
2014-04-08 21:21:13 +00:00
Sebastian Pop 9738e83a7d handle special cases when findGCD returns 1
used to fail with 'Step should divide Start with no remainder.'

llvm-svn: 205801
2014-04-08 21:21:10 +00:00
Sebastian Pop b5b84e0963 in findGCD of multiply expr return the gcd
we used to return 1 instead of the gcd

llvm-svn: 205800
2014-04-08 21:21:05 +00:00
Eric Christopher beb2cd6b7c Handle vlas during inline cost computation if they'll be turned
into a constant size alloca by inlining.

Ran a run over the testsuite, no results out of the noise, fixes
the testcase in the PR.

PR19115.

llvm-svn: 205710
2014-04-07 13:36:21 +00:00
Hal Finkel b4e001cc81 Use TopTTI->getGEPCost from within getUserCost
The implementation of getUserCost had duplicated (and hard-coded) the default
logic in getGEPCost. Instead, it is better to use getGEPCost directly, which
limits the default logic to the implementation of one function, and allows
targets to override the behavior.

No functionality change intended.

llvm-svn: 205346
2014-04-01 18:50:06 +00:00
Arnold Schwaighofer 1a444489e9 PR15967 Fix in basicaa for faulty returning no alias.
This commit consist of two parts.
The first part fix the PR15967. The wrong conclusion was made when the MaxLookup
limit was reached. The fix introduce a out parameter (MaxLookupReached) to
DecomposeGEPExpression that the function aliasGEP can act upon.
The second part is introducing the constant MaxLookupSearchDepth to make sure
that DecomposeGEPExpression and GetUnderlyingObject use the same search depth.
This is a small cleanup to clarify the original algorithm.

Patch by Karl-Johan Karlsson!

llvm-svn: 204859
2014-03-26 21:30:19 +00:00
Duncan P. N. Exon Smith 3dbe10503a blockfreq: Implement Pass::releaseMemory()
Implement Pass::releaseMemory() in BlockFrequencyInfo and
MachineBlockFrequencyInfo.  Just delete the private implementation when
not in use.  Switch to a std::unique_ptr to make the logic more clear.

<rdar://problem/14292693>

llvm-svn: 204741
2014-03-25 18:01:38 +00:00
Benjamin Kramer e75eaca32f ScalarEvolution: Compute exit counts for loops with a power-of-2 step.
If we have a loop of the form
for (unsigned n = 0; n != (k & -32); n += 32) {}
then we know that n is always divisible by 32 and the loop must
terminate. Even if we have a condition where the loop counter will
overflow it'll always hold this invariant.

PR19183. Our loop vectorizer creates this pattern and it's also
occasionally formed by loop counters derived from pointers.

llvm-svn: 204728
2014-03-25 16:25:12 +00:00
Erik Verbruggen e706b88304 Simplify loop that worked around bugs in old GCC/Xcode.
GCC 4.0.1 and Xcode 2 are no longer supported for building llvm/clang.

llvm-svn: 204705
2014-03-25 09:06:18 +00:00
Karthik Bhat 195e9dd91b Allow constant folding of ceil function whenever feasible
llvm-svn: 204583
2014-03-24 04:36:06 +00:00
Juergen Ributzka f0dff49ad0 [Constant Hoisting] Make the constant materialization cost operand dependent
Extend the target hook to take also the operand index into account when
calculating the cost of the constant materialization.

Related to <rdar://problem/16381500>

llvm-svn: 204435
2014-03-21 06:04:45 +00:00
Juergen Ributzka 46357931ab Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass."
I will break this up into smaller pieces for review and recommit.

llvm-svn: 204393
2014-03-20 20:17:13 +00:00
Juergen Ributzka 6dab520c70 [Constant Hoisting] Extend coverage of the constant hoisting pass.
This commit extends the coverage of the constant hoisting pass, adds additonal
debug output and updates the function names according to the style guide.

Related to <rdar://problem/16381500>

llvm-svn: 204389
2014-03-20 19:55:52 +00:00
Michael Zolotukhin ed0a7761e5 Add stride normalization to SCEV Normalize/Denormalize transformation.
llvm-svn: 204161
2014-03-18 17:34:03 +00:00
Alon Mishne ad312155a6 [C++11] Change DebugInfoFinder to use range-based loops
Also changes the iterators to return actual DI type over MDNode.

llvm-svn: 204130
2014-03-18 09:41:07 +00:00
Eli Bendersky 576ef3c667 Consistent use of the noduplicate attribute.
The "noduplicate" attribute of call instructions is sometimes queried directly
and sometimes through the cannotDuplicate() predicate. This patch streamlines
all queries to use the cannotDuplicate() predicate. It also adds this predicate
to InvokeInst, to mirror what CallInst has.

llvm-svn: 204049
2014-03-17 16:19:07 +00:00
Arnaud A. de Grandmaison 75c9e6dedf Remove some dead assignements found by scan-build
llvm-svn: 204013
2014-03-15 22:13:15 +00:00
Michael Zolotukhin 66806aef1e PR17473:
Don't normalize an expression during postinc transformation unless it's
invertible.

llvm-svn: 203719
2014-03-12 21:31:05 +00:00
Michael Zolotukhin 15e6e543b9 Test commit
llvm-svn: 203716
2014-03-12 21:15:56 +00:00
Tim Northover e94a518a22 IR: add a second ordering operand to cmpxhg for failure
The syntax for "cmpxchg" should now look something like:

	cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).

rdar://problem/15996804

llvm-svn: 203559
2014-03-11 10:48:52 +00:00
Chandler Carruth aee3ca6cfd [TTI] There is actually no realistic way to pop TTI implementations off
the stack of the analysis group because they are all immutable passes.
This is made clear by Craig's recent work to use override
systematically -- we weren't overriding anything for 'finalizePass'
because there is no such thing.

This is kind of a lame restriction on the API -- we can no longer push
and pop things, we just set up the stack and run. However, I'm not
invested in building some better solution on top of the existing
(terrifying) immutable pass and legacy pass manager.

llvm-svn: 203437
2014-03-10 02:45:14 +00:00
Chandler Carruth e9b50617b8 [LCG] Ran clang-format over this too and it pointed out some fixes.
llvm-svn: 203435
2014-03-10 02:14:14 +00:00
Chandler Carruth b9e2f8c479 [LCG] Simplify a bunch of the LCG code with range for loops and auto.
Still more work to be done here to leverage C++11, but this clears out
the glaring issues.

llvm-svn: 203395
2014-03-09 12:20:34 +00:00
Chandler Carruth cdf4788401 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Benjamin Kramer b0f74b24fa [C++11] Convert sort predicates into lambdas.
No functionality change.

llvm-svn: 203288
2014-03-07 21:35:39 +00:00
Karthik Bhat b67688a87c Allow constant folding of round function whenever feasible
llvm-svn: 203198
2014-03-07 04:36:21 +00:00
Matt Arsenault a236ea551c Teach lint about address spaces
llvm-svn: 203132
2014-03-06 17:33:55 +00:00
Ahmed Charles 56440fd820 Replace OwningPtr<T> with std::unique_ptr<T>.
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.

llvm-svn: 203083
2014-03-06 05:51:42 +00:00
Karthik Bhat daa8cd10d9 Allow constant folding of copysign
llvm-svn: 203076
2014-03-06 05:32:52 +00:00
Chandler Carruth 7da14f1ab9 [Layering] Move InstVisitor.h into the IR library as it is pretty
obviously coupled to the IR.

llvm-svn: 203064
2014-03-06 03:23:41 +00:00
Chandler Carruth 9a4c9e597b [Layering] Move DebugInfo.h into the IR library where its implementation
already lives.

llvm-svn: 203046
2014-03-06 00:46:21 +00:00
Benjamin Kramer 061d147f74 ConstantFolding: Also fold the vector overloads of our math intrinsics.
llvm-svn: 202997
2014-03-05 19:41:48 +00:00
Tobias Grosser ba49e4229c Add missing parenthesis in SCEV comment
Contributed-by: Michael Zolutukin <mzolotukhin@apple.com>
llvm-svn: 202963
2014-03-05 10:37:17 +00:00
Chandler Carruth 64e9aa5c93 [C++11] Make this interface accept const Use pointers and use override
to ensure we don't mess up any of the overrides. Necessary for cleaning
up the Value use iterators and enabling range-based traversing of use
lists.

llvm-svn: 202958
2014-03-05 10:21:48 +00:00
Craig Topper e9ba759c81 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202945
2014-03-05 07:30:04 +00:00
Matt Arsenault 8377858c55 Allow constant folding of fma and fmuladd
llvm-svn: 202914
2014-03-05 00:02:00 +00:00
Matt Arsenault f8ecf9b447 Fix duplicate code in ConstantFolding
llvm-svn: 202913
2014-03-05 00:01:58 +00:00
Chandler Carruth 8cd041ef19 [Modules] Move the ConstantRange class into the IR library. This is
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.

llvm-svn: 202838
2014-03-04 12:24:34 +00:00
Chandler Carruth aa0ab6389a [Modules] Move the PredIteratorCache into the IR library -- it is
hardcoded to use IR BasicBlocks.

llvm-svn: 202835
2014-03-04 12:09:19 +00:00
Chandler Carruth 1305dc3351 [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth 4220e9c154 [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth 820a908df7 [Modules] Move the LLVM IR pattern match header into the IR library, it
obviously is coupled to the IR.

llvm-svn: 202818
2014-03-04 11:08:18 +00:00
Chandler Carruth 219b89b987 [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth 03eb0de93d [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Chandler Carruth 8394857f43 [Modules] Move InstIterator out of the Support library, where it had no
business.

This header includes Function and BasicBlock and directly uses the
interfaces of both classes. It has to do with the IR, it even has that
in the name. =] Put it in the library it belongs to.

This is one step toward making LLVM's Support library survive a C++
modules bootstrap.

llvm-svn: 202814
2014-03-04 10:30:26 +00:00
Chandler Carruth 442f784814 [cleanup] Re-sort all the includes with utils/sort_includes.py.
llvm-svn: 202811
2014-03-04 10:07:28 +00:00
Tobias Grosser 4abf9d3a54 [C++11] Add a basic block range view for RegionInfo
This also switches the users in LLVM to ensure this functionality is tested.

llvm-svn: 202705
2014-03-03 13:00:39 +00:00
Chandler Carruth 1583e99c23 [C++11] Add two range adaptor views to User: operands and
operand_values. The first provides a range view over operand Use
objects, and the second provides a range view over the Value*s being
used by those operands.

The naming is "STL-style" rather than "LLVM-style" because we have
historically named iterator methods STL-style, and range methods seem to
have far more in common with their iterator counterparts than with
"normal" APIs. Feel free to bikeshed on this one if you want, I'm happy
to change these around if people feel strongly.

I've switched code in SROA and LCG to exercise these mostly to ensure
they work correctly -- we don't really have an easy way to unittest this
and they're trivial.

llvm-svn: 202687
2014-03-03 10:42:58 +00:00
Benjamin Kramer d6f1f84f51 [C++11] Replace llvm::tie with std::tie.
The old implementation is no longer needed in C++11.

llvm-svn: 202644
2014-03-02 13:30:33 +00:00
Benjamin Kramer b6d0bd48bd [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Craig Topper 73156025e0 Switch all uses of LLVM_OVERRIDE to just use 'override' directly.
llvm-svn: 202621
2014-03-02 09:09:27 +00:00
Craig Topper 77dfe45f81 Switch all uses of LLVM_FINAL to just use 'final', and remove the macro.
llvm-svn: 202618
2014-03-02 08:08:51 +00:00
Chandler Carruth 002da5db29 [C++11] Switch all uses of the llvm_move macro to use std::move
directly, and remove the macro.

llvm-svn: 202612
2014-03-02 04:08:41 +00:00
Chandler Carruth 172f7c37b9 [C++11] Remove the use of LLVM_HAS_RVALUE_REFERENCES from the rest of
the core LLVM libraries.

llvm-svn: 202582
2014-03-01 09:32:03 +00:00
Eric Christopher a13839f5ca Remove unnecessary llvm:: qualification.
llvm-svn: 202316
2014-02-26 23:27:16 +00:00
Paul Robinson 0c12b1d23c Constify the Optnone checks in IR passes.
llvm-svn: 202213
2014-02-26 01:23:26 +00:00
Rafael Espindola 339430f993 Use DataLayout from the module when easily available.
Eventually DataLayoutPass should go away, but for now that is the only easy
way to get a DataLayout in some APIs. This patch only changes the ones that
have easy access to a Module.

One interesting issue with sometimes using DataLayoutPass and sometimes
fetching it from the Module is that we have to make sure they are equivalent.
We can get most of the way there by always constructing the pass with a Module.
In fact, the pass could be changed to point to an external DataLayout instead
of owning one to make this stricter.

Unfortunately, the C api passes a DataLayout, so it has to be up to the caller
to make sure the pass and the module are in sync.

llvm-svn: 202204
2014-02-25 23:25:17 +00:00
Rafael Espindola 935125126c Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

llvm-svn: 202168
2014-02-25 17:30:31 +00:00
Rafael Espindola aeff8a9c05 Make some DataLayout pointers const.
No functionality change. Just reduces the noise of an upcoming patch.

llvm-svn: 202087
2014-02-24 23:12:18 +00:00
Rafael Espindola 90c7f1cc16 Replace the F_Binary flag with a F_Text one.
After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on windows.
Forgetting to set F_Text produces one that cannot be read in notepad, which
is a better failure mode :-)

llvm-svn: 202052
2014-02-24 18:20:12 +00:00
Rafael Espindola 7dbcdd08c2 Don't make F_None the default.
This will make it easier to switch the default to being binary files.

llvm-svn: 202042
2014-02-24 15:07:20 +00:00
Rafael Espindola 5f57f462a8 Rename a few more DataLayout variables from TD to DL.
llvm-svn: 201870
2014-02-21 18:34:28 +00:00
Sebastian Pop 64f12d5324 fix a corner case in delinearization
handle special cases Step==1, Step==-1, GCD==1, and GCD==-1

llvm-svn: 201868
2014-02-21 18:15:15 +00:00
Sebastian Pop 29026d3e52 normalize the last delinearized dimension
in the dependence test, we used to discard some information that the
delinearization provides: the size of the innermost dimension of an array,
i.e., the size of scalars stored in the array, and the remainder of the
delinearization that provides the offset from which the array reads start,
i.e., the base address of the array.

To avoid losing this data in the rest of the data dependence analysis, the fix
is to multiply the access function in the last delinearized dimension by its
size, effectively making the size of the last dimension to always be in bytes,
and then add the remainder of delinearization to the last subscript,
effectively making the last subscript start at the base address of the array.

llvm-svn: 201867
2014-02-21 18:15:11 +00:00
Sebastian Pop 5133d2e9d4 fail delinearization when the size of subscripts differs
Because the delinearization is not a global analysis pass, it will compute the
delinearization independently of knowledge about the way the delinearization
happened for other data accesses to the same array: the dependence analysis will
only trigger the delinearization on a tuple of access functions, and thus
delinearization may compute different subscripts sizes for a same array.  When
that happens the safest is to discard the delinearized information.

llvm-svn: 201866
2014-02-21 18:15:07 +00:00
Rafael Espindola 37dc9e19f5 Rename many DataLayout variables from TD to DL.
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.

llvm-svn: 201827
2014-02-21 00:06:31 +00:00
Rafael Espindola 7c68bebb9c Rename some member variables from TD to DL.
TargetData was renamed DataLayout back in r165242.

llvm-svn: 201581
2014-02-18 15:33:12 +00:00
Arnold Schwaighofer 26f567d8a4 SCEVExpander: Try hard not to create derived induction variables in other loops
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is a value derived of the induction variable of a preceeding loop. SCEV
has cannonicalized this value to a different recurrence than the recurrence of
the preceeding loop's induction variable (the type and/or step direction) has
changed). When we come to instantiate this SCEV we created a second induction
variable in this preceeding loop.  This patch tries to base such derived
induction variables of the preceeding loop's induction variable.

This helps twolf on arm and seems to help scimark2 on x86.

Reapply with a fix for the case of a value derived from a pointer.

radar://15970709

llvm-svn: 201496
2014-02-16 15:49:50 +00:00
Arnold Schwaighofer 847d96142c Revert "SCEVExpander: Try hard not to create derived induction variables in other loops"
This reverts commit r201465. It broke an arm bot.

llvm-svn: 201466
2014-02-15 18:16:56 +00:00
Arnold Schwaighofer 1e12f8563d SCEVExpander: Try hard not to create derived induction variables in other loops
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is a value derived of the induction variable of a preceeding loop. SCEV
has cannonicalized this value to a different recurrence than the recurrence of
the preceeding loop's induction variable (the type and/or step direction) has
changed). When we come to instantiate this SCEV we created a second induction
variable in this preceeding loop.  This patch tries to base such derived
induction variables of the preceeding loop's induction variable.

This helps twolf on arm and seems to help scimark2 on x86.

radar://15970709

llvm-svn: 201465
2014-02-15 17:11:56 +00:00
Benjamin Kramer 989b92936c Reduce code duplication resulting from the ConstantVector/ConstantDataVector split.
No intended functionality change.

llvm-svn: 201344
2014-02-13 16:48:38 +00:00
Andrea Di Biagio b7882b3bd1 [Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called
'OK_NonUniformConstValue' to identify operands which are constants but
not constant splats.

The cost model now allows returning 'OK_NonUniformConstValue'
for non splat operands that are instances of ConstantVector or
ConstantDataVector.

With this change, targets are now able to compute different costs
for instructions with non-uniform constant operands.
For example, On X86 the cost of a vector shift may vary depending on whether
the second operand is a uniform or non-uniform constant.

This patch applies the following changes:
 - The cost model computation now takes into account non-uniform constants;
 - The cost of vector shift instructions has been improved in
   X86TargetTransformInfo analysis pass;
 - BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish
   between non-uniform and uniform constant operands.

Added a new test to verify that the output of opt
'-cost-model -analyze' is valid in the following configurations: SSE2,
SSE4.1, AVX, AVX2.

llvm-svn: 201272
2014-02-12 23:43:47 +00:00
Benjamin Kramer 987b850cf2 SCEV: Cast switched values to make -Wswitch more useful.
llvm-svn: 201170
2014-02-11 19:02:55 +00:00
Benjamin Kramer 5a188549ad ScalarEvolution: Analyze trip count of loops with a switch guarding the exit.
llvm-svn: 201159
2014-02-11 15:44:32 +00:00
Benjamin Kramer 3c29c0704b Make succ_iterator a real random access iterator and clean up a couple of users.
llvm-svn: 201088
2014-02-10 14:17:42 +00:00
Benjamin Kramer b8266d2062 GlobalsModRef: Unify and clean up duplicated pointer analysis code.
llvm-svn: 201087
2014-02-10 14:17:30 +00:00
Chandler Carruth d1ba2efb8f [PM] Fix horrible typos that somehow didn't cause a failure in a C++11
build but spectacularly changed behavior of the C++98 build. =]

This shows my one problem with not having unittests -- basic API
expectations aren't well exercised by the integration tests because they
*happen* to not come up, even though they might later. I'll probably add
a basic unittest to complement the integration testing later, but
I wanted to revive the bots.

llvm-svn: 200905
2014-02-06 05:17:02 +00:00
Chandler Carruth bf71a34eb9 [PM] Add a new "lazy" call graph analysis pass for the new pass manager.
The primary motivation for this pass is to separate the call graph
analysis used by the new pass manager's CGSCC pass management from the
existing call graph analysis pass. That analysis pass is (somewhat
unfortunately) over-constrained by the existing CallGraphSCCPassManager
requirements. Those requirements make it *really* hard to cleanly layer
the needed functionality for the new pass manager on top of the existing
analysis.

However, there are also a bunch of things that the pass manager would
specifically benefit from doing differently from the existing call graph
analysis, and this new implementation tries to address several of them:

- Be lazy about scanning function definitions. The existing pass eagerly
  scans the entire module to build the initial graph. This new pass is
  significantly more lazy, and I plan to push this even further to
  maximize locality during CGSCC walks.
- Don't use a single synthetic node to partition functions with an
  indirect call from functions whose address is taken. This node creates
  a huge choke-point which would preclude good parallelization across
  the fanout of the SCC graph when we got to the point of looking at
  such changes to LLVM.
- Use a memory dense and lightweight representation of the call graph
  rather than value handles and tracking call instructions. This will
  require explicit update calls instead of some updates working
  transparently, but should end up being significantly more efficient.
  The explicit update calls ended up being needed in many cases for the
  existing call graph so we don't really lose anything.
- Doesn't explicitly model SCCs and thus doesn't provide an "identity"
  for an SCC which is stable across updates. This is essential for the
  new pass manager to work correctly.
- Only form the graph necessary for traversing all of the functions in
  an SCC friendly order. This is a much simpler graph structure and
  should be more memory dense. It does limit the ways in which it is
  appropriate to use this analysis. I wish I had a better name than
  "call graph". I've commented extensively this aspect.

This is still very much a WIP, in fact it is really just the initial
bits. But it is about the fourth version of the initial bits that I've
implemented with each of the others running into really frustrating
problms. This looks like it will actually work and I'd like to split the
actual complexity across commits for the sake of my reviewers. =] The
rest of the implementation along with lots of wiring will follow
somewhat more rapidly now that there is a good path forward.

Naturally, this doesn't impact any of the existing optimizer. This code
is specific to the new pass manager.

A bunch of thanks are deserved for the various folks that have helped
with the design of this, especially Nick Lewycky who actually sat with
me to go through the fundamentals of the final version here.

llvm-svn: 200903
2014-02-06 04:37:03 +00:00
Paul Robinson af4e64d095 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Duncan P. N. Exon Smith 8e661efc00 cleanup: scc_iterator consumers should use isAtEnd
No functional change.  Updated loops from:

    for (I = scc_begin(), E = scc_end(); I != E; ++I)

to:

    for (I = scc_begin(); !I.isAtEnd(); ++I)

for teh win.

llvm-svn: 200789
2014-02-04 19:19:07 +00:00
Chandler Carruth 6b4cc8b66a [inliner] Skip debug intrinsics even earlier in computing the inline
cost so that they don't impact the vector bonus. Fundamentally, counting
unsimplified instructions is just *wrong*; it will continue to introduce
instability as things which do not generate code bizarrely impact
inlining. For example, sufficiently nested inlined functions could turn
off the vector bonus with lifetime markers just like the debug
intrinsics do. =/

This is a short-term tactical fix. Long term, I think we need to remove
the vector bonus entirely. That's a separate patch and discussion
though.

The patch to fix this provided by Dario Domizioli. I've added some
comments about the planned direction and used a heavily pruned form of
debug info intrinsics for the test case. While this debug info doesn't
work or "do" anything useful, it lets us easily test all manner of
interference easily, and I suspect this will not be the last time we
want to craft a pattern where debug info interferes with the inliner in
a problematic way.

llvm-svn: 200609
2014-02-01 10:38:17 +00:00
Chandler Carruth 394e34f5c2 [inliner] Print out extra stats about the cost, threshold, and vector
bonus in the inline cost analysis.

Split out of a patch by Dario Domizioli to commit separately.

llvm-svn: 200586
2014-01-31 22:32:32 +00:00
Matt Arsenault ee364ee729 Allow speculating llvm.sqrt, fma and fmuladd
This doesn't set errno, so this should be OK.
Also update the documentation to explicitly state
that errno are not set.

llvm-svn: 200501
2014-01-31 00:09:00 +00:00
Reid Kleckner 26af2cae05 Update optimization passes to handle inalloca arguments
Summary:
I searched Transforms/ and Analysis/ for 'ByVal' and updated those call
sites to check for inalloca if appropriate.

I added tests for any change that would allow an optimization to fire on
inalloca.

Reviewers: nlewycky

Differential Revision: http://llvm-reviews.chandlerc.com/D2449

llvm-svn: 200281
2014-01-28 02:38:36 +00:00
Nick Lewycky 629199ccb3 Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't assume that getMulExpr returns a SCEVMulExpr, it may have simplified it to something else!
llvm-svn: 200210
2014-01-27 10:47:44 +00:00
Nick Lewycky 31eaca5513 Teach SCEV to handle more cases of 'and X, CST', specifically where CST is any number of contiguous 1 bits in a row, with any number of leading and trailing 0 bits.
Unfortunately, this in turn led to some lower quality SCEVs due to some different paths through expression simplification, so add getUDivExactExpr and use it. This fixes all instances of the problems that I found, but we can make that function smarter as necessary.

Merge test "xor-and.ll" into "and-xor.ll" since I needed to update it anyways. Test 'nsw-offset.ll' analyzes a little deeper, %n now gets a scev in terms of %no instead of a SCEVUnknown.

llvm-svn: 200203
2014-01-27 10:04:03 +00:00
Juergen Ributzka f26beda7c7 Revert "Revert "Add Constant Hoisting Pass" (r200034)"
This reverts commit r200058 and adds the using directive for
ARMTargetTransformInfo to silence two g++ overload warnings.

llvm-svn: 200062
2014-01-25 02:02:55 +00:00
Hans Wennborg 4d67a2e85a Revert "Add Constant Hoisting Pass" (r200034)
This commit caused -Woverloaded-virtual warnings. The two new
TargetTransformInfo::getIntImmCost functions were only added to the superclass,
and to the X86 subclass. The other targets were not updated, and the
warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was
hiding the two new getIntImmCost variants.

We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost"
to the various subclasses, or turning it off, but I suspect that it's wrong to
leave the functions unimplemnted in those targets. The default implementations
return TCC_Free, which I don't think is right e.g. for ARM.

llvm-svn: 200058
2014-01-25 01:18:18 +00:00
Juergen Ributzka 4f3df4ad64 Add Constant Hoisting Pass
Retry commit r200022 with a fix for the build bot errors. Constant expressions
have (unlike instructions) module scope use lists and therefore may have users
in different functions. The fix is to simply ignore these out-of-function uses.

llvm-svn: 200034
2014-01-24 20:18:00 +00:00
Juergen Ributzka 50e7e80d00 Revert "Add Constant Hoisting Pass"
This reverts commit r200022 to unbreak the build bots.

llvm-svn: 200024
2014-01-24 18:40:30 +00:00
Juergen Ributzka 38b67d0caf Add Constant Hoisting Pass
This pass identifies expensive constants to hoist and coalesces them to
better prepare it for SelectionDAG-based code generation. This works around the
limitations of the basic-block-at-a-time approach.

First it scans all instructions for integer constants and calculates its
cost. If the constant can be folded into the instruction (the cost is
TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't
consider it expensive and leave it alone. This is the default behavior and
the default implementation of getIntImmCost will always return TCC_Free.

If the cost is more than TCC_BASIC, then the integer constant can't be folded
into the instruction and it might be beneficial to hoist the constant.
Similar constants are coalesced to reduce register pressure and
materialization code.

When a constant is hoisted, it is also hidden behind a bitcast to force it to
be live-out of the basic block. Otherwise the constant would be just
duplicated and each basic block would have its own copy in the SelectionDAG.
The SelectionDAG recognizes such constants as opaque and doesn't perform
certain transformations on them, which would create a new expensive constant.

This optimization is only applied to integer constants in instructions and
simple (this means not nested) constant cast experessions. For example:
%0 = load i64* inttoptr (i64 big_constant to i64*)

Reviewed by Eric

llvm-svn: 200022
2014-01-24 18:23:08 +00:00
Juergen Ributzka 3e752e7af9 Add final and owerride keywords to TargetTransformInfo's subclasses.
llvm-svn: 200021
2014-01-24 18:22:59 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Benjamin Kramer 5e1794eedb InstSimplify: Make shift, select and GEP simplifications vector-aware.
llvm-svn: 200016
2014-01-24 17:09:53 +00:00
Matt Arsenault 339506d151 Get right cost for addrspacecast in cost model
llvm-svn: 199833
2014-01-22 20:30:16 +00:00
Chandler Carruth 043949d446 [PM] Make the verifier work independently of any pass manager.
This makes the 'verifyFunction' and 'verifyModule' functions totally
independent operations on the LLVM IR. It also cleans up their API a bit
by lifting the abort behavior into their clients and just using an
optional raw_ostream parameter to control printing.

The implementation of the verifier is now just an InstVisitor with no
multiple inheritance. It also is significantly more const-correct, and
hides the const violations internally. The two layers that force us to
break const correctness are building a DomTree and dispatching through
the InstVisitor.

A new VerifierPass is used to implement the legacy pass manager
interface in terms of the other pieces.

The error messages produced may be slightly different now, and we may
have slightly different short circuiting behavior with different usage
models of the verifier, but generally everything works equivalently and
this unblocks wiring the verifier up to the new pass manager.

llvm-svn: 199569
2014-01-19 02:22:18 +00:00
Arnold Schwaighofer e3ac099726 BasicAA: We need to check both access sizes when comparing a gep and an
underlying object of unknown size.

Fixes PR18460.

llvm-svn: 199351
2014-01-16 04:53:18 +00:00
Andrew Trick ee5aa7f71a Fix PR18449: SCEV needs more precise max BECount for multi-exit loop.
llvm-svn: 199299
2014-01-15 06:42:11 +00:00
Matt Arsenault e55a2c2e6b Make nocapture analysis work with addrspacecast
llvm-svn: 199246
2014-01-14 19:11:52 +00:00
Chandler Carruth 73523021d0 [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

llvm-svn: 199104
2014-01-13 13:07:17 +00:00
Chandler Carruth e509db410a [PM] Pull the generic graph algorithms and data structures for dominator
trees into the Support library.

These are all expressed in terms of the generic GraphTraits and CFG,
with no reliance on any concrete IR types. Putting them in support
clarifies that and makes the fact that the static analyzer in Clang uses
them much more sane. When moving the Dominators.h file into the IR
library I claimed that this was the right home for it but not something
I planned to work on. Oops.

So why am I doing this? It happens to be one step toward breaking the
requirement that IR verification can only be performed from inside of
a pass context, which completely blocks the implementation of
verification for the new pass manager infrastructure. Fixing it will
also allow removing the concept of the "preverify" step (WTF???) and
allow the verifier to cleanly flag functions which fail verification in
a way that precludes even computing dominance information. Currently,
that results in a fatal error even when you ask the verifier to not
fatally error. It's awesome like that.

The yak shaving will continue...

llvm-svn: 199095
2014-01-13 10:52:56 +00:00
Chandler Carruth 5ad5f15cff [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082
2014-01-13 09:26:24 +00:00
Chandler Carruth b8ddc7043c [PM] Rename the IR printing pass header to a more generic and correct
name to match the source file which I got earlier. Update the include
sites. Also modernize the comments in the header to use the more
recommended doxygen style.

llvm-svn: 199041
2014-01-12 11:10:32 +00:00
Stepan Dyatkovskiy 431993b57b Fixed old typo in ScalarEvolution, that caused wrong SCEVs zext operation.
Detailed description is here:
http://llvm.org/bugs/show_bug.cgi?id=18000#c16

For participation in bugfix process special thanks to David Wiberg.

llvm-svn: 198863
2014-01-09 12:26:12 +00:00
Chandler Carruth d48cdbf0c3 Put the functionality for printing a value to a raw_ostream as an
operand into the Value interface just like the core print method is.
That gives a more conistent organization to the IR printing interfaces
-- they are all attached to the IR objects themselves. Also, update all
the users.

This removes the 'Writer.h' header which contained only a single function
declaration.

llvm-svn: 198836
2014-01-09 02:29:41 +00:00
Chandler Carruth 9aca918df9 Move the LLVM IR asm writer header files into the IR directory, as they
are part of the core IR library in order to support dumping and other
basic functionality.

Rename the 'Assembly' include directory to 'AsmParser' to match the
library name and the only functionality left their -- printing has been
in the core IR library for quite some time.

Update all of the #includes to match.

All of this started because I wanted to have the layering in good shape
before I started adding support for printing LLVM IR using the new pass
infrastructure, and commandline support for the new pass infrastructure.

llvm-svn: 198688
2014-01-07 12:34:26 +00:00
Chandler Carruth 8a8cd2bab9 Re-sort all of the includes with ./utils/sort_includes.py so that
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.

Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.

llvm-svn: 198685
2014-01-07 11:48:04 +00:00
Mingjie Xing 9deac1b7c2 Fix comment of findGCD.
llvm-svn: 198660
2014-01-07 01:54:16 +00:00
Chandler Carruth c4ddab6ff2 [PM] Add a definition for the static PassID in the CallGraphAnalysis.
Missed this when adding the skeleton analysis. Caught by a build break
in the next patch I'm working on when trying to use the analysis.

llvm-svn: 198556
2014-01-05 10:38:52 +00:00