Commit Graph

692 Commits

Author SHA1 Message Date
Evgeny Stupachenko d6aa0d02c2 Set option enabling LSR alternative way to resolve complex solution to false.
Differential Revision: http://reviews.llvm.org/D29862

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 296959
2017-03-04 03:14:05 +00:00
Wei Mi 74d5a90fa6 [LSR] Canonicalize formula and put recursive Reg related with current loop in ScaledReg.
After rL294814, LSR formula can have multiple SCEVAddRecExprs inside of its BaseRegs.
Previous canonicalization will swap the first SCEVAddRecExpr in BaseRegs with ScaledReg.
But now we want to swap the SCEVAddRecExpr Reg related with current loop with ScaledReg.
Otherwise, we may generate code like this: RegA + lsr.iv + RegB, where loop invariant
parts RegA and RegB are not grouped together and cannot be promoted outside of loop.
With this patch, it will ensure lsr.iv to be generated later in the expr:
RegA + RegB + lsr.iv, so that RegA + RegB can be promoted outside of loop.

Differential Revision: https://reviews.llvm.org/D26781

llvm-svn: 295884
2017-02-22 21:47:08 +00:00
Evgeny Stupachenko 9909872e30 The patch introduces new way of narrowing complex (>UINT16 variants) solutions.
The new method introduced under "-lsr-exp-narrow" option (currenlty set to true).

Summary:

The method is based on registers number mathematical expectation and should be
 generally closer to optimal solution.
Please see details in comments to
 "LSRInstance::NarrowSearchSpaceByDeletingCostlyFormulas()" function
 (in lib/Transforms/Scalar/LoopStrengthReduce.cpp).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D29862

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 295704
2017-02-21 07:34:40 +00:00
Wei Mi 493fb266ed [LSR] Prevent formula with SCEVAddRecExpr type of Reg from Sibling loops
In rL294814, we allow formula with SCEVAddRecExpr type of Reg from loops
other than current loop. This is good for the case when induction variable
of outerloop being used in expr in innerloop. But it is very bad to allow
such Reg from sibling loop because we may need to add lsr.iv in other sibling
loops when scev expanding those SCEVAddRecExpr type exprs. For the testcase
below, one loop can be inserted with a bunch of lsr.iv because of LSR for
other loops. 

// The induction variable j from a loop in the middle will have initial
// value generated from previous sibling loop and exit value used by its
// next sibling loop.
void goo(long i, long j); 
long cond; 

void foo(long N) { 
long i = 0; 
long j = 0; 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
} 

The fix is to only allow formula with SCEVAddRecExpr type of Reg from current
loop or its parents.

Differential Revision: https://reviews.llvm.org/D30021

llvm-svn: 295378
2017-02-16 21:27:31 +00:00
Mikael Holmen ece84cd10c [LSR] Pointers with different address spaces are considered incompatible.
Summary:
Function isCompatibleIVType is already used as a guard before the call to

 SE.getMinusSCEV(OperExpr, PrevExpr);

in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions
to be of the same type, so we now consider two pointers with different
address spaces to be incompatible, since it is possible that the pointers
in fact have different sizes.

Reviewers: qcolombet, eli.friedman

Reviewed By: qcolombet

Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D29885

llvm-svn: 295033
2017-02-14 06:37:42 +00:00
Evgeny Stupachenko fe6f548d2d Fix PR23384 (under "-lsr-insns-cost" option)
Summary:
The patch adds instructions number generated by a solution
 to LSR cost under "-lsr-insns-cost" option.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D28307

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 294821
2017-02-11 02:57:43 +00:00
Wei Mi 8f20e63a20 [LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with outerloop.
The recommit includes some changes of testcases. No functional change to the patch.

In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr,
and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser
and dropped.

Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only
handle inner loop now so only %for.body2 will be handled.

Using the logic above, formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped
no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related
with outerloop. Only formula like
reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept
because the SCEVAddRecExpr related with outerloop is folded into the initial value of the
SCEVAddRecExpr related with current loop.

But in some cases, we do need to share the basic induction variable
reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction
variables used by LSR, so we don't want to drop the formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally.

From the existing comment, it tries to avoid considering multiple level loops at the same time.
However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other
than current loop, it is an invariant and will be simple to handle, and the formula doesn't have
to be dropped.

Differential Revision: https://reviews.llvm.org/D26429

llvm-svn: 294814
2017-02-11 00:50:23 +00:00
Matt Arsenault cb3fa37c7e LSR: Check atomic instruction pointer operands
llvm-svn: 294410
2017-02-08 06:44:58 +00:00
Matt Arsenault 1f2ca66317 LSR: Don't drop address space when type doesn't match
For targets with different addressing modes in each address space,
if this is dropped querying isLegalAddressingMode later with this
will give a nonsense result, breaking the isLegalUse assertions.

This is a candidate for the 4.0 release branch.

llvm-svn: 293542
2017-01-30 19:50:17 +00:00
Matthias Braun 8c209aa877 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Quentin Colombet 351099022a [RegisterCoalescing] Recommit the patch "Remove partial redundent copy".
In r292621, the recommit fixes a bug related with live interval update
after the partial redundent copy is moved.

This recommit solves an additional bug related to the lack of update of
subranges.

The original patch is to solve the performance problem described in
PR27827. Register coalescing sometimes cannot remove a copy because of
interference. But if we can find a reverse copy in one of the predecessor
block of the copy, the copy is partially redundent and we may remove the
copy partially by moving it to the predecessor block without the
reverse copy.

Differential Revision: https://reviews.llvm.org/D28585

Re-apply r292621

Revert "Revert rL292621. Caused some internal build bot failures in apple."

This reverts commit r292984.

Original patch: Wei Mi <wmi@google.com>
Subrange fix: Mostly Matthias Braun <matze@braunis.de>

llvm-svn: 293353
2017-01-28 01:05:27 +00:00
David Majnemer bba17390c7 [LoopStrengthReduce] Don't bother rewriting PHIs in catchswitch blocks
The catchswitch instruction cannot be split, don't bother trying to
rewrite it.

This fixes PR31627.

llvm-svn: 291966
2017-01-13 22:24:27 +00:00
Chandler Carruth 3bab7e1a79 [PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

llvm-svn: 291662
2017-01-11 09:43:56 +00:00
Chandler Carruth 410eaeb064 [PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

llvm-svn: 291651
2017-01-11 06:23:21 +00:00
Evgeny Stupachenko 0c4300fac7 Fix LSR best register search algorithm.
Summary:
Fix a case when first register in a search has maximum
RegUses.getUsedByIndices(Reg).count()

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D26877

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 288278
2016-11-30 22:23:51 +00:00
Eugene Zelenko a3fe70d233 Fix some Clang-tidy and Include What You Use warnings; other minor fixes (NFC).
This preparation to remove SetVector.h dependency on SmallSet.h.

llvm-svn: 288256
2016-11-30 17:48:10 +00:00
Evgeny Stupachenko 8efbe6acae LSR debug fix.
Summary:
Dump instruction instead of address.
Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D26877

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 287584
2016-11-21 21:55:03 +00:00
Wei Mi 37c4aaaf52 Revert r286999 which caused buildbot test failures. Some testcases need to be made target specific.
llvm-svn: 287014
2016-11-15 19:42:05 +00:00
Wei Mi 7ccf7651c0 [LSR] Allow formula containing Reg for SCEVAddRecExpr related with outerloop.
In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr,
and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser
and dropped.

Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only
handle inner loop now so only %for.body2 will be handled.

Using the logic above, formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped
no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related
with outerloop. Only formula like
reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept
because the SCEVAddRecExpr related with outerloop is folded into the initial value of the
SCEVAddRecExpr related with current loop.

But in some cases, we do need to share the basic induction variable
reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction
variables used by LSR, so we don't want to drop the formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally.

From the existing comment, it tries to avoid considering multiple level loops at the same time.
However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other
than current loop, it is an invariant and will be simple to handle, and the formula doesn't have
to be dropped.

Differential Revision: https://reviews.llvm.org/D26429

llvm-svn: 286999
2016-11-15 18:35:53 +00:00
Alexandros Lamprineas 0ee3ec2fe4 [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.

Differential Revision: https://reviews.llvm.org/D26185

llvm-svn: 286347
2016-11-09 08:53:07 +00:00
Justin Lebar 54b0be048e [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.

It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range.  But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.

Fixes PR30914.

Reviewers: jeremyhu

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26323

llvm-svn: 286038
2016-11-05 16:47:25 +00:00
Jonas Paulsson 7a79422536 [LoopStrenghtReduce] Refactoring and addition of a new target cost function.
Refactored so that a LSRUse owns its fixups, as oppsed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.

New target hook isFoldableMemAccessOffset(), which is used during formula
rating.

For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.

Updated tests:
test/CodeGen/SystemZ/loop-01.ll

Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152

llvm-svn: 278927
2016-08-17 13:24:19 +00:00
James Molloy 196ad0823e [LSR] Don't try and create post-inc expressions on non-rotated loops
If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs!

Motivating testcase:

    void f(float *a, float *b, float *c, int n) {
      while (n-- > 0)
        *c++ = *a++ + *b++;
    }

It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage.

llvm-svn: 278658
2016-08-15 07:53:03 +00:00
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
Geoff Berry d01828096f [SCEV] Update interface to handle SCEVExpander insert point motion.
Summary:
This is an extension of the fix in r271424.  That fix dealt with builder
insert points being moved by SCEV expansion, but only for the lifetime
of the expand call.  This change modifies the interface so that LSR can
safely call expand multiple times at the same insert point and do the
right thing if one of the expansions decides to move the original insert
point.

This is a fix for PR28719.

Reviewers: sanjoy

Subscribers: llvm-commits, mcrosier, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23342

llvm-svn: 278413
2016-08-11 21:05:17 +00:00
Sean Silva 0746f3bfa4 Consistently use LoopAnalysisManager
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278079
2016-08-09 00:28:52 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Davide Italiano 709d41819b [LoopStrengthReduce] Fix -Wmisleading-indentation. Reported by GCC6.
llvm-svn: 274773
2016-07-07 17:44:38 +00:00
David Majnemer d770877328 Switch more loops to be range-based
This makes the code a little more concise, no functional change is
intended.

llvm-svn: 273644
2016-06-24 04:05:21 +00:00
Geoff Berry 43e5160d0e Reapply [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Originally reviewed in http://reviews.llvm.org/D18001

Reviewers: atrick

Subscribers: llvm-commits, mzolotukhin, mcrosier

Differential Revision: http://reviews.llvm.org/D18480

llvm-svn: 271929
2016-06-06 19:10:46 +00:00
Craig Topper 8287fd8abd [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Craig Topper a423aa4642 [X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this. Just seemed like we should be consistent.

llvm-svn: 270819
2016-05-26 04:28:45 +00:00
Craig Topper 12e322a8cf [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time.
llvm-svn: 270677
2016-05-25 06:56:32 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
David Majnemer e09d035dad [LoopStrengthReduce] Don't hoist into a catchswitch
We try to hoist the insertion point as high as possible to encourage
sharing.  However, we must be careful not to hoist into a catchswitch as
it is both an EHPad and a terminator.

llvm-svn: 264344
2016-03-24 21:40:22 +00:00
Geoff Berry 56fabf9b55 Revert "[LSR] Create fewer redundant instructions."
This reverts commit r263644.  Investigating bootstrap failures.

llvm-svn: 263655
2016-03-16 19:21:47 +00:00
Geoff Berry 459b750871 [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Reviewers: atrick

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18001

llvm-svn: 263644
2016-03-16 17:29:49 +00:00
David Majnemer a53b5bbb18 [LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitches
Bail out if we have a PHI on an EHPad that gets a value from a
CatchSwitchInst.  Because the CatchSwitchInst cannot be split, there is
no good place to stick any instructions.

This fixes PR26373.

llvm-svn: 259702
2016-02-03 21:30:34 +00:00
Sanjoy Das 0de2feceb1 [SCEV] Add and use SCEVConstant::getAPInt; NFCI
llvm-svn: 255921
2015-12-17 20:28:46 +00:00
Justin Bogner 843fb204b7 LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:

- The APIs have access to pretty well any Pass state they want, so
  it's hard to tell what they may or may not do.

- Other APIs have copied these and pass around a `Pass *` even though
  they don't even use it. Some of these just hand a nullptr to the API
  since the callers don't even have a pass available.

- Passes in the new pass manager don't work like the current ones, so
  the APIs can't be used as is there.

Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.

llvm-svn: 255669
2015-12-15 19:40:57 +00:00
Davide Italiano 945d05f6a0 [LoopStrengthReduce] Mark dump() definitions as LLVM_DUMP_METHOD.
llvm-svn: 253841
2015-11-23 02:47:30 +00:00
David Majnemer 7378e7a333 [LoopStrengthReduce] Don't increment iterator past the end of the BB
We tried to move the insertion point beyond instructions like landingpad
and cleanuppad.
However, we *also* tried to move past catchpad.  This is problematic
because catchpad is also a terminator.

This fixes PR25541.

llvm-svn: 253238
2015-11-16 17:37:58 +00:00
David Majnemer b222184223 [LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad preds
We cannot really insert fixup code into a PHI's predecessor.

This fixes PR25445.

llvm-svn: 252416
2015-11-08 05:04:07 +00:00
Duncan P. N. Exon Smith be4d8cba1c Scalar: Remove remaining ilist iterator implicit conversions
Remove remaining `ilist_iterator` implicit conversions from
LLVMScalarOpts.

This change exposed some scary behaviour in
lib/Transforms/Scalar/SCCP.cpp around line 1770.  This patch changes a
call from `Function::begin()` to `&Function::front()`, since the return
was immediately being passed into another function that takes a
`Function*`.  `Function::front()` started to assert, since the function
was empty.  Note that `Function::end()` does not point at a legal
`Function*` -- it points at an `ilist_half_node` -- so the other
function was getting garbage before.  (I added the missing check for
`Function::isDeclaration()`.)

Otherwise, no functionality change intended.

llvm-svn: 250211
2015-10-13 19:26:58 +00:00
David Majnemer ba275f9947 Replace some calls to isa<LandingPadInst> with isEHPad()
No functionality change is intended.

llvm-svn: 245487
2015-08-19 19:54:02 +00:00
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00