Commit Graph

9885 Commits

Author SHA1 Message Date
Nikita Popov bac97993ca [CaptureTracking] Avoid duplicate shouldExplode() check (NFCI)
We check shouldExplore() before adding uses to the worklist, so
uses that should not be explored will not reach captured() in the
first place.
2020-11-07 10:16:58 +01:00
Kazu Hirata 118c3f3cf2 [BranchProbabilityInfo] Simplify getEdgeProbability (NFC)
The patch simplifies BranchProbabilityInfo::getEdgeProbability by
handling two cases separately, depending on whether we have edge
probabilities.

- If we have edge probabilities, then add up probabilities for
  successors being equal to Dst.

- Otherwise, return the number of ocurrences divided by the total
  number of successors.

Differential Revision: https://reviews.llvm.org/D90980
2020-11-06 22:47:22 -08:00
Kazu Hirata 30929d1f7b [BranchProbabilityInfo] Use succ_size (NFC) 2020-11-06 11:05:35 -08:00
Simon Pilgrim 20f87d82ed [InstCombine] computeKnownBitsMul - use KnownBits::isNonZero() helper.
Avoid an expensive isKnownNonZero() call - this is a small cleanup before moving the extra NSW functionality from computeKnownBitsMul into KnownBits::computeForMul.
2020-11-06 17:27:13 +00:00
Yevgeny Rouban 681d6c711f [BranchProbabilityInfo] Introduce method copyEdgeProbabilities(). NFC
A new method is introduced to allow bulk copy of outgoing edge
probabilities from one block to another. This can be useful when
a block is cloned from another one and we do not know if there
are edge probabilities set for the original block or not.
Copying outside of the BranchProbabilityInfo class makes the user
unconditionally set the cloned block's edge probabilities even if
they are unset for the original block.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D90839
2020-11-06 14:52:35 +07:00
Yevgeny Rouban e38c8e7590 [BranchProbabilityInfo] Remove block handles in eraseBlock()
BranchProbabilityInfo::eraseBlock() is a public method and
can be called without deleting the block itself.
This method is made remove the correspondent tracking handle
from BranchProbabilityInfo::Handles along with
the probabilities of the block. Handles.erase() call is moved
to eraseBlock().
In setEdgeProbability() we need to add the block handle only once.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D90838
2020-11-06 13:13:58 +07:00
Yevgeny Rouban 4931158d27 [BranchProbabilityInfo] Get rid of MaxSuccIdx. NFC
This refactoring allows to eliminate the MaxSuccIdx map
proposed in the commit a7b662d0.
The idea is to remove probabilities for a block BB for
all its successors one by one from first, second, ...
till N-th until they are defined in Probs. This works
because probabilities for the block are set at once for
all its successors from number 0 to N-1 and the rest
are removed if there were stale probs.
The protected method setEdgeProbability(), which set
probabilities for individual successor, is removed.
This makes it clear that the probabilities are set in
bulk by the public method with the same name.

Reviewed By: kazu, MaskRay

Differential Revision: https://reviews.llvm.org/D90837
2020-11-06 12:21:24 +07:00
Anna Thomas afe92642cc Revert "[CaptureTracking] Avoid overly restrictive dominates check"
This reverts commit 15694fd6ad.
Need to investigate and fix a failing clang test: synchronized.m.
Might need a test update.
2020-11-05 12:27:15 -05:00
Anna Thomas 15694fd6ad [CaptureTracking] Avoid overly restrictive dominates check
CapturesBefore tracker has an overly restrictive dominates check when
the `BeforeHere` and the capture point are in different basic blocks.
All we need to check is that there is no path from the capture point
to `BeforeHere` (which is less stricter than the dominates check).
See added testcase in one of the users of CapturesBefore.

Reviewed-By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90688
2020-11-05 11:38:50 -05:00
Simon Pilgrim 6729b6de1f [KnownBits] Move ValueTracking SREM KnownBits handling to KnownBits::srem. NFCI.
Move the ValueTracking implementation to KnownBits, the SelectionDAG version is more limited so I'm intending to replace that as a separate commit.
2020-11-05 14:58:33 +00:00
Simon Pilgrim e237d56b43 [KnownBits] Move ValueTracking/SelectionDAG UREM KnownBits handling to KnownBits::urem. NFCI.
Both these have the same implementation - so move them to a single KnownBits copy.

GlobalISel will be able to use this as well with minimal effort.
2020-11-05 14:30:59 +00:00
Simon Pilgrim 32bee18b84 [KnownBits] Move ValueTracking/SelectionDAG UDIV KnownBits handling to KnownBits::udiv. NFCI.
Both these have the same implementation - so move them to a single KnownBits copy.

GlobalISel will be able to use this as well with minimal effort.
2020-11-05 13:42:42 +00:00
Max Kazantsev ab7ef35d34 Revert "[SCEV] Handle non-positive case in isImpliedViaOperations"
This reverts commit 8dc98897c4.

Commited by mistake.
2020-11-05 11:27:55 +07:00
Max Kazantsev 8dc98897c4 [SCEV] Handle non-positive case in isImpliedViaOperations
We already handle non-negative case there. Add support for non-positive.
2020-11-05 11:07:37 +07:00
Atmn Patel cea0599aa7 [LangRef] Adds llvm.loop.mustprogress loop metadata
This patch adds the llvm.loop.mustprogress loop metadata. This is to be
added to loops where the frontend language requires that the loop makes
observable interactions with the environment. This is the loop-level
equivalent to the function attribute `mustprogress` defined in D86233.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D88464
2020-11-04 22:32:50 -05:00
Nikita Popov 52b86d35a4 [MemorySSA] Use provided memory location even if instruction is call
If getClobberingMemoryAccess() is called with an explicit
MemoryLocation, but the starting access happens to be a call, the
provided location is currently ignored, and alias analysis queries
will be performed against the call instruction instead. Something
similar happens if the starting access is a load with a MemoryDef.

Change the implementation to not set Q.Inst in the first place if
we want to perform a MemoryLocation-based query, to make sure it
can't be turned into an Instruction-based query along the way...

Additionally, remove the special handling that lifetime.start
intrinsics currently get. They simply report NoAlias for clobbers
between lifetime.start and other calls, but that's obviously not
right if the other call is something like a memset or memcpy. The
default behavior we get from getModRefInfo() will already do the
right thing here.

Differential Revision: https://reviews.llvm.org/D88782
2020-11-04 20:30:22 +01:00
Sanjay Patel c74db55ff5 [InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80 2020-11-04 08:12:48 -05:00
Arthur Eubanks 06926e0f01 Port print-must-be-executed-contexts and print-mustexecute to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90207
2020-11-03 21:06:46 -08:00
Fangrui Song 491dd2711f [LazyCallGraph] Build SCCs of the reference graph in order
```
// The legacy PM CGPassManager discovers SCCs this way:
for function in the source order
  tarjanSCC(function)

// While the new PM CGSCCPassManager does:
for function in the reversed source order [1]
  discover a reference graph SCC
  build call graph SCCs inside the reference graph SCC
```

In the common cases, reference graph ~= call graph, the new PM order is
undesired because for `a | b | c` (3 independent functions), the new PM will
process them in the reversed order: c, b, a. If `a <-> b <-> c`, we can see
that `-print-after-all` will report the sole SCC as `scc: (c, b, a)`.

This patch corrects the iteration order. The discovered SCC order will match
the legacy PM in the common cases.

For some tests (`Transforms/Inline/cgscc-*.ll` and
`unittests/Analysis/CGSCCPassManagerTest.cpp`), the behaviors are dependent on
the SCC discovery order and there are too many check lines for the particular
order.  This patch simply reverses the function order to avoid changing too many
check lines.

Differential Revision: https://reviews.llvm.org/D90566
2020-11-02 13:22:42 -08:00
Florian Hahn b3b993a7ad Reland "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts the revert commit 408c4408fa.

This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.

Original message:

On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.

This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.

This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.

I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
2020-11-02 15:39:29 +00:00
Nikita Popov cc91554ebb [SCEV] Delay strengthening of nowrap flags
Strengthening nowrap flags is relatively expensive. Make sure we
only do it if we're actually going to use the flags -- we don't
use them for many recursive invocations. Additionally, if we're
reusing an existing SCEV node, there's no point in trying to
strengthen the flags if we don't have any new baseline facts.

This change falls slightly short of being NFC, because the way
flags during add+addrec / mul+addrec folding are handled may be
more precise (as less operands are included in the calculation).
2020-11-01 22:18:07 +01:00
Nikita Popov 6ec56467cb [SCEV] Construct GEP expression more efficiently (NFCI)
Instead of performing a sequence of pairwise additions, directly
construct a multi-operand add expression.

This should be NFC modulo any SCEV canonicalization deficiencies.
2020-11-01 19:00:57 +01:00
Florian Hahn 799033d8c5 Reland "[SLP] Consider alternatives for cost of select instructions."
This reverts the revert commit a1b53db324.

This patch includes a fix for a reported issue, caused by
matchSelectPattern returning UMIN for selects of pointers in
some cases by looking to some connected casts.

For now, ensure integer instrinsics are only returned for selects of
ints or int vectors.
2020-10-31 16:52:36 +00:00
Arthur Eubanks 5c31b8b94f Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Florian Hahn a1b53db324 Revert "[SLP] Consider alternatives for cost of select instructions."
This reverts commit 1922570489.

This appears to cause a crash in the following example

 a, b, c;
 l() {
   int e = a, f = l, g, h, i, j;
   float *d = c, *k = b;
   for (;;)
     for (; g < f; g++) {
       k[h] = d[i];
       k[h - 1] = d[j];
       h += e << 1;
       i += e;
     }
 }

 clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c

 llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.
2020-10-30 21:26:14 +00:00
Florian Hahn 408c4408fa Revert "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts commit 73f01e3df5.

This appears to break
http://lab.llvm.org:8011/#/builders/85/builds/383.
2020-10-30 21:26:14 +00:00
Anna Thomas 7aac3a9048 [CFG] Replace hardcoded max BBs explored as CL option. NFC.
This option was hardcoded to 32. Changing this as a CL option since we
have seen some cases downstream where increasing this limit allows us to
disprove reachability.

Reviewed-By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90487
2020-10-30 15:11:48 -04:00
Arthur Eubanks 10f2a0d662 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Roman Lebedev ef22d500f7
[NFCI][SCEV] getPtrToIntExpr(): use SCEVRewriteVisitor<> for ptrtoint cast sinking
This is functionally-identical to the previous implementation,
just using a generic interface to do that instead of hand-rolled one,
with caching as a bonus. Thought the sinking is still recursive..

Note that SCEVRewriteVisitor<>'s default implementations
don't preserve NoWrap flags on Add/Mul (but does on AddRec!),
but here we know we can preserve them,
so `visitAddExpr()`/`visitMulExpr()` are specialized.
2020-10-30 17:05:14 +03:00
Florian Hahn 73f01e3df5 [TTI] Add VecPred argument to getCmpSelInstrCost.
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.

This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.

This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.

I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.

Reviewed By: dmgreen, RKSimon

Differential Revision: https://reviews.llvm.org/D90070
2020-10-30 13:49:08 +00:00
Roman Lebedev b4916918e5
[SCEV] SCEVPtrToIntExpr simplifications
If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown,
we want to sink the SCEVPtrToIntExpr into an operand,
so that the operation is performed on integers,
and eventually we end up with just an `SCEVPtrToIntExpr(SCEVUnknown)`.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89692
2020-10-30 11:13:35 +03:00
Roman Lebedev 81fc53a36a
[SCEV] Introduce SCEVPtrToIntExpr (PR46786)
And use it to model LLVM IR's `ptrtoint` cast.

This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)

As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
  and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
  to not loose pointer-ness of an expression in more cases

As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.

But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89456
2020-10-30 11:13:35 +03:00
Alina Sbirlea 9d93b150c9 [AA] Pass query info.
Pass AAQI in places where it was missed.

Part of D89991.
Author: haoranxu510 (Haoran Xu)
2020-10-29 18:07:53 -07:00
Stefanos Baziotis a3345300b6 [LCSSA] Doc for special treatment of PHIs
Differential Revision: https://reviews.llvm.org/D89739
2020-10-29 22:50:07 +02:00
Florian Hahn 1922570489 [SLP] Consider alternatives for cost of select instructions.
Some architectures do not have general vector select instructions (e.g.
AArch64). But some cmp/select patterns can be vectorized using other
instructions/intrinsics.

One example is using min/max instructions for certain patterns.

This patch updates the cost calculations for selects in the SLP
vectorizer to consider using min/max intrinsics.

This patch does not change SLP vectorizer's codegen itself to actually
generate those intrinsics, but relies on the backends to lower the
vector cmps & selects. This keeps things simple on the SLP side and
works well in practice for AArch64.

This exposes additional SLP vectorization opportunities in some
benchmarks on AArch64 (-O3 -flto).

Metric: SLP.NumVectorInstructions

Program                                        base    slp     diff
 test-suite...ications/JM/ldecod/ldecod.test   502.00  697.00  38.8%
 test-suite...ications/JM/lencod/lencod.test   1023.00 1414.00 38.2%
 test-suite...-typeset/consumer-typeset.test    56.00   65.00  16.1%
 test-suite...6/464.h264ref/464.h264ref.test   804.00  822.00   2.2%
 test-suite...006/453.povray/453.povray.test   3335.00 3357.00  0.7%
 test-suite...CFP2000/177.mesa/177.mesa.test   2110.00 2121.00  0.5%
 test-suite...:: External/Povray/povray.test   2378.00 2382.00  0.2%

Reviewed By: RKSimon, samparker

Differential Revision: https://reviews.llvm.org/D89969
2020-10-29 20:39:50 +00:00
Max Kazantsev 3fc601b641 [NFC][SCEV] Use generic predicate checkers to simplify code 2020-10-29 18:12:28 +07:00
Florian Hahn 88d6421e4c [SCEV] Match 'zext (trunc A to iB) to iY' as URem.
URem operations with constant power-of-2 second operands are modeled as
such. This patch on its own has very little impact (e.g. no changes in
CodeGen for MultiSource/SPEC2000/SPEC2006 on X86 -O3 -flto), but I'll
soon post follow-up patches that make use of it to more accurately
determine the trip multiple.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89821
2020-10-29 10:46:52 +00:00
Max Kazantsev ef129f01e9 [SCEV][NFC] Use general predicate checkers in monotonicity check
This makes the code more compact and readable.
2020-10-29 16:45:52 +07:00
Max Kazantsev a5b2e795c3 [NFC][SCEV] Refactor monotonic predicate checks to return enums instead of bools
This patch gets rid of output parameter which is not needed for most users
and prepares this API for further refactoring.
2020-10-29 16:01:25 +07:00
Philip Reames 4e4abd16a7 [Deref] Use maximum trip count instead of exact trip count
When trying to prove that a memory access touches only dereferenceable memory across all iterations of a loop, use the maximum exit count rather than an exact one.  In many cases we can't prove exact exit counts whereas we can prove an upper bound.

The test included is for a single exit loop with a min(C,V) exit count, but the true motivation is support for multiple exits loops.  It's just really hard to write a test case for multiple exits because the vectorizer (the primary user of this API), bails far before this.  For multiple exits, this allows a mix of analyzeable and unanalyzable exits when only analyzeable exits are needed to prove deref.
2020-10-28 14:33:30 -07:00
Dávid Bolvanský 49cddb90f6 [MemLoc] Adjust memccpy support in MemoryLocation::getForArgument
Use LocationSize::upperBound instead of precise since we only know an upper bound on the number of bytes read/written.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89885
2020-10-28 21:26:10 +01:00
Max Kazantsev 160a453138 Return "[IndVars] Remove monotonic checks with unknown exit count"
This reverts commit e038b60d91.
This reverts commit a0d84d8031.

This revert was a mistake. The reason of the failures was
"Use uint64_t for branch weights instead of uint32_t"

Differential Revision: https://reviews.llvm.org/D87832
2020-10-28 18:51:40 +07:00
Max Kazantsev 5ef84688fb Re-enable "[SCEV] Prove implications of different type via truncation"
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.

This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.

The return was due to revert of underlying patch that this one depends on.

Unit test temporarily disabled because the required logic in SCEV is switched
off due to compile time reasons.

Differential Revision: https://reviews.llvm.org/D89548
2020-10-28 16:02:14 +07:00
Luqman Aden 4c0a016927 Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC.
The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining
vs table-based exception handling. x86-32 is the only arch for which Windows
uses the former. 32-bit ARM would use what is called Win64SEH today, which
is a bit confusing so instead let's just rename it to be a bit more clear.

Reviewed By: compnerd, rnk

Differential Revision: https://reviews.llvm.org/D90117
2020-10-27 23:22:13 -07:00
Max Kazantsev 624fc63a05 [SCEV] Re-enable "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 3
We can sharpen the range of a AddRec if we know that it does not
self-wrap and know the symbolic iteration count in the loop. If we can
evaluate the value of AddRec on the last iteration and prove that at least
one its intermediate value lies between start and end, then no-wrap flag
allows us to conclude that all of them also lie between start and end. So
the estimate of range can be improved to union of ranges of start and end.

Switched off by default, can be turned on by flag.

Differential Revision: https://reviews.llvm.org/D89381
Reviewed By: lebedev.ri, nikic
2020-10-28 12:39:41 +07:00
Fangrui Song d69ada30e2 [BranchProbabilityInfo] Make MaxSuccIdx[Src] efficient and add a comment about the subtle eraseBlock. NFC
Follow-up to D90272.
2020-10-27 16:29:23 -07:00
Kazu Hirata a7b662d0f4 [BranchProbabilityInfo] Fix eraseBlock
This patch ensures that BranchProbabilityInfo::eraseBlock(BB) deletes
all entries in Probs associated with with BB.

Without this patch, stale entries for BB may remain in Probs after
eraseBlock(BB), leading to a situation where a newly created basic
block has an edge probability associated with it even before the pass
responsible for creating the basic block adds any edge probability to
it.

Consider the current implementation of eraseBlock(BB):

  for (const_succ_iterator I = succ_begin(BB), E = succ_end(BB); I != E; ++I) {
    auto MapI = Probs.find(std::make_pair(BB, I.getSuccessorIndex()));
    if (MapI != Probs.end())
      Probs.erase(MapI);
  }

Notice that it uses succ_begin(BB) and succ_end(BB), which are based
on BB->getTerminator().  This means that if the terminator changes
between calls to setEdgeProbability and eraseBlock, then we may not
examine all pairs associated with BB.

This is exactly what happens in MaybeMergeBasicBlockIntoOnlyPred,
which merges basic blocks A into B if A is the sole predecessor of B,
and B is the sole successor of A.  It replaces the terminator of A
with UnreachableInst before (indirectly) calling eraseBlock(A).

The patch fixes the problem by keeping track of all edge probablities
entered with setEdgeProbability in a map from BasicBlock* to a
successor index.

Differential Revision: https://reviews.llvm.org/D90272
2020-10-27 16:14:25 -07:00
Raphael Isemann e038b60d91 Revert "[IndVars] Remove monotonic checks with unknown exit count"
This reverts commit c6ca26c0bf.
This breaks stage2 builds due to hitting this assert:
```
   Assertion failed: (WeightSum <= UINT32_MAX && "Expected weights to scale down to 32 bits"), function calcMetadataWeights
```
when compiling AArch64RegisterBankInfo.cpp in LLVM.
2020-10-27 15:31:37 +01:00
Nico Weber 2a4e704c92 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c6.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.
2020-10-27 09:26:21 -04:00
Alex Richardson d323c8f791 [ValueTracking][NFC] Use Log2(Align) instead of countTrailingZeroes
The latter can probably be optimized to the same final code, but this might
help -O0 builds.
2020-10-27 12:16:45 +00:00