Use Instruction::comesBefore() instead of OrderedInstructions
inside InstructionPrecedenceTracking. This also removes the
dominator tree dependency.
Differential Revision: https://reviews.llvm.org/D78461
DelSet was used to avoid invalidating the current iterator while
modifying the map we are iterating over.
By using an early_inc_range, (which increments to iterator 'early',
allowing us to remove the current element), we can get rid of DelSet.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78420
The API for shuffles and reductions uses generic Type parameters,
instead of VectorType, and so assertions and casts are used a lot.
This patch makes those types explicit, which means that the clients
can't be lazy, but results in less ambiguity, and that can only be a
good thing.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562
Differential Revision: https://reviews.llvm.org/D78357
Drop visitCallInst/visitInvokeInst since the generic implementation
will delegate to visitCallBase. They appear to be from a time
before visitCallSite existed in the InstVisitor system.
Differential Revision: https://reviews.llvm.org/D78448
As suggested on D76788, this switches the LVI implementation to
return Optional<ValueLatticeElement> from various methods, instead
of passing in a ValueLatticeElement reference and returning a boolean.
Differential Revision: https://reviews.llvm.org/D78383
Cost-modeling decisions are tied to the compute interleave groups
(widening decisions, scalar and uniform values). When invalidating the
interleave groups, those decisions also need to be invalidated.
Otherwise there is a mis-match during VPlan construction.
VPWidenMemoryRecipes created initially are left around w/o converting them
into VPInterleave recipes. Such a conversion indeed should not take place,
and these gather/scatter recipes may in fact be right. The crux is leaving around
obsolete CM_Interleave (and dependent) markings of instructions along with
their costs, instead of recalculating decisions, costs, and recipes.
Alternatively to forcing a complete recompute later on, we could try
to selectively invalidate the decisions connected to the interleave
groups. But we would likely need to run the uniform/scalar value
detection parts again anyways and the extra complexity is probably not
worth it.
Fixes PR45572.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78298
This patch combines the "has" and "get" parts of the cache access.
getCachedValueInfo() now both sets the BBLV return argument, and
returns whether the value was found.
Additionally, the management of the work stack is now integrated
into getBlockValue(). If the value is not cached yet, we try to
push to the stack (and return false, indicating that we need to
solve first), or return overdefined in case of a cycle.
These changes a) avoid a duplicate cache lookup for has & get and
b) ensure that the logic is uniform everywhere. For this reason
this change is also not quite NFC, because previously overdefined
values from the cache, and overdefined values from a cycle received
different treatment when it came to assumption intersection.
Differential Revision: https://reviews.llvm.org/D76788
I uploaded the old version accidentally instead of the one with these
minor adjustments requested by the reviewers.
Differential Revision: https://reviews.llvm.org/D77855
Summary:
We can and should remove deleted nodes from their respective SCCs. We
did not do this before and this was a potential problem even though I
couldn't locally trigger an issue. Since the `DeleteNode` would assert
if the node was not in the SCC, we know we only remove nodes from their
SCC and only once (when run on all the Attributor tests).
Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku
Subscribers: hiraditya, bollu, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77855
Summary:
While it is uncommon that the ExternalCallingNode needs to be updated,
it can happen. It is uncommon because most functions listed as callees
have external linkage, modifying them is usually not allowed. That said,
there are also internal functions that have, or better had, their
"address taken" at construction time. We conservatively assume various
uses cause the address "to be taken". Furthermore, the user might have
become dead at some point. As a consequence, transformations, e.g., the
Attributor, might be able to replace a function that is listed
as callee of the ExternalCallingNode.
Since there is no function corresponding to the ExternalCallingNode, we
did just remove the node from the callee list if we replaced it (so
far). Now it would be preferable to replace it if needed and remove it
otherwise. However, removing the node has implications on the CGSCC
iteration. Locally, that caused some other nodes to be never visited
but it is for sure possible other (bad) side effects can occur. As it
seems conservatively safe to keep the new node in the callee list we
will do that for now.
Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku
Subscribers: hiraditya, bollu, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77854
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.
Differential Revision: https://reviews.llvm.org/D77994
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.
As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.
Reviewers: dsanders, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77915
Remove SmallPtrSet include, replace with forward declaration and include SmallPtrSet.h in CodeMetrics.cpp directly.
Remove unused llvm::DataLayout/Instruction forward declarations.
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.
Reviewers: jdoerfert, hfinkel
Reviewed By: jdoerfert
Subscribers: hiraditya, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77402
This is similar to the recent move/addition of "scaleShuffleMask" (D76508),
but there are a couple of differences:
1. The existing x86 helper (canWidenShuffleElements) always tries to
divide-by-2, so it gets called iteratively and wouldn't handle the
general case of non-pow-2 length.
2. The existing x86 code handles "SM_SentinelZero" - we don't have
that in IR, but this code should be safe to use with that or other
special (negative) values.
The motivation is to enable shuffle folds in instcombine/vector-combine
that are similar to D76844 and D76727, but in the reverse-bitcast direction.
Those patterns are visible in the tests for D40633.
Differential Revision: https://reviews.llvm.org/D77881
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.
While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
This reverts commit 60c642e74b.
This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.
Differential Revision: https://reviews.llvm.org/D77925
Summary:
Don't attempt to analyze the decomposed GEP for scalable type.
GEP index scale is not compile-time constant for scalable type.
Be conservative, return MayAlias.
Explicitly call TypeSize::getFixedSize() to assert on places where
scalable type doesn't make sense.
Add unit tests to check functionality of -basicaa for scalable type.
This patch is needed for D76944.
Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77828
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.
Subscribers: hiraditya, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77632
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sunfish, sdesmalen, efriedma
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77273
Ensured initialized fields; encapsulad delta calulations and evaluation
of threshold having had changed; assertion for CostThresholdMap
dereference, to indicate design intent.
Differential Revision: https://reviews.llvm.org/D77762
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.
This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74060
Use the simpler BitWidth constructor instead of the copy constructor to
make it clear when we don't actually need to copy an existing KnownBits
value. Split out from D74539. NFC.
Now compiler defines 5 sets of constants to represent rounding mode.
These are:
1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.
2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.
3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.
4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.
5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.
The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.
This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.
Differential Revision: https://reviews.llvm.org/D77379
This patch introduces the heat coloring of the Control Flow Graph which is based
on the relative "hotness" of each BB. The patch is a part of sequence of three
patches, related to graphs Heat Coloring.
Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu
Differential Revision: https://reviews.llvm.org/D77161
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.
In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.
In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)
Differential Revision: https://reviews.llvm.org/D75661
The patch introduces the system to distinctively store the information
needed for the Control Flow Graph as well as the instrumentary needed for
the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo.
The patch is a part of sequence of three patches, related to graphs Heat Coloring.
Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu
Differential Revision: https://reviews.llvm.org/D76820
The cmyk test is based on the known regression that resulted from:
rGf2fbdf76d8d0
This improves on the equivalent signed min/max change:
rG867f0c3c4d8c
The underlying icmp equivalence is:
~X pred ~Y --> Y pred X
For an icmp with constant, canonicalization results in a swapped pred:
~X < C --> X > ~C
The cmyk tests are based on the known regression that resulted from:
rGf2fbdf76d8d0
So this improvement in analysis might be enough to restore that commit.
D51664 added Instruction::comesBefore which should provide better
performance than the manual check.
Reviewers: rnk, nikic, spatel
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D76228
Forward declare DemandedBits in IVDescriptors, and move include
into the cpp file. Also drop the include from LoopUtils, which
does not need it at all.
Summary:
This patch includes two extensions:
1. It extends the GraphDiff to also keep the original list of updates
after legalization, not just the deletes/insert vectors.
It also provides an API to pop the first update (the updates are store
in reverse, such that the first update is at the end of the list)
2. It adds a bool to mark whether the given updates should be applied as
given, or applied in reverse. This moves the task of reversing the
updates (when the caller needs this) to a functionality inside
GraphDiff, versus having the caller do this.
The two changes could be split into two patches, but they seemed
reasonably small to be reviewed together.
Reviewers: kuhar, dblaikie
Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77167
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77171
This patch adds
- New arguments to getMinPrefetchStride() to let the target decide on a
per-loop basis if software prefetching should be done even with a stride
within the limit of the hw prefetcher.
- New TTI hook enableWritePrefetching() to let a target do write prefetching
by default (defaults to false).
- In LoopDataPrefetch:
- A search through the whole loop to gather information before emitting any
prefetches. This way the target can get information via new arguments to
getMinPrefetchStride() and emit prefetches more selectively. Collected
information includes: Does the loop have a call, how many memory
accesses, how many of them are strided, how many prefetches will cover
them. This is NFC to before as long as the target does not change its
definition of getMinPrefetchStride().
- If a previous access to the same exact address was 'read', and the
current one is 'write', make it a 'write' prefetch.
- If two accesses that are covered by the same prefetch do not dominate
each other, put the prefetch in a block that dominates both of them.
- If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.
- A SystemZ implementation of getMinPrefetchStride().
Review: Ulrich Weigand, Michael Kruse
Differential Revision: https://reviews.llvm.org/D70228