Commit Graph

622 Commits

Author SHA1 Message Date
Mircea Trofin b1fa5ac3ba [mlgo] Factor out TensorSpec
This is a simple datatype with a few JSON utilities, and is independent
of the underlying executor. The main motivation is to allow taking a
dependency on it on the AOT side, and allow us to build a correctly-sized
buffer in the cases when the requested feature isn't supported by the
model. This, in turn, allows us to grow the feature set supported by the
compiler in a backward-compatible way; and also collect traces exposing
the new features, but starting off the older model, and continue
training from those new traces.

Differential Revision: https://reviews.llvm.org/D124417
2022-04-25 18:35:46 -07:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Nikita Popov f96428e16d [MemorySSA] Don't optimize uses during construction
This changes MemorySSA to be constructed in unoptimized form.
MemorySSA::ensureOptimizedUses() can be called to optimize all
uses (once). This should be done by passes where having optimized
uses is beneficial, either because we're going to query all uses
anyway, or because we're doing def-use walks.

This should help reduce the compile-time impact of MemorySSA for
some use cases (the reason why I started looking into this is
D117926), which can avoid optimizing all uses upfront, and instead
only optimize those that are actually queried.

Actually, we have an existing use-case for this, which is EarlyCSE.
Disabling eager use optimization there gives a significant
compile-time improvement, because EarlyCSE will generally only query
clobbers for a subset of all uses (this change is not included in
this patch).

Differential Revision: https://reviews.llvm.org/D121381
2022-03-18 09:56:16 +01:00
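The "construct unoptimized, optimize on demand, at most once" idea from f96428e16d can be sketched roughly as follows. This is a toy model, not MemorySSA itself: only the name `ensureOptimizedUses` comes from the commit message, while `LazyUseOptimizer`, `queryUse`, and `workDone` are hypothetical names made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Toy model (hypothetical): each "use" starts unoptimized and is optimized
// either individually when queried, or all at once on request.
class LazyUseOptimizer {
public:
  explicit LazyUseOptimizer(std::size_t NumUses) : Optimized(NumUses, 0) {}

  // Optimize every use, but only the first time this is called.
  void ensureOptimizedUses() {
    if (AllOptimized)
      return;
    for (std::size_t I = 0; I < Optimized.size(); ++I)
      optimizeUse(I);
    AllOptimized = true;
  }

  // A pass that only cares about some uses pays only for those.
  void queryUse(std::size_t I) { optimizeUse(I); }

  // Number of uses that have actually been optimized so far.
  std::size_t workDone() const { return Work; }

private:
  void optimizeUse(std::size_t I) {
    if (Optimized[I])
      return;
    Optimized[I] = 1;
    ++Work;
  }

  std::vector<char> Optimized; // One flag per use.
  bool AllOptimized = false;
  std::size_t Work = 0;
};
```

In this sketch, a client that queries a single use does one unit of work, while a client that walks all uses calls `ensureOptimizedUses()` up front, and repeated calls cost nothing further.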
Andrew Litteken 0c4bbd293e [IRSim] Make sure the first instruction of a block doesn't get missed if it is the first valid instruction in Module.
If an instruction is the first legal instruction in the module, and is the only legal instruction in its basic block, it will be ignored by the outliner due to a length check inherited from the older version of the outliner that was restricted to outlining within a single basic block. This removes that check, and updates any tests that broke because of it.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D120786
2022-03-13 23:13:09 -05:00
serge-sans-paille 71c3a5519d Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor:
before: 1065940348
after:  1065307662

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
2022-03-01 18:01:54 +01:00
Bill Wendling 823b32fbfb [NFC] Add #include for constants 2022-02-23 01:26:53 -08:00
Whitney Tsang e7afbea8ca [MemorySSA] Clear VisitedBlocks per query
The problem can be shown from the newly added test case.
There are two invocations to MemorySSAUpdater::moveToPlace, and the
internal data structure VisitedBlocks is changed in the first
invocation, and reused in the second invocation. In between the two
invocations, there is a change to the CFG, and MemorySSAUpdater is
notified about the change.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D119898
2022-02-18 15:36:19 -05:00
Chuanqi Xu a2609be0b2 [ValueTracking] Checking haveNoCommonBitsSet for (x & y) and ~(x | y)
This one tries to fix:
https://github.com/llvm/llvm-project/issues/53357.

Simply put, this patch checks for (x & y) and ~(x | y) in
haveNoCommonBitsSet. They can never have common bits (this can be
verified by enumerating the cases), so (x & y) + ~(x | y) can be
converted to (x & y) | ~(x | y), which the compiler can then handle in
InstCombineAndOrXor.
Furthermore, since ((x & y) + (~x & ~y)) is converted to ((x & y)
+ ~(x | y)), this patch fixes that case too.

https://alive2.llvm.org/ce/z/qsKzRS

Reviewed By: spatel, xbolva00, RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D118094
2022-02-16 13:42:52 +08:00
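The enumeration argument in a2609be0b2 can be checked with a small brute-force sketch over all pairs of 8-bit values. The function name `noCommonBitsHoldsForAllBytes` is made up for illustration, and the check is independent of the LLVM implementation of haveNoCommonBitsSet.

```cpp
#include <cstdint>

// Returns true if, for every pair of 8-bit values, (x & y) and ~(x | y)
// share no set bits, and adding them therefore equals OR-ing them.
bool noCommonBitsHoldsForAllBytes() {
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned Y = 0; Y < 256; ++Y) {
      const std::uint8_t A = static_cast<std::uint8_t>(X & Y);
      const std::uint8_t B = static_cast<std::uint8_t>(~(X | Y));
      // A bit is set in A only if set in both inputs, and in B only if
      // set in neither, so the two can never overlap.
      if ((A & B) != 0)
        return false;
      // With no common bits there are no carries, so + and | agree.
      if (static_cast<std::uint8_t>(A + B) != static_cast<std::uint8_t>(A | B))
        return false;
    }
  }
  return true;
}
```

The same bitwise reasoning applies per bit position, so the result is not specific to 8-bit values; the narrow width just keeps the enumeration small.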
Chuanqi Xu e59d6dc063 [NFC] Precommit for PR53357
Because there are other required changes in
https://reviews.llvm.org/D118094, precommit these changes to ease
reviewing. Including:
- Remove *_thwart tests.
- Remove test for (x & y) + (~x & ~y)
- Fix incorrect unittest committed before
2022-02-14 14:37:12 +08:00
Chuanqi Xu 4ee240b860 [NFC] [ValueTracking] Add unittest for haveNoCommonBitsSet 2022-02-14 14:10:30 +08:00
David Sherwood 1badfbb4fc Fix incorrect TypeSize->uint64_t cast in InductionDescriptor::isInductionPHI
The code was relying upon the implicit conversion of TypeSize to
uint64_t and assuming the type in question was always fixed. However,
I discovered an issue when running the canon-freeze pass with some
IR loops that contain scalable vector types. I've changed the code
to bail out if the size is unknown at compile time, since we cannot
compute whether the step is a multiple of the type size or not.

I added a test here:

  Transforms/CanonicalizeFreezeInLoops/phis.ll

Differential Revision: https://reviews.llvm.org/D118696
2022-02-10 09:39:12 +00:00
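The bail-out described in 1badfbb4fc can be illustrated with a simplified sketch. `SketchTypeSize`, `StepCheck`, and `stepIsMultipleOfSize` are hypothetical stand-ins, not the real `TypeSize` API or the actual code in InductionDescriptor::isInductionPHI; the point is only that a scalable size yields "unknown" instead of being silently treated as fixed.

```cpp
#include <cstdint>

// Hypothetical stand-in for a type size that may be scalable, i.e. a
// multiple of a quantity that is unknown at compile time.
struct SketchTypeSize {
  std::uint64_t KnownMinBytes;
  bool IsScalable;
};

enum class StepCheck { Unknown, Multiple, NotMultiple };

// Decide whether StepBytes is a multiple of the element size, bailing out
// when the size is not a compile-time constant.
StepCheck stepIsMultipleOfSize(std::int64_t StepBytes, SketchTypeSize Size) {
  if (Size.IsScalable || Size.KnownMinBytes == 0)
    return StepCheck::Unknown; // Cannot be computed, so do not guess.
  const std::int64_t Bytes = static_cast<std::int64_t>(Size.KnownMinBytes);
  return StepBytes % Bytes == 0 ? StepCheck::Multiple : StepCheck::NotMultiple;
}
```

A caller in this sketch would treat `StepCheck::Unknown` the same way as a failed match and give up on the transformation.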
Philip Reames d334fec140 [SCEV] Make SCEVUnionPredicate externally immutable [NFC]
This is the last major stepping stone before being able to allocate the node via the folding set allocator.  That will in turn allow more general SCEV predicate expression trees.
2022-02-09 13:47:28 -08:00
Andrew Litteken 30420bc344 [IRSim] Make sure that commutative intrinsics are treated as function calls without commutativity
Created to fix: https://github.com/llvm/llvm-project/issues/53537

Some intrinsic functions are considered commutative since they perform operations like addition or multiplication. Some of these have extra parameters that provide extra information, are not part of the operation itself, and are not commutative. This makes sure that an instruction that is an intrinsic takes the non-commutative path to handle this case.

Reviewer: paquette

Closes Issue #53537

Differential Revision: https://reviews.llvm.org/D118807
2022-02-02 13:24:56 -06:00
Andrew Litteken 3785c1d055 [IRSim][IROutliner] Allowing Intrinsic Calls to be Used in Similarity Matching and Outlined Regions
Due to some complications with lifetime, and assume-like intrinsics, intrinsics were not included as outlinable instructions. This patch opens up most intrinsics, excluding lifetime and assume-like intrinsics, to be outlined. For similarity, it is required that the intrinsic IDs, and the intrinsics names match exactly, as well as the function type. This puts intrinsics in a different class than normal call instructions (https://reviews.llvm.org/D109448), where the name will no longer have to match.

This also adds an additional command line flag debug option to disable outlining intrinsics.

Recommit of: 8de76bd569
Adds extra checking of intrinsic function calls names to avoid taking the address of intrinsic calls when extracting function calls.

Reviewers: paquette, jroelofs

Differential Revision: https://reviews.llvm.org/D109450
2022-01-28 13:52:21 -06:00
Andrew Litteken e8f4e41b6b [IRSim][IROutliner] Add support for outlining PHINodes with the rest of the region.
We use the same similarity scheme we used for branch instructions for phi nodes, and allow them to be outlined. There is not a lot of special handling needed for these phi nodes when outlining, as they simply act as outputs. The code extractor does not currently allow for non entry blocks within the extracted region to have predecessors, so there are not conflicts to handle with respect to predecessors no longer contained in the function.

Recommit of 515eec3553

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D106997
2022-01-25 18:25:50 -06:00
Andrew Litteken e50b217b4e Revert "[IRSim][IROutliner] Add support for outlining PHINodes with the rest of the region."
This reverts commit 515eec3553.

By mistake, commit message was not complete.
2022-01-25 18:24:19 -06:00
Andrew Litteken 515eec3553 [IRSim][IROutliner] Add support for outlining PHINodes with the rest of the region. 2022-01-25 18:20:10 -06:00
Andrew Litteken 9c2daf648c Revert "[IRSim][IROutliner] Allowing Intrinsic Calls to be Used in Similarity Matching and Outlined Regions"
This reverts commit 8de76bd569.

Reverting due to failure of different-intrinsics.ll on lld-x86_64-win buildbot.
2022-01-25 18:19:33 -06:00
Andrew Litteken 8de76bd569 [IRSim][IROutliner] Allowing Intrinsic Calls to be Used in Similarity Matching and Outlined Regions
Due to some complications with lifetime, and assume-like intrinsics, intrinsics were not included as outlinable instructions. This patch opens up most intrinsics, excluding lifetime and assume-like intrinsics, to be outlined. For similarity, it is required that the intrinsic IDs, and the intrinsics names match exactly, as well as the function type. This puts intrinsics in a different class than normal call instructions (https://reviews.llvm.org/D109448), where the name will no longer have to match.

This also adds an additional command line flag debug option to disable outlining intrinsics.

Reviewers: paquette, jroelofs

Differential Revision: https://reviews.llvm.org/D109450
2022-01-25 17:06:09 -06:00
Andrew Litteken f5f377d1fc [IRSim][IROutliner] Adding support for recognizing and outlining indirect function calls, and function calls with different names, but the same type
The outliner currently requires that function calls not be indirect calls, and that the function name and function type match, as well as other attributes such as calling conventions. This patch treats called functions as values, and just another operand, and named function calls as constants. This allows functions to be treated like any other constant, or input and output into the outlined functions.

There are also debugging flags added to enforce the old behaviors where indirect calls are not allowed, and to enforce the old rule that function call names must also match.

Reviewers: paquette, jroelofs

Differential Revision: https://reviews.llvm.org/D109448
2022-01-25 15:19:28 -06:00
Philip Reames 215bd46905 [MemoryBuiltins] Demote isMallocLikeFn to implementation routine since last use has been removed
Try 2, this time including the test.
2022-01-18 15:24:52 -08:00
Jan Svoboda 5f4ae56457 [llvm] Remove uses of `std::vector<bool>`
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.

This patch does just that for llvm.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D117121
2022-01-18 18:20:45 +01:00
Philip Reames ee02cf0797 [MemoryBuiltins] Demote isCallocLikeFn and isAlignedAllocLikeFn to local helpers after removal of last external use [NFC] 2022-01-13 15:51:17 -08:00
Whitney Tsang cb6b9d3ae2 [LoopNest] Add new utilities
getLoopIndex() is added to get the loop index of a given loop.
getLoopsAtDepth() is added to get the loops in the nest at a given
depth.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D115590
2022-01-13 17:19:19 -05:00
Mircea Trofin 1f5dceb1d0 [MLGO] Add support for multiple training traces per module
This happens in e.g. regalloc, where we trace decisions per function,
but wouldn't want to spew N log files (i.e. one per function). So we
output a key-value association, where the key is an ID for the
sub-module object, and the value is the tensorflow::SequenceExample.

The current relation with protobuf is tenuous, so we're avoiding a
custom message type in favor of using the `Struct` message, but that
requires the values be wire-able strings, hence base64 encoding.

We plan on resolving the protobuf situation shortly, and improve the
encoding of such logs, but this is sufficient for now for setting up
regalloc training.

Differential Revision: https://reviews.llvm.org/D116985
2022-01-11 16:13:31 -08:00
Mircea Trofin b7f298f174 [NFC][MLGO] Use ASSERT_TRUE in TFUtilsTest, where appropriate. 2022-01-11 16:10:55 -08:00
Nikita Popov 92d55e7336 [MemoryBuiltins] Remove isNoAliasFn() in favor of isNoAliasCall()
We currently have two similar implementations of this concept:
isNoAliasCall() only checks for the noalias return attribute.
isNoAliasFn() also checks for allocation functions.

We should switch to only checking the attribute. SLC is responsible
for inferring the noalias return attribute for non-new allocation
functions (with a missing case fixed in
348bc76e35).
For new, clang is responsible for setting the attribute,
if -fno-assume-sane-operator-new is not passed.

Differential Revision: https://reviews.llvm.org/D116800
2022-01-10 09:18:15 +01:00
Sanjay Patel 0edf99950e [Analysis] allow caller to choose signed/unsigned when computing constant range
We should not lose analysis precision if an 'add' has both no-wrap
flags (nsw and nuw) compared to just one or the other.

This patch is modeled on a similar construct that was added with
D59386.

I don't think it is possible to expose a problem with an unsigned
compare because of the way this was coded (nuw is handled first).

InstCombine has an assert that fires with the example from:
https://github.com/llvm/llvm-project/issues/52884
...because it was expecting InstSimplify to handle this kind of
pattern with an smax.

Fixes #52884

Differential Revision: https://reviews.llvm.org/D116322
2021-12-28 09:45:37 -05:00
Sanjay Patel a56803b8f8 [Analysis] fix cast in ValueTracking to allow constant expression
The test would crash because a non-instruction negate op made it in here.

Fixes #51506
2021-12-20 17:16:47 -05:00
Momchil Velikov 6192c312cf [AA] Correctly maintain the sign of PartialAlias offset
Preserve the invariant that offset reported in the case of a
`PartialAlias` between `Loc1` and `Loc2`, is such that
`Loc1 + Offset = Loc2`, where `Loc1` and `Loc2` are the first and
the second argument, respectively, in alias queries.

Differential Revision: https://reviews.llvm.org/D115927
2021-12-17 15:45:26 +00:00
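The invariant from 6192c312cf, `Loc1 + Offset = Loc2`, can be illustrated with plain byte ranges. `SketchLoc`, `overlaps`, and `offsetFromFirstToSecond` are hypothetical names for this sketch and do not correspond to the alias analysis API.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory location: start address and size in bytes.
struct SketchLoc {
  std::int64_t Start;
  std::int64_t Size;
};

// True if the two byte ranges share at least one byte.
bool overlaps(SketchLoc Loc1, SketchLoc Loc2) {
  return Loc1.Start < Loc2.Start + Loc2.Size &&
         Loc2.Start < Loc1.Start + Loc1.Size;
}

// Signed offset such that Loc1.Start + Offset == Loc2.Start. The sign
// depends on the argument order, so swapping the arguments negates it.
std::int64_t offsetFromFirstToSecond(SketchLoc Loc1, SketchLoc Loc2) {
  return Loc2.Start - Loc1.Start;
}
```

For two overlapping ranges starting at 0 and 4, the offset is 4 when the first range is passed as `Loc1` and -4 when the arguments are swapped, which is the sign relationship the commit message asks to preserve.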
Mircea Trofin 059e03476c [NFC][mlgo] Generalize model runner interface
This prepares it for the regalloc work. Part of it is making model
evaluation across 'development' and 'release' scenarios more reusable.
This patch:
- extends support to tensors of any shape (not just scalars, like we had
in the inliner -Oz case). While the tensor shape can be anything, we
assume row-major layout and expose the tensor as a buffer.
- exposes the NoInferenceModelRunner, which we use in the 'development'
mode to keep the evaluation code path consistent and simplify logging,
as we'll want to reuse it in the regalloc case.

Differential Revision: https://reviews.llvm.org/D115306
2021-12-08 20:10:58 -08:00
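The row-major assumption in 059e03476c means a tensor of any shape can be addressed through one flat buffer. A minimal sketch of that indexing, using the hypothetical helpers `elementCount` and `rowMajorOffset` (neither is part of the model runner interface):

```cpp
#include <cstddef>
#include <vector>

// Number of elements in a tensor with the given shape. A scalar is
// represented by an empty shape and has exactly one element.
std::size_t elementCount(const std::vector<std::size_t> &Shape) {
  std::size_t N = 1;
  for (std::size_t Dim : Shape)
    N *= Dim;
  return N;
}

// Flat position of a multi-dimensional index in a row-major buffer: the
// last dimension varies fastest. Assumes Index has one entry per dimension
// and each entry is within bounds.
std::size_t rowMajorOffset(const std::vector<std::size_t> &Shape,
                           const std::vector<std::size_t> &Index) {
  std::size_t Offset = 0;
  for (std::size_t I = 0; I < Shape.size(); ++I)
    Offset = Offset * Shape[I] + Index[I];
  return Offset;
}
```

For a shape of {2, 3, 4}, the buffer holds 24 elements and index {1, 2, 3} lands at position (1 * 3 + 2) * 4 + 3 = 23, the last slot.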
Florian Hahn ad88a37cea [TLI] Add memset_pattern4, memset_pattern8 lib functions.
Similar to memset_pattern16, memset_pattern4, memset_pattern8 are
available on Darwin platforms.

https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/memset_pattern4.3.html

Reviewed By: ab

Differential Revision: https://reviews.llvm.org/D114881
2021-12-01 21:18:19 +00:00
duanbo.db 53dc525828 [LoopInfo] Fix function getInductionVariable
The way function gets the induction variable is by judging whether
StepInst or IndVar in the phi statement is one of the operands of CMP.
But if the LatchCmpOp0/LatchCmpOp1 is a constant, the subsequent
comparison may result in null == null, which is meaningless. This patch
fixes the typo.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D112980
2021-11-11 16:22:42 +08:00
Michael Liao bf225939bc [InferAddressSpaces] Support assumed addrspaces from addrspace predicates.
- CUDA cannot associate memory space with pointer types. Even though Clang could add extra attributes to specify the address space explicitly on a pointer type, it breaks the portability between Clang and NVCC.
- This change proposes to infer the address space of a pointer from assumptions built upon target-specific address space predicates, such as `__isGlobal` from CUDA. E.g.,

```
  __device__ void foo(float *p) {
    __builtin_assume(__isGlobal(p));
    // From there, we could assume p is a global pointer instead of a
    // generic one.
  }
```

This makes the code portable without introducing the implementation-specific features.

Note that NVCC starts to support __builtin_assume from version 11.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112041
2021-11-08 16:51:57 -05:00
Liren Peng 57e093162e [ScalarEvolution] Infer loop max trip count from array accesses
Data references in a loop should not access elements over the
statically allocated size. So we can infer a loop max trip count
from this undefined behavior.

Reviewed By: reames, mkazantsev, nikic

Differential Revision: https://reviews.llvm.org/D109821
2021-11-03 10:40:18 +08:00
Arthur Eubanks 029f1a5344 [LazyCallGraph] Skip blockaddresses
blockaddresses do not participate in the call graph since the only
instructions that use them must all return to someplace within the
current function. And passes cannot retrieve a function address from a
blockaddress.

This was suggested by efriedma in D58260.

Fixes PR50881.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112178
2021-11-01 13:10:24 -07:00
Simon Pilgrim 2e5daac217 [llvm] Update report_fatal_error calls from raw_string_ostream to use Twine(OS.str())
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.

We can use the raw_string_ostream::str() method to perform the implicit flush() and return a reference to the std::string container that we can then wrap inside Twine().
2021-10-05 18:42:12 +01:00
Alex Richardson 3c51b9e270 Fix incorrect GEP bitwidth in areNonOverlapSameBaseLoadAndStore()
When using a datalayout that has pointer index width != pointer size this
code triggers an assertion in Value::stripAndAccumulateConstantOffsets().
I encountered this while compiling FreeBSD for CHERI-RISC-V.
Also update LoadsTest.cpp to use a DataLayout with index width != pointer
width to ensure this case is tested.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D110406
2021-09-28 17:57:36 +01:00
Nikita Popov ba664d9066 [AA] Move earliest escape tracking from DSE to AA
This is a followup to D109844 (and alternative to D109907), which
integrates the new "earliest escape" tracking into AliasAnalysis.
This is done by replacing the pre-existing context-free capture
cache in AAQueryInfo with a replaceable (virtual) object with two
implementations: The SimpleCaptureInfo implements the previous
behavior (check whether object is captured at all), while
EarliestEscapeInfo implements the new behavior from DSE.

This combines the "earliest escape" analysis with the full power of
BasicAA: It subsumes the call handling from D109907, considers a
wider range of escape sources, and works with AA recursion. The
compile-time cost is slightly higher than with D109907.

Differential Revision: https://reviews.llvm.org/D110368
2021-09-25 22:40:41 +02:00
David Sherwood 8e4f7b749c [Analysis] Fix another issue when querying vscale attributes on functions
There are several places in the code that are currently broken where
we assume an Instruction is always a member of a BasicBlock that
lives in a Function. This is a problem specifically when
attempting to get the vscale_range attribute. This patch adds checks
that an Instruction's parent also has a parent!

I've added a test for a function-less @llvm.vscale intrinsic call here:

  unittests/Analysis/ValueTrackingTest.cpp
2021-09-24 13:37:23 +01:00
David Sherwood c2634fc6ab [Analysis] Fix issues when querying vscale attributes on functions
There are several places in the code that are currently broken as
they assume an Instruction always has a parent Function when
attempting to get the vscale_range attribute. This patch adds checks
that an Instruction has a parent.

I've added a test for a parentless @llvm.vscale intrinsic call here:

  unittests/Analysis/ValueTrackingTest.cpp

Differential Revision: https://reviews.llvm.org/D110158
2021-09-24 09:58:10 +01:00
Florian Hahn 5131037ea9 [ValueTracking,VectorCombine] Allow passing DT to computeConstantRange.
isValidAssumeForContext can provide better results with access to the
dominator tree in some cases. This patch adjusts computeConstantRange to
allow passing through a dominator tree.

The use in VectorCombine is updated to pass through the DT to enable
additional scalarization.

Note that similar APIs like computeKnownBits already accept optional dominator
tree arguments.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D110175
2021-09-21 16:54:47 +01:00
Florian Mayer 0a22510f3e [value-tracking] see through returned attribute.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D109675
2021-09-13 20:52:26 +01:00
Arthur Eubanks b493124ae2 [MemorySSA] Support invariant.group metadata
The implementation is mostly copied from MemDepAnalysis. We want to look
at all loads and stores to the same pointer operand. Bitcasts and zero
GEPs of a pointer are considered the same pointer value. We choose the
most dominating instruction.

Since updating MemorySSA with invariant.group is non-trivial, for now
handling of invariant.group is not cached in any way, so it's part of
the walker. The number of loads/stores with invariant.group is small for
now anyway. We can revisit if this actually noticeably affects compile
times.

To avoid invariant.group affecting optimized uses, we need to have
optimizeUsesInBlock() not use invariant.group in any way.

Co-authored-by: Piotr Padlewski <prazek@google.com>

Reviewed By: asbirlea, nikic, Prazek

Differential Revision: https://reviews.llvm.org/D109134
2021-09-08 13:06:12 -07:00
Andrew Litteken bd4b1b5f6d [IRSim] Adding support for recognizing branch similarity
The current IRSimilarityIdentifier does not try to find similarity across blocks, this patch provides a mechanism to compare two branches against one another, to find similarity across basic blocks, rather than just within them.

This adds a step in the similarity identification process that labels all of the basic blocks so that we can identify the relative branching locations. Within an IRSimilarityCandidate we use these relative locations to determine whether the branching to other relative locations in the same region is the same between branches. If it is, we consider them similar.

We do not consider the relative location of the branch if the target branch is outside of the region. In this case, both branches must exit to a location outside the region, but the exact relative location does not matter.

Reviewers: paquette, yroux

Differential Revision: https://reviews.llvm.org/D106989
2021-09-06 11:55:38 -07:00
Andrew Litteken 063af63b96 [IRSim][IROutliner] Canonicalizing commutative value numbering between similarity sections.
When the initial relationship between two pairs of values between
similar sections is ambiguous to commutativity, arguments to the
outlined functions can be passed in such that the order is incorrect,
causing miscompilations.  This adds a canonical mapping to each
similarity section, so that we can maintain the relationship of global
value numbering from one section to another.

Added Tests:
Transforms/IROutliner/outlining-commutative-operands-opposite-order.ll
unittests/Analysis/IRSimilarityIdentifierTest.cpp - IRSimilarityCandidate:CanonicalNumbering

Reviewers: jroelofs, jpaquette, yroux

Differential Revision: https://reviews.llvm.org/D104143
2021-08-27 15:02:56 -07:00
Mark Danial 4018d25da8 LoopNest Analysis expansion to return instructions that prevent a Loop Nest from being perfect

Expand LoopNestAnalysis to return the full list of instructions that
cause a loop nest to be imperfect. This is useful for other passes to
know if they should continue into the inner loops.
Added a new function, getInterveningInstructions,
that returns a small vector with the instructions that prevent a loop
nest from being perfect. Also added a couple of helper functions to reduce
code duplication.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D107773
2021-08-17 22:25:49 +00:00
Mircea Trofin ae1a2a09e4 [NFC][MLGO] Make logging more robust
1) add some self-diagnosis (when asserts are enabled) to check that all
features have the same nr of entries

2) avoid storing pointers to mutable fields because the proto API
contract doesn't actually guarantee those stay fixed even if no further
mutation of the object occurs.

Differential Revision: https://reviews.llvm.org/D107594
2021-08-06 04:44:52 -07:00
Chang-Sun Lin, Jr b58eda39eb [ValueTracking] Fix computeConstantRange to use "may" instead of "always" semantics for llvm.assume
ValueTracking should allow for value ranges that may satisfy
llvm.assume, instead of restricting the ranges only to values that
will always satisfy the condition.

Differential Revision: https://reviews.llvm.org/D107298
2021-08-02 22:20:17 +02:00
Paul Walker 8a8d01d58c [NFC] Change VFShape so it contains an ElementCount rather than separate VF and IsScalable properties.
Differential Revision: https://reviews.llvm.org/D106750
2021-07-26 12:25:46 +01:00