This reverts commit 5630143af3. The
implementation in this commit was incorrect. Also, handling this representation
of equalities in the upcoming support for symbolic lexicographic minimization
makes that patch much more complex. It will be easier to review that without
this representaiton and then reintroduce the fixed column representation
later, hence the revert rather than a bug fix.
While this claims to be the base class for fixed and scalable
vectors, this is no longer the case since D94405. In fact,
LLVMVectorType is not usable, since the methods it declares are
never defined. Remove this leftover.
Differential Revision: https://reviews.llvm.org/D122707
Allow conversion of a diagnostic to FailureOr. This conversion only results
in `failure` because in the case where operator LogicalResult would return
success, the FailureOr constructor would assert.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D122596
The op documentation of the ops of the scf dialect used /**/ style block
comments which doesn't seem to exists (anymore?).
Reviewed By: nicolasvasilache, ingomueller-net
Differential Revision: https://reviews.llvm.org/D122565
linalg.generic can also take scalars instead of tensors, which
tensor.cast doesn't support. We don't have an easy way to cast between
scalars and tensors so just keep the linalg.generic in those cases.
Differential Revision: https://reviews.llvm.org/D122575
Allow default-valued attributes to be passed as null to builders, which will not add the attribute if not present, so the default value will be returned by getters.
We are using "enable-index-optimizations" and "indexOptimizations" as
names for an optimization that consists of using i32 for indices within
a vector. For instance, when building a vector comparison for mask
generation. The name is confusing and suggests a scope beyond these
vector indices. This change makes the function of the option explicit
in its name.
Differential Revision: https://reviews.llvm.org/D122415
A Block is optionally allocated & leaks in case of failed parse. Inline the
function and ensure Block gets freed unless parse is successful.
Differential Revision: https://reviews.llvm.org/D122112
we will get warning:```parameter ‘op’ set but not used``` when template function with empty template parameter list.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D122527
The change of https://reviews.llvm.org/D121513#3411651
has caused a build error when building with clang:
/mnt/vss/_work/1/llvm-project/mlir/lib/Dialect/Tosa/IR/TosaOps.cpp:599:26: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
ReduceFolder(ReduceAllOp);
Reviewed By: hpmorgan, Mogball
Differential Revision: https://reviews.llvm.org/D122599
Adds another pointer to the union in RegionRange to allow RegionRange to work on ArrayRef<Region *> (i.e. vectors of Region *).
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D122514
(This was a TODO from the initial patch).
The control-flow sink utility accepts a callback that is used to sink an operation into a region.
The `moveIntoRegion` is called on the same operation and region that return true for `shouldMoveIntoRegion`.
The callback must preserve the dominance of the operation within the region. In the default control-flow
sink implementation, this is moving the operation to the start of the entry block.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D122445
This has been on _Both for a couple of weeks. Flip usages in core with
intention to flip flag to _Prefixed in follow up. Needed to add a couple
of helper methods in AffineOps and Linalg to facilitate a pure flag flip
in follow up as some of these classes are used in templates and so
sensitive to Vector dialect changes.
Differential Revision: https://reviews.llvm.org/D122151
- Adds default implementations of `isDefinedOutsideOfLoop` and `moveOutOfLoop` since 99% of all implementations of these functions were identical
- `moveOutOfLoop` takes one operation and doesn't return anything anymore. 100% of all implementations of this function would always return `success` and uses would either respond with a pass failure or an `llvm_unreachable`.
This commit adds MemWrite (and MemRead, as appropriate) to the arugments of the memset/memcpy/memmove intrinsics within the LLVM dialect
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122552
This revision supports padding only a subset of the iteration dimensions via an additional padding-dimensions parameter. This control allows us to pad an operation in multiple steps. For example, one may want to pad only the output dimensions of a producer matmul fused into a consumer loop nest, before tiling and padding its reduction dimension.
Depends On D122309
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D122560
Pass the padding options using arrays instead of lambdas. In particular pass the padding value as string and use the argument parser to create the padding value. Arrays are a more natural choice that matches the current use cases and avoids converting arrays to lambdas.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D122309
This patch adds the ReductionClauseInterface and also adds reduction
support for `omp.parallel` operation.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D122402
This is useful for the case that we don't need to implement
reifyReturnTypeShapes.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D121403
This patch adds MLIR NVVM support for the various NVPTX `mma.sync`
operations. There are a number of possible data type, shape,
and other attribute combinations supported by the operation, so a
custom assebmly format is added and attributes are inferred where
possible.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D122410
In order to run these integration tests, it is required access to an
SVE-enabled CPU or and emulator with SVE support. In case of using
an emulator, aarch64 versions of lli and the MLIR C Runner Utils Library
are also required.
Differential Revision: https://reviews.llvm.org/D104517
The diff is big, but there are in fact only three kinds of changes
* ir.py had a synax error -- underminated [
* forward references are unnecessary in .pyi files (see 9a76b13127/CONTRIBUTING.md?plain=1#L450-L454)
* methods defined via .def_static() are now decorated with @staticmethod
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122300
Use "enable-vla-vectorization=vla" to generate a vector length agnostic
loops during vectorization. This option works for vectorization strategy 2.
Differential Revision: https://reviews.llvm.org/D118379
The way vector.create_mask is currently lowered is
vector-length-dependent, and therefore incompatible with scalable vector
types. This patch adds an alternative lowering path for create_mask
operations that return a scalable vector mask.
Differential Revision: https://reviews.llvm.org/D118248
This patch is a cleanup patch that merges PresburgerLocalSpace and
PresburgerSpace. Asserting that there are no locals is shifted to the
users of PresburgerSpace themselves.
The reasoning for this patch is that PresburgerLocalSpace did not contribute
much and only introduced additional complexity as locals could still be present
in PresburgerSpace, just not writable. This could introduce problems if a
PresburgerSpace with locals was copied to PresburgerLocalSpace which expected
no locals in a PresburgerSpace.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D122353
This transformation allow to break up a reduction dimension in a
parallel and a reduction dimension. This is followed by a separate
reduction op. This allows to generate tree reduction which is beneficial
on target allowing to take advantage parallelism.
Differential Revision: https://reviews.llvm.org/D122045
This patch adds the nowait parameter to `createSingle` in
OpenMPIRBuilder and handling for IR generation from OpenMP Dialect.
Also added tests for the same.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122371
Previously, only LinalgOps whose operands are defined by an ExtractSliceOp could be padded. The revision supports walking a use-def chain of LinalgOps to find an ExtractSliceOp.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D122116
This revision introduces a heuristic to stop fusion for shape-only tensors. A shape-only tensor only defines the shape of the consumer computation while the data is not used. Pure producer consumer fusion thus shall not fuse the producer of a shape-only tensor. In particular, since the shape-only tensor will have other uses that actually consume the data.
The revision enables fusion for consumers that have two uses of the same tensor. One as input operand and one as shape-only output operand. In these cases, we want to fuse only the input operand and avoid output fusion via iteration argument.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D120981
Create the AffineMinOp used to compute the padding width in canonical form and update the tests.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D122311
These attributes were added because of oilist required them earlier. It
no longer requires them and so these attributes can be safely removed
from the operations.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D122289
This patch adds translation from omp.single to LLVM IR.
Depends on D122288
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D122297
Add a baseline implementation of support for local ids for `PWMAFunction::valueAt`. This can be made more efficient later if needed by handling locals with known div representations separately.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122144
In LexSimplex, instead of adding equalities as a pair of inequalities,
add them as a single row, move them into the basis, and keep them there.
There will always be a valid basis involving all non-redundant equalities. Such
equalities will then be ignored in some other operations, such as when looking
for pivot columns. This speeds them up a little bit.
More importantly, this is an important precursor patch to adding support for
symbolic integer lexmin, as this heuristic can sometimes make a big difference there.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122165
This simplifies many places where we just want to do something in a "transient context"
and return some value.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122172
It is often necessary to "rollback" IntegerRelations to an earlier state. Although providing full rollback support is non-trivial, we really only need to support the case where the only changes made are to append ids or append constraints, and then rollback these additions. This patch adds support to rollback in such situations by recording the number of ids and constraints of each kind and providing support to truncate the IntegerRelation to those counts by removing appended ids and constraints. This already simplifies subtraction a little bit and will also be useful in the implementation of symbolic integer lexmin.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122178
This provides a way to create an operation without manipulating
OperationState directly. This is useful for creating unregistered ops.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D120787
Only use this on generic parser for now by not registering any dialect. For flushing out some parser bugs. The textual format is not meant to be load bearing in production runs, but still useful to remove edge cases/failures.
Differential Revision: https://reviews.llvm.org/D122267
Emitting at error at EOF will emit the diagnostic past the end of the file. When emitting an error during parsing at EOF, emit it at the previous character.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D122295
This fixes a bufferization issue with ops that are not supported by the buffer deallocation pass when `allow-return-allocs=0`.
Differential Revision: https://reviews.llvm.org/D122304
This patch adds omp.single according to Section 2.8.2 of OpenMP 5.0.
Also added tests for the same.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D122288
Co-authored-by: Kiran Kumar T P <kirankumar.tp@amd.com>
This is a convenience function for adding new divisions to the Simplex given the numerator and denominator.
This will be needed for symbolic integer lexmin support.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122159
This also replaces the absolute link to the same document with a
relative one in the same file.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D121814
This patch
- adds assembly format for `omp.wsloop` operation
- removes the `parseClauses` clauses as it is not required anymore
This is expected to be the final patch in a series of patches for replacing
parsers for clauses with `oilist`.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D121367
We already have bit hasFolder = 0 in the defination of the class Op, so no need to have another let assignment here.
Differential Revision: https://reviews.llvm.org/D122222
Make MaxSI, MaxUI, MinSI and MinUI commutative, so they will be canonicalized to have its constants appear as the second operand. And the constant folder will match more cases.
Differential Revision: https://reviews.llvm.org/D122225
I'm using "shape" to mean the compile-time object, where zeros indicate sizes which are compile-time dynamic; and using "sizes" to mean the run-time object, where zeros indicate a dimension with no coordinates (hence resulting in trivial storage). Because their semantics differ on zeros, it's important to keep them distinguished. Although we do not define separate C++ types to capture the distinction, we can at least use variable names to do so.
This is (tangential) work towards fixing: https://github.com/llvm/llvm-project/issues/51652
Depends On D122057
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D122058
This is work towards: https://github.com/llvm/llvm-project/issues/51652
This differential doesn't yet make use of the new kSparseToSparse, just introduces it. The differential that finally makes use of them is D122061, which is the final differential in the chain that fixes bug 51652.
Depends On D122054
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D122055
This is work towards: https://github.com/llvm/llvm-project/issues/51652
This differential sets up the options and threads them through everywhere, but doesn't actually use them yet. The differential that finally makes use of them is D122061, which is the final differential in the chain that fixes bug 51652.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D122054
This is the more logical place for the function to live. If/when we factor out a separate class for just the `Coordinates` themselves, then the definition should be moved to `Coordinates::lexOrder` (and `Element::lexOrder` would become a thin wrapper delegating to that function).
This is (tangentially) work towards fixing: https://github.com/llvm/llvm-project/issues/51652
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D122057
Add method to tag classes/defs as deprecated. Previously deprecations
were only verbally communicated and folks didn't have an active warning
while building about impending removal. Add mechanism to tag defs as
deprecated to allow warning users.
This doesn't change any policy, it just moves deprecation warnings from
comments to something more user visible.
Differential Revision: https://reviews.llvm.org/D122164
Previously, an UndoLogEntry was added by addRow but not by addZeroRow. So
calling directly into addZeroRow, as LexSimplex::addCut does, was not an
undoable operation. In the current usage of addCut this could never
lead to an incorrect result, and addZeroRow is protected, so it is not
currently possible to add a regression test for this. This bug needs to be
fixed for the symbolic integer lexmin algorithm.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122162
The revision introduces a affine.min and affine.max canonicalization pattern that orders the result expressions. It flattens the result expressions to arrays of dimension and symbol coefficients plus one constant coefficient and rearranges them in lexicographic order.
Without the pattern, CSE will not eliminate two affine.min / affine.max operation if the results are ordered differently. For example, the operations
```
%1 = affine.min affine_map<(d0) -> (8, -d0 + 27)>(%arg4)
%2 = affine.min affine_map<(d0) -> (-d0 + 27, 8)>(%arg4)
```
doe not CSE. After applying the pattern, the two operations are equivalent
```
%1 = affine.min affine_map<(d0) -> (8, -d0 + 27)>(%arg4)
%2 = affine.min affine_map<(d0) -> (8, -d0 + 27)>(%arg4)
```
which enables CSE.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D121819
This patch attempts to deduce when the oilist element must be printed
based on the optional arguments to it. This especially helps creating
an operation accurately because with the current implementation, the
inferred unit attributes must be manually added to print the clauses
appropriately.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D121579
Computing dropped unit-dims when all the unit dims are dropped, does
not need to check for strides being dropped.
This also enables canonicalization of reduced-rank subviews.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D121766
I am not sure about the meaning of Type in the name (was it meant be interpreted as Kind?), and given the importance and meaning of Type in the context of MLIR, its probably better to rename it. Given the comment in the source code, the suggestion in the GitHub issue and the final discussions in the review, this patch renames the OperandType to UnresolvedOperand.
Fixes https://github.com/llvm/llvm-project/issues/54446
Differential Revision: https://reviews.llvm.org/D122142
The current nested if merging has a bug. Specifically, consider the following code:
```
%r = scf.if %arg3 -> (i32) {
scf.if %arg1 {
"test.op"() : () -> ()
}
scf.yield %arg0 : i32
} else {
scf.yield %arg2 : i32
}
```
When the above gets merged, it will become:
```
%r = scf.if %arg3 && %arg1-> (i32) {
"test.op"() : () -> ()
scf.yield %arg0 : i32
} else {
scf.yield %arg2 : i32
}
```
However, this means that when only %arg3 is true, we will incorrectly return %arg2 instead
of %arg0. This change updates the behavior of the pass to only enable nested if merging where
the outer yield contains only values from the inner if, or values defined outside of the if.
In the case of the latter, they can turned into a select of only the outer if condition, thus
maintaining correctness.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122108
When the current implementation merges two blocks that have operands defined outside of their block respectively, it will merge these by adding a block argument in the resulting merged block and adding successor arguments to the predecessors.
There is a special case where this is incorrect however: If one of predecessors terminator produce the operand, inserting the block argument and updating the predecessor would lead to the terminator using its own result as successor argument.
IR Example:
```
%0 = "test.producing_br"()[^bb1, ^bb2] {
operand_segment_sizes = dense<0> : vector<2 x i32>
} : () -> i32
^bb1:
"test.br"(%0)[^bb4] : (i32) -> ()
```
where `^bb1` is then merged with another block would lead to:
```
%0 = "test.producing_br"(%0)[^bb1, ^bb2]
```
This patch fixes that issue during clustering by making sure that if the operand is from an outside block, that it is not produced by the terminator of a predecessor.
Differential Revision: https://reviews.llvm.org/D121988
This patch adds translation from `omp.atomic.capture` to LLVM IR. Also
added tests for the same.
Depends on D121546
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121554
This patch fixes the condition for emitting atomic update using
`atomicrmw` instruction or compare-exchange loop.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121546
This patch improves the representation size of individual
`IntegerRelation`s by calling the function
`IntegerRelation::removeRedundantConstraints`. While this is only a
slight optimization in the current version, it will be necessary for
patches to come.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121989
This option tells CMake to add current source and binary
directories to the include path for each directory[1].
Required include directories from build tree (for generated
files) were previously added in `mlir_tablegen` but this was
changed in 03078ec20b .
These are still needed, however, for out-of-tree builds
that don't build as part of LLVM (via LLVM_ENABLE_PROJECTS).
Building as part of LLVM works regardless, AFAICT,
because LLVM sets this option and so the MLIR build inherits it.
FWIW, various other (in-tree) LLVM projects set this as well.
And of course this fixes the out-of-tree
mlir-by-itself build scenario I'm using.
[1] https://cmake.org/cmake/help/latest/variable/CMAKE_INCLUDE_CURRENT_DIR.html
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D122088
This support has never really worked well, and is incredibly clunky to
use (it effectively creates two argument APIs), and clunky to generate (it isn't
clear how we should actually expose this from PDL frontends). Treating these
as just attribute arguments is much much cleaner in every aspect of the stack.
If we need to optimize lots of constant parameters, it would be better to
investigate internal representation optimizations (e.g. batch attribute creation),
that do not affect the user (we want a clean external API).
Differential Revision: https://reviews.llvm.org/D121569
This commit adds signature support to the language server,
and initially supports providing help for: operation operands and results,
and constraint/rewrite calls.
Differential Revision: https://reviews.llvm.org/D121545
This commit adds code completion support to the language server,
and initially supports providing completions for: Member access,
attributes/constraint/dialect/operation names, and pattern metadata.
Differential Revision: https://reviews.llvm.org/D121544
This adds support for documenting the top-level "symbols",
e.g. patterns, constraints, rewrites, etc., within a PDLL file.
Differential Revision: https://reviews.llvm.org/D121543
This adds support for providing information when hovering over
operation names, variables, patters, constraints, and rewrites.
Differential Revision: https://reviews.llvm.org/D121542
This commits adds a basic language server for PDLL to enable providing
language features in IDEs such as VSCode. This initial commit only
adds support for tracking definitions, references, and diagnostics, but
followup commits will build upon this to provide more significant behavior.
In addition to the server, this commit also updates mlir-vscode to support
the PDLL language and invoke the server.
Differential Revision: https://reviews.llvm.org/D121541
This patch slightly updates the behavior of scf.if->select to
place any hoisted select statements prior to the remaining scf.if body.
This allows better composition with other canonicalization passes, such as
scf.if nested merging.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122027
ExpandShapeOp builder cannot infer the result type since it doesn't know
how the dimension needs to be split. Remove this builder so that it
doesn't get used accidently. Also remove one potential path using it in
generic fusion.
Differential Revision: https://reviews.llvm.org/D122019
If a stack allocation is within a nested allocation scope
don't count that as an allocation of the outer allocation scope
that would prevent inlining.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121981
This patch extends the existing combine nested if
combination canonicalization to also handle ifs which
yield values
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121923
Previously, the canonicalizer to create ifs from selects would only work
if the if did not have a body other than yielding. This patch upgrade the functionality
to be able to create selects from any if result whose operands are not defined
within the body.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121943
This patch refactors the current coalesce implementation. It introduces
the `SetCoalescer`, a class in which all coalescing functionality lives.
The main advantage over the old design is the fact that the vectors of
constraints do not have to be passed around, but are implemented as
private fields of the SetCoalescer. This will become especially
important once more inequality types are introduced.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121364
This reverts commit dad80e9710.
The build is broken with some configurations (gcc-5 and gcc-8):
mlir/lib/Analysis/Presburger/PresburgerRelation.cpp:402:32: error: qualified name does not name a class before '{' token
class presburger::SetCoalescer {
This patch refactors the current coalesce implementation. It introduces
the `SetCoalescer`, a class in which all coalescing functionality lives.
The main advantage over the old design is the fact that the vectors of
constraints do not have to be passed around, but are implemented as
private fields of the SetCoalescer. This will become especially
important once more inequality types are introduced.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121364
Fold affine.load ops on global constant memrefs when indices are all
constant.
Reviewed By: ayzhuang
Differential Revision: https://reviews.llvm.org/D120612
When the sparse_tensor dialect lowers linalg.generic,
it makes inferences about how the operations should
affect the looping logic. For example, multiplication
is an intersection while addition is a union of two
sparse tensors.
The new binary and unary op separate the looping logic
from the computation by nesting the computation code
inside a block which is merged at the appropriate level
in the lowered looping code.
The binary op can have custom computation code for the
overlap, left, and right sparse overlap regions. The
unary op can have custom computation code for the
present and absent values.
Reviewed by: aartbik
Differential Revision: https://reviews.llvm.org/D121018
Fold away empty loops that iterate at least once and only return
values defined outside of the loop.
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D121148
This allows for sharing the implementation of key components across multiple
MLIR language servers. These will be used in a followup to help implement
a PDLL language server.
Differential Revision: https://reviews.llvm.org/D121540
This commit adds detailed documentation for PDLL, its language design, and
captures a bit of the rationale. This document captures everything in-tree at present,
and is intended to be an all encompassing manual for interacting with and understanding
PDLL.
Differential Revision: https://reviews.llvm.org/D119903
The current dialect registry allows for attaching delayed interfaces, that are added to attrs/dialects/ops/etc.
when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a
separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of
dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration
callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly
on the loaded constructs, and also allows for loading new dependent dialects. The latter of which is
extremely useful as it will now enable dependent dialects to only apply in the contexts in which they
are necessary. For example, a dialect dependency can now be conditional on if a user actually needs the
interface that relies on it.
Differential Revision: https://reviews.llvm.org/D120367
This PR exposes the region-based isTopLevelValue,
which is useful for other code that performs Affine transformations,
but is not within AffineOps.cpp
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D121877
The getAffineScope function is currently internal
to AffineOps.cpp. However, as the comment on the function
itself notes, this is useful in a variety of other places
externally. This PR allows other files to use the function.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D121827
This removes any potential confusion with the `getType` accessors
which correspond to SSA results of an operation, and makes it
clear what the intent is (i.e. to represent the type of the function).
Differential Revision: https://reviews.llvm.org/D121762
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.
Differential Revision: https://reviews.llvm.org/D121266
Add basic C API for the ControlFlow dialect. Follows the format of the other dialects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D121867
The current decision of when to run the verifier is running on the
assumption that nested passes can't affect the validity of the parent
operation, which isn't true. Parent operations may attach any number
of constraints on nested operations, which may not necessarily be
captured (or shouldn't be captured) at a smaller granularity.
This commit rectifies this by properly running the verifier after an
OpToOpAdaptor pass. To avoid an explosive increase in compile time,
we only run verification on the parent operation itself. To do this, a
flag to mlir::verify is added to avoid recursive verification if it isn't
desired.
Fixes#54288
Differential Revision: https://reviews.llvm.org/D121836
dense<...> expects ... to be a tensor-literal.
Define this in the grammar in BuiltinAttributes.td,
and reflect this in the reference grammar written in
AttributeParser.cpp.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D121048
This removes a restriction wrt. scf.for loops during One-Shot Bufferization. Such IR was previously rejected. It is still rejected by default because the bufferized IR could be slow. But such IR can now be bufferized with `allow-return-allocs`.
Differential Revision: https://reviews.llvm.org/D121529
New buffer allocations can now be returned/yielded from blocks with `allow-return-allocs`. One-Shot Bufferize deallocates all buffers at the end of the block. If this is not possible (because the buffer escapes the block), this is now done by the existing BufferDeallocation pass.
Differential Revision: https://reviews.llvm.org/D121527
* Implement RegionBranchOpInterface: The op has a region, but it is conceptually not entered. The region just describes the semantics of the (monolithic) op.
* Linalg structured ops do not allocate memory.
Differential Revision: https://reviews.llvm.org/D121798
Such IR is rejected by default, but can be allowed with `allow-return-memref`. In preparation of future refactorings, do not deallocate such buffers.
One-Shot Analysis now gathers information about yielded tensors, so that we know during the actual bufferization whether a newly allocated buffer should be deallocated again. (Otherwise, it will leak. This will be addressed in a subsequent commit that also makes `allow-return-memref` a non-experimental flag.)
As a cleanup, `allow-return-memref` is now part of OneShotBufferizationOptions. (It was previously ignored by AlwaysCopyBufferizationState.) Moreover, AlwaysCopyBufferizationState now asserts that `create-deallocs` is deactivated to prevent surprising behavior.
Differential Revision: https://reviews.llvm.org/D121521
PyTACO DSL doesn't support the use of index values as in A[i] = B[i]+ i.
We extend the DSL to support such a use in MLIR-PyTACO.
Remove an obsolete unit test. Add unit tests and PyTACO tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121716
FuncOp is being moved out of the builtin dialect, and defining a custom
toy operation showcases various aspects of defining function-like operation
(e.g. inlining, passes, etc.).
Differential Revision: https://reviews.llvm.org/D121264
Defining our own function operation allows for the PDL interpreter
to be more self contained, and also removes any dependency on FuncOp;
which is moving out of the Builtin dialect.
Differential Revision: https://reviews.llvm.org/D121253
During MLIR translation to LLVMIR if an inlineable call has an UnkownLoc we get this error message:
```
inlinable function call in a function with debug info must have a !dbg location
call void @callee()
```
There is code that checks for this case and strips debug information to avoid this situation. I'm expanding this code to handle the case where an debug location points at a UnknownLoc. For example, a NamedLoc whose child location is an UnknownLoc.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121633
Fold linalg.fill into linalg.generic.
Remove dead arguments used in linalg.generic.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D121535
Define IndexExpr before IndexVar. This is to prepare for the next change
to support the use of index values in tensor expressions.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121649
When using `--convert-func-to-llvm=emit-c-wrappers` the attribute arguments of the wrapper would not be created correctly in some cases.
This patch fixes that and introduces a set of tests for (hopefully) all corner cases.
See https://github.com/llvm/llvm-project/issues/53503
Author: Sam Carroll <sam.carroll@lmns.com>
Co-Author: Laszlo Kindrat <laszlo.kindrat@lmns.com>
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119895
Patch adds a new operation for the SIMD construct. The op is designed to be very similar to the existing `wsloop` operation, so that the `CanonicalLoopInfo` of `OpenMPIRBuilder` can be used.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D118065
This improves the modularity of the bufferization.
From now on, all ops that do not implement BufferizableOpInterface are considered hoisting barriers. Previously, all ops that do not implement the interface were not considered barriers and such ops had to be marked as barriers explicitly. This was unsafe because we could've hoisted across unknown ops where it was not safe to hoist.
As a side effect, this allows for cleaning up AffineBufferizableOpInterfaceImpl. This build unit no longer needed and can be deleted.
Differential Revision: https://reviews.llvm.org/D121519
Also add a TODO to switch to a custom walk instead of the GreedyPatternRewriter, which should be more efficient. (The bufferization pattern is guaranteed to apply only a single time for every op, so a simple walk should suffice.)
We currently specify a top-to-bottom walk order. This is important because other walk orders could introduce additional casts and/or buffer copies. These canonicalize away again, but it is more efficient to never generate them in the first place.
Note: A few of these canonicalizations are not yet implemented.
Differential Revision: https://reviews.llvm.org/D121518
There is currently an awkwardly complex set of rules for how a
parser/printer is generated for AttrDef/TypeDef. It can change depending on if a
mnemonic was specified, if there are parameters, if using the assemblyFormat, if
individual parser/printer code blocks were specified, etc. This commit refactors
this to make what the attribute/type wants more explicit, and to better align
with how formats are specified for operations.
Firstly, the parser/printer code blocks are removed in favor of a
`hasCustomAssemblyFormat` bit field. This aligns with the operation format
specification (and is nice to remove code blocks from ODS).
This commit also adds a requirement to explicitly set `assemblyFormat` or
`hasCustomAssemblyFormat` when the mnemonic is set and the attr/type
has no parameters. This removes the weird implicit matrix of behavior,
and also encourages the author to make a conscious choice of either C++
or declarative format instead of implicitly opting them into the C++
format (we should be pushing towards declarative when possible).
Differential Revision: https://reviews.llvm.org/D121505
The current documentation is super old, crusty, and at times wrong. This commit
rewrites the documentation to focus on the TableGen declarative definition,
expounds on various components, and moves the doc out of Tutorials/ and into
a new top level `AttributesAndTypes.md` doc. As part of this, the AttrDef/TypeDef
documentation in OpDefinitions.md is removed.
Differential Revision: https://reviews.llvm.org/D120011
OpBase.td has formed into a huge monolith of all ODS constructs. This
commits starts to rectify that by splitting out some constructs to their
own .td files.
Differential Revision: https://reviews.llvm.org/D118636
This patch adds support for custom directives in attribute and type formats. Custom directives dispatch calls to user-defined parser and printer functions.
For example, the assembly format "custom<Foo>($foo, ref($bar))" expects a function with the signature
```
LogicalResult parseFoo(AsmParser &parser, FailureOr<FooT> &foo, BarT bar);
void printFoo(AsmPrinter &printer, FooT foo, BarT bar);
```
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120944
Implement the vectorLoopUnroll interface for MultiDimReduceOp and add a
pattern to do the unrolling following the same interface other vector
unroll patterns.
Differential Revision: https://reviews.llvm.org/D121263
The revision removes the linalg.fill operation and renames the OpDSL generated linalg.fill_tensor operation to replace it. After the change, all named structured operations are defined via OpDSL and there are no handwritten operations left.
A side-effect of the change is that the pretty printed form changes from:
```
%1 = linalg.fill(%cst, %0) : f32, tensor<?x?xf32> -> tensor<?x?xf32>
```
changes to
```
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<?x?xf32>) -> tensor<?x?xf32>
```
Additionally, the builder signature now takes input and output value ranges as it is the case for all other OpDSL operations:
```
rewriter.create<linalg::FillOp>(loc, val, output)
```
changes to
```
rewriter.create<linalg::FillOp>(loc, ValueRange{val}, ValueRange{output})
```
All other changes remain minimal. In particular, the canonicalization patterns are the same and the `value()`, `output()`, and `result()` methods are now implemented by the FillOpInterface.
Depends On D120726
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120728
Introduce an explicit `replaceOp` call to enable the tracking of the producer LinalgOp.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D121369
This is present since the beginning, but does not seem needed by any
in-tree target right now. This seems like the kind of thing to populate
by the caller if needed.
Differential Revision: https://reviews.llvm.org/D121565
This patch adds supports for union of relations (PresburgerRelation). Along
with this, support for PresburgerSet is also maintained.
This patch is part of a series of patches to add support for relations in
Presburger library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121417
Current generated Python binding for the SCF dialect does not allow
users to call IfOp to create if-else branches on their own.
This PR sets up the default binding generation for scf.if operation
to address this problem.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121076
mlir-translate and related tools currently have a fixed set
of flags that are built into Translation.cpp. This works for
simple cases, but some clients want to change the default
globally (e.g. default to allowing unregistered dialects
without a command line flag), or support dialect-independent
translations without having those translations register every
conceivable dialect they could be used with (breaking
modularity).
This approach could also be applied to mlirOptMain to reduce
the significant number of flags it has accumulated.
Differential Revision: https://reviews.llvm.org/D120970
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.
Differential Revision: https://reviews.llvm.org/D119918
* It doesn't required by OpenCL/Intel Level Zero and can be set programmatically.
* Add GPU to spirv lowering in case when attribute is not present.
* Set higher benefit to WorkGroupSizeConversion pattern so it will always try to lower first from the attribute.
Differential Revision: https://reviews.llvm.org/D120399
Early adoption of new technologies or adjusting certain code generation/IR optimization thresholds
is often available through some cl::opt options (which have unstable surfaces).
Specifying such an option twice will lead to an error.
```
% clang -c a.c -mllvm -disable-binop-extract-shuffle -mllvm -disable-binop-extract-shuffle
clang (LLVM option parsing): for the --disable-binop-extract-shuffle option: may only occur zero or one times!
% clang -c a.c -mllvm -hwasan-instrument-reads=0 -mllvm -hwasan-instrument-reads=0
clang (LLVM option parsing): for the --hwasan-instrument-reads option: may only occur zero or one times!
% clang -c a.c -mllvm --scalar-evolution-max-arith-depth=32 -mllvm --scalar-evolution-max-arith-depth=16
clang (LLVM option parsing): for the --scalar-evolution-max-arith-depth option: may only occur zero or one times!
```
The option is specified twice, because there is sometimes a global setting and
a specific file or project may need to override (or duplicately specify) the
value.
The error is contrary to the common practice of getopt/getopt_long command line
utilities that let the last option win and the `getLastArg` behavior used by
Clang driver options. I have seen such errors for several times. I think the
error just makes users inconvenient, while providing very little value on
discouraging production usage of unstable surfaces (this goal is itself
controversial, because developers might not want to commit to a stable surface
too early, or there is just some subtle codegen toggle which is infeasible to
have a driver option). Therefore, I suggest we drop the diagnostic, at least
before the diagnostic gets sufficiently better support for the overridding needs.
Removing the error is a degraded error checking experience. I think this error
checking behavior, if desirable, should be enabled explicitly by tools. Users
preferring the behavior can figure out a way to do so.
Reviewed By: jhenderson, rnk
Differential Revision: https://reviews.llvm.org/D120455
Add operations -, abs, ceil and floor to the index notation.
Add test cases.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121388
In this CL, update the function name of verifier according to the
behavior. If a verifier needs to access the region then it'll be updated
to `verifyRegions`.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120373
This patch removes an old recursive implementation to lower vector.transpose to extract/insert operations
and replaces it with a iterative approach that leverages newer linearization/delinearization utilities.
The patch should be NFC except by the order in which the extract/insert ops are generated.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D121321
Add operations abs, ceil, floor, and neg to the C++ API and Python API.
Add test cases.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D121339
This patch remove `spaceKind` from PresburgerSpace, making PresburgerSpace only
a space supporting relations.
Sets are still implemented in the same way, i.e. with a zero domain but instead
the asserts to check if the space is still set are added to users of
PresburgerSpace which treat it as a Set space.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121357
The enableObjectCache option was added in
https://reviews.llvm.org/rG06e8101034e, defaulting to false. However,
the init code added there got its logic reversed
(cache(enableObjectCache ? nullptr : new SimpleObjectCache()), which was
fixed in https://reviews.llvm.org/rGd1186fcb04 by setting the default to
true, thereby preserving the existing behavior even if it was
unintentional.
Default now the object cache to false as it was originally intended.
While at it, mention in enableObjectCache's documentation how the
cache can be dumped.
Reviewed-by: mehdi_amini
Differential Revision: https://reviews.llvm.org/D121291
This patch moves the testcases from
`mlir/test/Target/LLVMIR/openmp-llvm-bad-schedule-modifier.mlir` to
`mlir/test/Dialect/OpenMP/invalid.mlir` as they test the verifier
(not the translation to LLVM IR).
Reviewed By: NimishMishra
Differential Revision: https://reviews.llvm.org/D120877
On Windows (at least), cmake ignores Python3_EXECUTABLE unless the
'Interpreter' component is being found. If the user is specifying a
different version than the latest installed (say, 3.8 vs 3.9) with the
Python3_EXECUTABLE, cmake was using a combination of the newest version
and the desired version. Mitigated by adding 'Interpreter' in the first
invocation like the second one.
This patch adds lowering from omp.atomic.update to LLVM IR. Whenever a
special LLVM IR instruction is available for the operation, `atomicrmw`
instruction is emitted, otherwise a compare-exchange loop based update
is emitted.
Depends on D119522
Reviewed By: ftynse, peixin
Differential Revision: https://reviews.llvm.org/D119657
When `addCoalescedPolyhedron` was called with `j == n - 1`,
the `polyhedrons`-vector was not properly updated (the
`IntegerPolyhedron` at position `n - 2` was "lost"). This patch adds
special handling to that case and a regression testcase.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D121356
This patch moves PresburgerSpace::removeIdRange(idStart, idLimit) to
PresburgerSpace::removeIdRange(kind, idStart, idLimit), i.e. identifiers
can only be removed at once for a single kind.
This makes users of PresburgerSpace to not assume any inside ordering of
identifier kinds.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121079
NFC. Clean up memref utils library. This library had a single function
that was completely misplaced. MemRefUtils is expected to be (also per
its comment) a library providing analysis/transforms utilities on memref
dialect ops or memref types. However, in reality it had a helper that
was depended upon by the MemRef dialect, i.e., it was a helper for the
dialect ops library and couldn't contain anything that itself depends on
the MemRef dialect. Move the single method to the memref dialect that
will now allow actual utilities depending on the memref dialect to be
placed in it.
Put findDealloc in the `memref` namespace. This is a pure move.
Differential Revision: https://reviews.llvm.org/D121273
ValueShapeRange::getShape() returns ShapeAdaptor rather than ShapedType
and ShapeAdaptor allows implicit conversion to bool. It ends up that
ShapedTypeComponents can be constructed with ShapeAdaptor incorrectly.
The reason is that the type trait
std::is_constructible<ShapeStorageT, Arg>::value
is fulfilled because ShapeAdaptor can be converted to bool and it can be
used to construct ShapeStorageT. In the end, we won't give any warning
or error message when doing things like
inferredReturnShapes.emplace_back(valueShapeRange.getShape(0));
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D120845
Currently when we fold an empty loop, we assume that any loop
with iterArgs returns its iterArgs in order, which is not always
the case. It may return values defined outside of the loop or
return its iterArgs out of order. This patch adds support to
those cases.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D120776
This revision adds support for the linalg.index to the sparse compiler
pipeline. In essence, this adds the ability to refer to indices in
the tensor index expression, as illustrated below:
Y[i, j, k, l, m] = T[i, j, k, l, m] * i * j
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D121251
BuiltinOps.h
These includes are going to be removed from BuiltinOps.h in a followup
when FuncOp is moved out of the Builtin dialect. This commit
pre-emptively adds those includes to simplify the patch moving FuncOp.
It's fine to use any integer (vector) values regardless of the
signedness. The opcode decides how to interpret the bits.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D121238
This feels like a layering violation, but it fixes the build.
Fixes#54242
tools/mlir/lib/Dialect/GPU/CMakeFiles/obj.MLIRGPUTransforms.dir/Transforms/SerializeToHsaco.cpp.o:SerializeToHsaco.cpp:function (anonymous namespace)::SerializeToHsacoPass::optimizeLlvm(llvm::Module&, llvm::TargetMachine&):
error: undefined reference to 'mlir::makeOptimizingTransformer(unsigned int, unsigned int, llvm::TargetMachine*)'
This pass doesn't rely on any specific characteristics of FuncOp, and
can just be a generic operation pass.
Differential Revision: https://reviews.llvm.org/D121193
It is currently a module pass, but shouldn't be. All of the patterns
are local conversions, and don't require anything about
functions/modules.
Differential Revision: https://reviews.llvm.org/D121192
These passes generally don't rely on any special aspects of FuncOp, and moving allows
for these passes to be used in many more situations. The passes that obviously weren't
relying on invariants guaranteed by a "function" were updated to be generic pass, the
rest were updated to be FunctionOpinterface InterfacePasses.
The test updates are NFC switching from implicit nesting (-pass -pass2) form to
the -pass-pipeline form (generic passes do not implicitly nest as op-specific passes do).
Differential Revision: https://reviews.llvm.org/D121190
FuncOp isn't really important to hardcode here, it is only used to act
as a root operation for the transformation.
Differential Revision: https://reviews.llvm.org/D121195
A lot of test passes are currently anchored on FuncOp, but this
dependency
is generally just historical. A majority of these test passes can run on
any operation, or can operate on a specific interface
(FunctionOpInterface/SymbolOpInterface).
This allows for greatly reducing the API dependency on FuncOp, which
is slated to be moved out of the Builtin dialect.
Differential Revision: https://reviews.llvm.org/D121191
Commit rG1a2bb03edab9d7aa31beb587d0c863acc6715d27 introduced a pattern
to convert dynamic dimensions in operands of `GenericOp`s to static
values based on indexing maps and shapes of other operands. The logic
is directly usable to any `LinalgOp`. Move that pattern as an
`OpInterfaceRewritePattern`.
Differential Revision: https://reviews.llvm.org/D120968
This is a pass that can be used by downstream consumers directly
to avoid the boilerplate to wrap around the `populate*Patterns`.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D121222
A `tensor.cast` consumer can be folded with its producer. This is
beneficial only if the result of the tensor cast is more static than
the source. This patch adds a utility function to check that this is
the case, and adds a couple of canonicalizations patterns that fold an
operation with `tensor.cast` conusmers.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D120950
It's valid to create a TypedArrayAttr or MixedContainerType with
nullptr, e.g.,
std::vector<mlir::Attribute> attrs = {mlir::StringAttr()};
builder.createArrayAttr(attrs);
The predicate didn't check if it's a nullptr and it ended up a crash in
the attribute static verifier. We always check if an attribute is null
so it's better to align the check for these two container type attr.
Reviewed By: rdzhabarov
Differential Revision: https://reviews.llvm.org/D121178
With the recent improvements to OpDSL it is cheap to reintroduce a linalg.copy operation.
This operation is needed in at least 2 cases:
1. for copies that may want to change the elemental type (e.g. cast, truncate, quantize, etc)
2. to specify new tensors that should bufferize to a copy operation. The linalg.generic form
always folds away which is not always the right call.
Differential Revision: https://reviews.llvm.org/D121230
Allow pointwise operations to take rank zero input tensors similarly to scalar inputs. Use an empty indexing map to broadcast rank zero tensors to the iteration domain of the operation.
Depends On D120734
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120807
The revision removes the SoftPlus2DOp operation that previously served as a test operation. It has been replaced by the elemwise_unary operation, which is now used to test unary log and exp functions.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120794
Simplify tests that use `linalg.fill_rng_2d` to focus on testing the `const` and `index` functions. Additionally, cleanup emit_misc.py to use simpler test functions and fix an error message in config.py.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120734
Extend OpDSL with a `defines` method that can set the `hasCanonicalizer` flag for an OpDSL operation. If the flag is set via `defines(Canonicalizer)` the operation needs to implement the `getCanonicalizationPatterns` method. The revision specifies the flag for linalg.fill_tensor and adds an empty `FillTensorOp::getCanonicalizationPatterns` implementation.
This revision is a preparation step to replace linalg.fill by its OpDSL counterpart linalg.fill_tensor. The two are only functionally equivalent if both specify the same canonicalization patterns. The revision is thus a prerequisite for the linalg.fill replacement.
Depends On D120725
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120726
Enhance `LinalgTileAndFuseTensorOpsPattern` with an additional rewrite signature that returns the result of the rewrite.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D121212
Add a FillOpInterface similar to the contraction and convolution op interfaces. The FillOpInterface is a preparation step to replace linalg.fill by its OpDSL version linalg.fill_tensor. The interface implements the `value()`, `output()`, and `result()` methods that by default are not available on linalg.fill_tensor.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120725
Currently, the transfer mask is materialized by generating the vector
comparison: [offset + 0, .., offset + length - 1] < [dim, .., dim]
A better alternative is to materialize the transfer mask by using the
operation: `vector.create_mask (dim - offset)`, which will generate
simpler code and compose better with scalable vectors.
Differential Revision: https://reviews.llvm.org/D120487
Rewrite isInnermostAffineForOp utility to make it more direct/efficient.
Drop unnecessary check. NFC.
Differential Revision: https://reviews.llvm.org/D121170
This is to align with the PyTACO API better.
Modify an existing unit test to test the new routines.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121083
This patch cleans up the interface to PresburgerSet. At a high level it does
the following changes:
- Move member functions around to have constructors at top and print/dump
at end.
- Move a private function to be a static function instead.
- Change member functions of type "getAllIntegerPolyhedron" to "getAllPolys"
instead.
- Improve documentation for PresburgerSet.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121027
In quantized comutation, there are casting ops around computation ops.
Reorder the ops to make reduce-to-contract actually work.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D120760
The current StandardToLLVM conversion patterns only really handle
the Func dialect. The pass itself adds patterns for Arithmetic/CFToLLVM, but
those should be/will be split out in a followup. This commit focuses solely
on being an NFC rename.
Aside from the directory change, the pattern and pass creation API have been renamed:
* populateStdToLLVMFuncOpConversionPattern -> populateFuncToLLVMFuncOpConversionPattern
* populateStdToLLVMConversionPatterns -> populateFuncToLLVMConversionPatterns
* createLowerToLLVMPass -> createConvertFuncToLLVMPass
Differential Revision: https://reviews.llvm.org/D120778
These unit tests resides in an internal repository. Porting the tests to the
public repository.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121021
The default lowering of vector transpose operations generates a large sequence of
scalar extract/insert operations, one pair for each scalar element in the input tensor.
In other words, the vector transpose is scalarized. However, there are transpose
patterns where one or more adjacent high-order dimensions are not transposed (for
example, in the transpose pattern [1, 0, 2, 3], dimensions 2 and 3 are not transposed).
This patch improves the lowering of those cases by not scalarizing them and extracting/
inserting a full n-D vector, where 'n' is the number of adjacent high-order dimensions
not being transposed. By doing so, we prevent the scalarization of the code and generate a
more performant vector version.
Paradoxically, this patch shouldn't improve the performance of transpose operations if
we are using LLVM. The LLVM pipeline is able to optimize away some of the extract/insert
operations and the SLP vectorizer is converting the scalar operations back to its vector
form. However, scalarizing a vector version of the code in MLIR and relying on the SLP
vectorizer to reconstruct the vector code again is highly undesirable for several reasons.
Reviewed By: nicolasvasilache, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D120601
This patch fixes the crash when printing some ops (like affine.for and
scf.for) when they are dumped in invalid state, e.g. during pattern
application. Now the AsmState constructor verifies the operation
first and switches to generic operation printing when the verification
fails. Also operations are now printed in generic form when emitting
diagnostics and the severity level is Error.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117834
Translation.h is currently awkwardly shoved into the top-level mlir, even though it is
specific to the mlir-translate tool. This commit moves it to a new Tools/mlir-translate
directory, which is intended for libraries used to implement tools. It also splits the
translate registry from the main entry point, to more closely mirror what mlir-opt
does.
Differential Revision: https://reviews.llvm.org/D121026
MlirOptMain is currently awkwardly shoved into mlir/Support. This commit
moves it to the Tools/ directory, which is intended for libraries used to
implement tools.
Differential Revision: https://reviews.llvm.org/D121025
There is no reason for this file to be at the top-level, and
its current placement predates the Parser/ folder's existence.
Differential Revision: https://reviews.llvm.org/D121024
Mark `parseSourceFile()` deprecated. The functions will be removed two weeks after landing this change.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D121075
The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested openmp parallel regions (writing with MLIR for ease)
```
omp.parallel {
%a = ...
omp.parallel {
use(%a)
}
}
```
As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this
allocation to the inner parallel. For example, we would want to get something like the following:
```
omp.parallel {
%a = ...
%tmp = alloc
store %tmp[] = %a
kmpc_fork(outlined, %tmp)
}
```
However, in practice, this is not what currently occurs in the context of nested parallel regions. Specifically to the OpenMPIRBuilder,
the entirety of the function (at the LLVM level) is currently inlined with blocks marking the corresponding start and end of each
region.
```
entry:
...
parallel1:
%a = ...
...
parallel2:
use(%a)
...
endparallel2:
...
endparallel1:
...
```
When the allocation is inserted, it presently inserted into the parent of the entire function (e.g. entry) rather than the parent
allocation scope to the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1.
This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example.
This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D121061
This commit adds a new hook Pass `bool canScheduleOn(RegisteredOperationName)` that
indicates if the given pass can be scheduled on operations of the given type. This makes it
easier to define constraints on generic passes without a) adding conditional checks to
the beginning of the `runOnOperation`, or b) defining a new pass type that forwards
from `runOnOperation` (after checking the invariants) to a new hook. This new hook is
used to implement an `InterfacePass` pass class, that represents a generic pass that
runs on operations of the given interface type.
The PassManager will also verify that passes added to a pass manager can actually be
scheduled on that pass manager, meaning that we will properly error when an Interface
is scheduled on an operation that doesn't actually implement that interface.
Differential Revision: https://reviews.llvm.org/D120791
RegionBranchOpInterface and BranchOpInterface are allowed to make implicit type conversions along control-flow edges. In effect, this adds an interface method, `areTypesCompatible`, to both interfaces, which should return whether the types of corresponding successor operands and block arguments are compatible. Users of the interfaces, here on forth, must be aware that types may mismatch, although current users (in MLIR core), are not affected by this change. By default, type equality is used.
`async.execute` already has unequal types along control-flow edges (`!async.value<f32>` vs. `f32`), but it opted out of calling `RegionBranchOpInterface::verifyTypes` in its verifier. That method has now been removed and `RegionBranchOpInterface` will verify types along control edges by default in its verifier.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120790
Such initializer functions can be enqueued in `BufferizationOptions`. They can be used to set up dialect-specific bufferization state.
Differential Revision: https://reviews.llvm.org/D120985
This clarifies that these methods only work in append mode, not for general insertions. This is a prospective change towards https://github.com/llvm/llvm-project/issues/51652 which also performs random-access insertions, so we want to avoid confusion.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120929
This patch extends the existing if combining canonicalization to also handle the case where a value returned by the first if is used within the body of the second if.
This patch also extends if combining to support if's whose conditions are logical negations of each other.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120924
This patch makes coalesce skip the comparison of all pairs of IntegerPolyhedrons with LocalIds rather than crash. The heuristics to handle these cases will be upstreamed later on.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120995
We can simplify an extractvalue of an insertvalue to extract out of the base of the insertvalue, if the insert and extract are at distinct and non-prefix'd indices
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120915
This patch introduces the cut case. If one polytope has only cutting and
redundant inequalities for the other and the facet of the cutting
inequalities are contained within the other polytope, then the polytopes are
be combined into a polytope consisting only of their respective
redundant constraints.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120614
Extend isLoopMemoryParallel check to include locally allocated memrefs.
This strengthens and also speeds up the dependence check used by the
utility by excluding locally allocated memrefs where appropriate.
Additional memref dialect ops can be supported exhaustively via proper
interfaces.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D120617
An OpenMP wsloop is simply a regular for loop with the bounds determined by the thread number, and the same justification to allow this for scf.for works for omp.wsloop.
An OpenMP parallel is a parallel for, per thread. Similarly the same justification for scf.parallel having recursive side effects applies here.
In both cases the general justification is that the ops themselves don't have side effects (besides inaccessible runtime-specific memory) and thus the side effects are simply that of the contained ops.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D120853