This patch changes the (not recommended) static registration API from:
static PassRegistration<MyPass> reg("my-pass", "My Pass Description.");
to:
static PassRegistration<MyPass> reg;
And the explicit registration from:
void registerPass("my-pass", "My Pass Description.",
[] { return createMyPass(); });
To:
void registerPass([] { return createMyPass(); });
It is expected that Pass implementations overrides the getArgument() method
instead. This will ensure that pipeline description can be printed and parsed
back.
Differential Revision: https://reviews.llvm.org/D104421
This allows for dialects to do different post-processing depending on operations with the inliner (my use case requires different attribute propagation rules depending on call op). This hook runs before the regular processInlinedBlocks method.
Differential Revision: https://reviews.llvm.org/D104399
This is a very careful start with alllowing sparse tensors at the
left-hand-side of tensor index expressions (viz. sparse output).
Note that there is a subtle difference between non-annotated tensors
(dense, remain n-dim, handled by classic bufferization) and all-dense
annotated "sparse" tensors (linearized to 1-dim without overhead
storage, bufferized by sparse compiler, backed by runtime support library).
This revision gently introduces some new IR to facilitate annotated outputs,
to be generalized to truly sparse tensors in the future.
Reviewed By: gussmith23, bixia
Differential Revision: https://reviews.llvm.org/D104074
The index cast operation accepts vector types. Implement its lowering in this patch.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D104280
It may be desirable to provide an interface implementation for an attribute or
a type without modifying the definition of said attribute or type. Notably,
this allows to implement interfaces for attributes and types outside of the
dialect that defines them and, in particular, provide interfaces for built-in
types. Provide the mechanism to do so.
Currently, separable registration requires the attribute or type to have been
registered with the context, i.e. for the dialect containing the attribute or
type to be loaded. This can be relaxed in the future using a mechanism similar
to delayed dialect interface registration.
See https://llvm.discourse.group/t/rfc-separable-attribute-type-interfaces/3637
Depends On D104233
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D104234
Interface patterns are unique in that they get added to every operation that also implements that interface, given that they aren't tied to individual operations. When the same interface pattern gets added to multiple operations (such as the current behavior with Linalg), an reference to each of these patterns is added to every op (meaning that an operation will now have N references to effectively the same pattern). This revision fixes this problematic behavior in Linalg, and can bring upwards of a 25% reduction in compile time in Linalg based workloads.
Differential Revision: https://reviews.llvm.org/D104160
This change adds `AutomaticAllocationScope` to the
memref.alloca_scope op. Additionally, it also clarifies
that alloca_scope is is conceptually a passthrough operation.
Reviewed By: ftynse, bondhugula
Differential Revision: https://reviews.llvm.org/D104227
Actually, no vector types are supported so far. We should add the traits once
the vector types are supported (e.g. ElementwiseMappable.traits).
Instead add Elementwise trait to each op.
Differential Revision: https://reviews.llvm.org/D104103
Up to now all structured op operands are assumed to be shaped. The patch relaxes this assumption and allows scalar input operands. In contrast to shaped operands scalar operands are not indexed and directly forwarded to the body of the operation. As all other operands, scalar operands are associated to an indexing map that in case of a scalar or a 0D-operand has an empty range.
We will use scalar operands as a replacement for the capture mechanism. In contrast to captures, the approach ensures we can generate the function signature from the operand list and it prevents outdated capture values in case a transformation updates only the capture operand but not the hidden body of a named operation.
Removing captures and updating existing operations such as linalg.fill is left for a later patch.
The patch depends on https://reviews.llvm.org/D103891 and https://reviews.llvm.org/D103890.
Differential Revision: https://reviews.llvm.org/D104109
* Add a helper function that returns the constant padding value (if applicable).
* Remove existing getConstantYieldValueFromBlock function, which does almost the same.
* Adapted from D103243.
Differential Revision: https://reviews.llvm.org/D104004
Add `tensor.insert` op to make `tensor.extract`/`tensor.insert` work in pairs
for `scalar` domain. Like `subtensor`/`subtensor_insert` work in pairs in
`tensor` domain, and `vector.transfer_read`/`vector.transfer_write` work in
pairs in `vector` domain.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D104139
Add support to Python bindings for the MLIR execution engine to load a
specified list of shared libraries - for eg. to use MLIR runtime
utility libraries.
Differential Revision: https://reviews.llvm.org/D104009
## Introduction
This proposal describes the new op to be added to the `std` (and later moved `memref`)
dialect called `alloca_scope`.
## Motivation
Alloca operations are easy to misuse, especially if one relies on it while doing
rewriting/conversion passes. For example let's consider a simple example of two
independent dialects, one defines an op that wants to allocate on-stack and
another defines a construct that corresponds to some form of looping:
```
dialect1.looping_op {
%x = dialect2.stack_allocating_op
}
```
Since the dialects might not know about each other they are going to define a
lowering to std/scf/etc independently:
```
scf.for … {
%x_temp = std.alloca …
… // do some domain-specific work using %x_temp buffer
… // and store the result into %result
%x = %result
}
```
Later on the scf and `std.alloca` is going to be lowered to llvm using a
combination of `llvm.alloca` and unstructured control flow.
At this point the use of `%x_temp` is bound to either be either optimized by
llvm (for example using mem2reg) or in the worst case: perform an independent
stack allocation on each iteration of the loop. While the llvm optimizations are
likely to succeed they are not guaranteed to do so, and they provide
opportunities for surprising issues with unexpected use of stack size.
## Proposal
We propose a new operation that defines a finer-grain allocation scope for the
alloca-allocated memory called `alloca_scope`:
```
alloca_scope {
%x_temp = alloca …
...
}
```
Here the lifetime of `%x_temp` is going to be bound to the narrow annotated
region within `alloca_scope`. Moreover, one can also return values out of the
alloca_scope with an accompanying `alloca_scope.return` op (that behaves
similarly to `scf.yield`):
```
%result = alloca_scope {
%x_temp = alloca …
…
alloca_scope.return %myvalue
}
```
Under the hood the `alloca_scope` is going to lowered to a combination of
`llvm.intr.stacksave` and `llvm.intr.strackrestore` that are going to be invoked
automatically as control-flow enters and leaves the body of the `alloca_scope`.
The key value of the new op is to allow deterministic guaranteed stack use
through an explicit annotation in the code which is finer-grain than the
function-level scope of `AutomaticAllocationScope` interface. `alloca_scope`
can be inserted at arbitrary locations and doesn’t require non-trivial
transformations such as outlining.
## Which dialect
Before memref dialect is split, `alloca_scope` can temporarily reside in `std`
dialect, and later on be moved to `memref` together with the rest of
memory-related operations.
## Implementation
An implementation of the op is available [here](https://reviews.llvm.org/D97768).
Original commits:
* Add initial scaffolding for alloca_scope op
* Add alloca_scope.return op
* Add no region arguments and variadic results
* Add op descriptions
* Add failing test case
* Add another failing test
* Initial implementation of lowering for std.alloca_scope
* Fix backticks
* Fix getSuccessorRegions implementation
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D97768
This is the first step to convert vector ops to MMA operations in order to
target GPUs tensor core ops. This currently only support simple cases,
transpose and element-wise operation will be added later.
Differential Revision: https://reviews.llvm.org/D102962
Create a ComplexUnaryOp base class and use it for AbsOp, ReOp and ImOp.
Sort all ops in lexicographic order.
Differential Revision: https://reviews.llvm.org/D104095
These interfaces allow for a composite attribute or type to opaquely provide access to any held attributes or types. There are several intended use cases for this interface. The first of which is to allow the printer to create aliases for non-builtin dialect attributes and types. In the future, this interface will also be extended to allow for SymbolRefAttr to be placed on other entities aside from just DictionaryAttr and ArrayAttr.
To limit potential test breakages, this revision only adds the new interfaces to the builtin attributes/types that are currently hardcoded during AsmPrinter alias generation. In a followup the remaining builtin attributes/types, and non-builtin attributes/types can be extended to support it.
Differential Revision: https://reviews.llvm.org/D102945
This allows for using other type interfaces in the builtin dialect, which currently results in a compile time failure (as it generates duplicate interface declarations).
This adds Sdot2d op, which is similar to the usual Neon
intrinsic except that it takes 2d vector operands, reflecting the
structure of the arithmetic that it's performing: 4 separate
4-dimensional dot products, whence the vector<4x4xi8> shape.
This also adds a new pass, arm-neon-2d-to-intr, lowering
this new 2d op to the 1d intrinsic.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D102504
This allows for building an outline of the symbols and symbol tables within the IR. This allows for easy navigations to functions/modules and other symbol/symbol table operations within the IR.
Differential Revision: https://reviews.llvm.org/D103729
This allow creating a matrix with all elements set to a given value. This is
needed to be able to implement a simple dot op.
Differential Revision: https://reviews.llvm.org/D103870
This is a roll forward of D102679.
This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().
Note: Compared to D102679, this patch introduces a `asSmallVector()` member function and fixes compilation issue with GCC 5.
Differential Revision: https://reviews.llvm.org/D103948
This brings us closer to replacing the LLVM data layout string with a
first-class layout modeling in MLIR.
Depends On D103945
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D103946
This reverts commit 08664d005c, which according to
https://reviews.llvm.org/D103373 was pushed accidentally, and I believe it
causes timeouts in some internal mlir tests.
A common mistake for newcomers to MLIR is to try to store extra member
on the Op class. However these are intended to be thing wrapper around
an Operation*, all the storage is meant to be encoded in attribute on
the underlying Operation. This can be confusing to debug, so better
catch it at build time.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103869
This allows us to remove the `spv.mlir.endmodule` op and
all the code associated with it.
Along the way, tightened the APIs for `spv.module` a bit
by removing some aliases. Now we use `getRegion` to get
the only region, and `getBody` to get the region's only
block.
Reviewed By: mravishankar, hanchung
Differential Revision: https://reviews.llvm.org/D103265
ArmSVE-specific memory operations are needed to generate end-to-end
code for as long as MLIR core doesn't support scalable vectors. This
instructions will be eventually unnecessary, for now they're required
for more complex testing.
Differential Revision: https://reviews.llvm.org/D103535
These `arm_sve.cmp` functions are needed to generate scalable vector
masks as long as scalable vectors are not part of the standard types.
Once in standard, these can be removed and `std.cmp` can be used
instead.
Differential Revision: https://reviews.llvm.org/D103473
As a follow-up to the discussion in https://reviews.llvm.org/D103822,
make the templated `DictionaryAttr::getAs` take the name by `&&`
reference and properly forward the argument to the underlying `get`.
Temporarily support 2D and 3D while the TOSA Matmul op is updated to support batched operations.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D103854
A common mistake for newcomers to MLIR is to try to store extra member
on the Op class. However these are intended to be thing wrapper around
an Operation*, all the storage is meant to be encoded in attribute on
the underlying Operation. This can be confusing to debug, so better
catch it at build time.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103869
This reverts commit e772216e70
(and fixup 7f6c878a2c).
The build is broken with gcc5 host compiler:
In file included from
from mlir/lib/Dialect/Utils/StructuredOpsUtils.cpp:9:
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: error: type/value mismatch at argument 1 in template parameter list for 'template<class ItTy, class FuncTy, class FuncReturnTy> class llvm::mapped_iterator'
std::function<T(ptrdiff_t)>>;
^
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: note: expected a type, got 'decltype (seq<ptrdiff_t>(0, 0))::const_iterator'
This is both more efficient and more ergonomic than going
through an std::string, e.g. when using llvm::utostr and
in string concat cases.
Unfortunately we can't just overload ::get(). This causes an
ambiguity because both twine and stringref implicitly convert
from std::string.
Differential Revision: https://reviews.llvm.org/D103754
One of the key algorithms used in the "mlir::verify(op)" method is the
dominance checker, which ensures that operand values properly dominate
the operations that use them.
The MLIR dominance implementation has a number of algorithmic problems,
and is not really set up in general to answer dense queries: it's constant
factors are really slow with multiple map lookups and scans, even in the
easy cases. Furthermore, when calling mlir::verify(module) or some other
high level operation, it makes sense to parallelize the dominator
verification of all the functions within the module.
This patch has a few changes to enact this:
1) It splits dominance checking into "IsolatedFromAbove" units. Instead
of building a monolithic DominanceInfo for everything in a module,
for example, it checks dominance for the module to all the functions
within it (noop, since there are no operands at this level) then each
function gets their own DominanceInfo for each of their scope.
2) It adds the ability for mlir::DominanceInfo (and post dom) to be
constrained to an IsolatedFromAbove region. There is no reason to
recurse into IsolatedFromAbove regions since use/def relationships
can't span this region anyway. This is already checked by the time
the verifier gets here.
3) It avoids querying DominanceInfo for trivial checks (e.g. intra Block
references) to eliminate constant factor issues).
4) It switches to lazily constructing DominanceInfo because the trivial
check case handles the vast majority of the cases and avoids
constructing DominanceInfo entirely in some cases (e.g. at the module
level or for many Regions's that contain a single Block).
5) It parallelizes analysis of collections IsolatedFromAbove operations,
e.g. each of the functions within a Module.
All together this is more than a 10% speedup on `firtool` in circt on a
large design when run in -verify-each mode (our default) since the verifier
is invoked after each pass.
Still todo is to parallelize the main verifier pass. I decided to split
this out to its own thing since this patch is already large-ish.
Differential Revision: https://reviews.llvm.org/D103373
LLVM Dialect uses builtin-integer types. The existing LLVM_AnyInteger
type constraint is a dupe of AnyInteger. This patch removes LLVM_AnyInteger
and replaces all usage with AnyInteger.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103839
This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().
Differential Revision: https://reviews.llvm.org/D102679
* Mark the following methods const:
* `ArrayAttr::getAsRange`
* `ArrayAttr::getAsValueRange`
* `DictionaryAttr::getAs`
* Make `DictionarAttr::getAs` generic over the name class, such that
`Identifier` and `StringRef` arguments get forwarded to the underlying
call to `get`. (Made generic to avoid introducing a dependency on
`include/mlir/IR/Identifier.h` as per the diff discussion.)
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D103822
Now that memref supports arbitrary element types, add support for memref of
memref and make sure it is properly converted to the LLVM dialect. The type
support itself avoids adding the interface to the memref type itself similarly
to other built-in types. This allows the shape, and therefore byte size, of the
memref descriptor to remain a lowering aspect that is easier to customize and
evolve as opposed to sanctifying it in the data layout specification for the
memref type itself.
Factor out the code previously in a testing pass to live in a dedicated data
layout analysis and use that analysis in the conversion to compute the
allocation size for memref of memref. Other conversions will be ported
separately.
Depends On D103827
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D103828
Historically, MemRef only supported a restricted list of element types that
were known to be storable in memory. This is unnecessarily restrictive given
the open nature of MLIR's type system. Allow types to opt into being used as
MemRef elements by implementing a type interface. For now, the interface is
merely a declaration with no methods. Later, methods to query, e.g., the type
size or whether a type can alias elements of another type may be added.
Harden the "standard"-to-LLVM conversion against memrefs with non-builtin
types.
See https://llvm.discourse.group/t/rfc-memref-of-custom-types/3558.
Depends On D103826
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D103827
These `arm_sve.cmp` functions are needed to generate scalable vector
masks as long as scalable vectors are not part of the standard types.
Once in standard, these can be removed and `std.cmp` can be used
instead.
Differential Revision: https://reviews.llvm.org/D103473
This revision adds support for hover on region operations, by temporarily removing the regions during printing. This revision also tweaks the hover format for operations to include symbol information, now that FuncOp can be shown in the hover.
Differential Revision: https://reviews.llvm.org/D103727
This patch convert the if condition on standalone data operation such as acc.update,
acc.enter_data and acc.exit_data to a scf.if with the operation in the if region.
It removes the operation when the if condition is constant and false. It removes the
the condition if it is contant and true.
Conversion to scf.if is done in order to use the translation to LLVM IR dialect out of the box.
Not sure this is the best approach or we should perform this during the translation from OpenACC
to LLVM IR dialect. Any thoughts welcome.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103325
This patch add canonicalization for the standalone data operation with constant if condition.
It is extracted from this patch D103325.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103712
Convert data operands from the acc.parallel operation using the same conversion pattern than D102170.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103337
Implements better naming for results of spv.mlir.addressof ops by making it
inherit from OpAsmOpInterface and implementing the associated
getAsmResultName(...) hook.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D103594
* Add hasUnitStride and hasZeroOffset to OffsetSizeAndStrideOpInterface. These functions are useful for various patterns. E.g., some vectorization patterns apply only for tensor ops with zero offsets and/or unit stride.
* Add getConstantIntValue and isEqualConstantInt helper functions, which are useful for implementing the two above functions, as well as various patterns.
Differential Revision: https://reviews.llvm.org/D103763
Controlled by a compiler option, if 32-bit indices can be handled
with zero/sign-extention alike (viz. no worries on non-negative
indices), scatter/gather operations can use the more efficient
32-bit SIMD version.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D103632
* Rename PadTensorOpVectorizationPattern to GenericPadTensorOpVectorizationPattern.
* Make GenericPadTensorOpVectorizationPattern a private pattern, to be instantiated via populatePadTensorOpVectorizationPatterns.
* Factor out parts of PadTensorOpVectorizationPattern into helper functions.
This commit prepares PadTensorOpVectorizationPattern for a series of subsequent commits that add more specialized PadTensorOp vectorization patterns.
Differential Revision: https://reviews.llvm.org/D103681
Convert data operands from the acc.data operation using the same conversion pattern than D102170.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103332
These started failing on one of our buildbots. I didn't completely root cause the situation and just added the explicit includes that correct the issue.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D103657
This revision adds assembly state tracking for uses of symbols, allowing for go-to-definition and references support for SymbolRefAttrs.
Differential Revision: https://reviews.llvm.org/D103585
This is a followup to D103422. The DenseMapInfo implementations for
ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h
headers, which means that these two headers no longer need to be
included by DenseMapInfo.h.
This required adding a few additional includes, as many files were
relying on various things pulled in by ArrayRef.h.
Differential Revision: https://reviews.llvm.org/D103491
Introduces a test pass that rewrites PadTensorOps with static shapes as a sequence of:
```
linalg.init_tensor // to create output
linalg.fill // to initialize with padding value
linalg.generic // to copy the original contents to the padded tensor
```
The pass can be triggered with:
- `--test-linalg-transform-patterns="test-transform-pad-tensor"`
Differential Revision: https://reviews.llvm.org/D102804
Move the core reducer algorithm into a library so that it'll be easier
for porting to different projects.
Depends On D101046
Reviewed By: jpienaar, rriddle
Differential Revision: https://reviews.llvm.org/D101607
This removes the need to define the derived Operand class before the derived
Value class. The major benefit of this refactoring is that we no longer need
the OpaqueValue class, as OpOperand can now be defined after Value. As part of
this refactoring the BlockOperand and OpOperand classes are moved out of
UseDefLists.h and to more suitable locations in BlockSupport and Value. After
this change, UseDefLists.h is almost entirely composed of generic use def utilities.
Differential Revision: https://reviews.llvm.org/D103353
This simplifies various pieces of code that interact with the pass registry, e.g. this removes the need to register passes to get accurate pass pipelines descriptions when generating crash reproducers.
Differential Revision: https://reviews.llvm.org/D101880
This revision allows for attaching "debug labels" to patterns, and provides to FrozenRewritePatternSet for filtering patterns based on these labels (in addition to the debug name of the pattern). This will greatly simplify the ability to write tests targeted towards specific patterns (in cases where many patterns may interact), will also simplify debugging pattern application by observing how application changes when enabling/disabling specific patterns.
To enable better reuse of pattern rewrite options between passes, this revision also adds a new PassUtil.td file to the Rewrite/ library that will allow for passes to easily hook into a common interface for pattern debugging. Two options are used to seed this utility, `disable-patterns` and `enable-patterns`, which are used to enable the filtering behavior indicated above.
Differential Revision: https://reviews.llvm.org/D102441
Currently the diagnostics reports the file:line:col, but some LSP
frontends require a non-empty range. Report either the range of an
identifier that starts at location, or a range of 1. Expose the id
location to range helper and reuse here.
Differential Revision: https://reviews.llvm.org/D103482
Support for tensor types in the unrolled version will follow in a separate commit.
Add a new pass option to activate lowering of transfer ops with tensor types (default: deactivated).
Differential Revision: https://reviews.llvm.org/D102666
* A Reducer is a kind of RewritePattern, so it's just the same as
writing graph rewrite.
* ReductionTreePass operates on Operation rather than ModuleOp, so that
* we are able to reduce a nested structure(e.g., module in module) by
* self-nesting.
Reviewed By: jpienaar, rriddle
Differential Revision: https://reviews.llvm.org/D101046
The previous impl densely scanned the entire region starting with an op
when dominators were created, creating a DominatorTree for every region.
This is extremely expensive up front -- particularly for clients like
Linalg/Transforms/Fusion.cpp that construct DominanceInfo for a single
query. It is also extremely memory wasteful for IRs that use single
block regions commonly (e.g. affine.for) because it's making a
dominator tree for a region that has trivial dominance. The
implementation also had numerous unnecessary minor efficiencies, e.g.
doing multiple walks of the region tree or tryGetBlocksInSameRegion
building a DenseMap that it didn't need.
This patch switches to an approach where [Post]DominanceInfo is free
to construct, and which lazily constructs DominatorTree's for any
multiblock regions that it needs. This avoids the up-front cost
entirely, making its runtime proportional to the complexity of the
region tree instead of # ops in a region. This also avoids the memory
and time cost of creating DominatorTree's for single block regions.
Finally this rewrites the implementation for simplicity and to avoids
the constant factor problems the old implementation had.
Differential Revision: https://reviews.llvm.org/D103384
Depthwise convolution should support kernel dilation and non-dilation should
not be a special case. Updated op definition to include a dilation attribute.
This also adds a tosa.depthwise_conv2d lowering to linalg to support the new
linalg behavior.
Differential Revision: https://reviews.llvm.org/D103219
Prefix all operations from the ODS of the `Math` dialect with `Math_`
in order to avoid name clashes when including `MathOps.td` in other
TableGen files (e.g., for `FloatUnaryOp`, which also exists in
`Standard`).
Reviewed By: jpienaar, mehdi_amini
Differential Revision: https://reviews.llvm.org/D103248
The `::mlir` namespace for operations from standard is currently
defined by enclosing the header file generated from the ODS in
`Ops.td` in a namespace in `Ops.h`. However, when referencing
operations from `Ops.td` in other TableGen files, this causes the
generated C++ code to refer to classes from the global namespace
instead of `::mlir`.
By defining the namespace through the `cppNamespace` field for
`StandardOps_Dialect` directly in `Ops.td` instead, the ODS
becomes reusable in other TableGen files through simple
inclusion.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D103234
Adding methods to access operand properties via OpOperands and mark outdated methods as deprecated.
Differential Revision: https://reviews.llvm.org/D103394
CSE is the only client of this API, refactor it a bit to pull the query
internally to make changes to DominanceInfo a bit easier. This commit
also improves comments a bit.
The implementation had a couple of problems, including checking
"isProperAncestor" in an inefficient way. It also recursed into
other "isolated from above" ops. In the case of CIRCT, we get
three levels of isolated ops:
mlir::ModuleOp
firrtl::CircuitOp
firrtl::FModuleOp
The verification for module would recurse into the circuits and
fmodules checking them. The verifier hook for circuit would
recurse into all the modules reverifying them, fmoduleop would
then reverify them. The same happens for mlir::ModuleOp and Func.
While here, fix an old design problem: IsolatedFromAbove checking
was implemented by a method on the Region class, which isn't
actually general and isn't used by anything else. Move it over
to be a trait impl verifier method like the others and specialize
it for its task.
Differential Revision: https://reviews.llvm.org/D103345
Implements better naming for results of `spv.Constant` ops by making it
inherit from OpAsmOpInterface and implementing the associated
getAsmResultName(...) hook.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D103152
This allows for checking if a given operation may modify/reference/or both a given value. Right now this API is limited to Value based memory locations, but we should expand this to include attribute based values at some point. This is left for future work because the rest of the AliasAnalysis API also has this restriction.
Differential Revision: https://reviews.llvm.org/D101673
Depends On D103109
If any of the tokens/values added to the `!async.group` switches to the error state, than the group itself switches to the error state.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D103203
Depends On D103102
Not yet implemented:
1. Error handling after synchronous await
2. Error handling for async groups
Will be addressed in the followup PRs
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D103109
In order to allow large matmul operations using the MMA ops we need to chain
operations this is not possible unless "DOp" and "COp" type have matching
layout so remove the "DOp" layout and force accumulator and result type to
match.
Added a test for the case where the MMA value is accumulated.
Differential Revision: https://reviews.llvm.org/D103023
This revision refactors and simplifies the pattern detection logic: thanks to SSA value properties, we can actually look at all the uses of a given value and avoid having to pattern-match specific chains of operations.
A bufferization pattern for subtensor is added and specific inplaceability analysis is implemented for the simple case of subtensor. More advanced use cases will follow.
Differential revision: https://reviews.llvm.org/D102512
* Add `hasCanonicalizer` option to Dialect.
* Initialize canonicalizer with dialect-wide canonicalization patterns.
* Add test case to TestDialect.
Dialect-wide canonicalization patterns are useful if a canonicalization pattern does not conceptually associate with any single operation, i.e., it should not be registered as part of an operation's `getCanonicalizationPatterns` function. E.g., this is the case for canonicalization patterns that match an op interface.
Differential Revision: https://reviews.llvm.org/D103226
Allow support for specifying empty IVs in an `affine.parallel`.
For example:
```
affine.parallel () = () to () {
affine.yield
}
```
Reviewed By: bondhugula, jbruestle
Differential Revision: https://reviews.llvm.org/D102895
Currently, passes are registered on a per-dialect basis, which
provides the smallest footprint obviously. But for prototyping
and experimentation, a convenience "all passes" module is provided,
which registers all known MLIR passes in one run.
Usage in Python:
import mlir.all_passes_registration
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D103130
MLIRContext holds a few special case values that occur frequently like empty
dictionary and NoneType, which allow us to avoid taking locks to get an instance
of them. Give the empty StringAttr this treatment as well. This cuts several
percent off compile time for CIRCT.
Differential Revision: https://reviews.llvm.org/D103117
This provides a sizable compile time improvement by seeding
the worklist in an order that leads to less iterations of the
worklist.
This patch only changes the behavior of the Canonicalize pass
itself, it does not affect other passes that use the
GreedyPatternRewrite driver
Differential Revision: https://reviews.llvm.org/D103053
This allows C++ clients of the Canonicalize pass to specify their own
Config option struct to control how Canonicalize works, increasing reusability.
This also allows controlling these settings for the default Canonicalize pass
using command line options. This is useful for testing and for playing with
things on the command line.
Differential Revision: https://reviews.llvm.org/D103069
Currently, AbstractOperation fields are function pointers.
Modifying them to unique_function allow them to contain
runtime information.
For instance, this allows operations to be defined at runtime.
Differential Revision: https://reviews.llvm.org/D103031
Removed some of the older raw "MLIRized" versions that are
no longer needed now that the sparse runtime support library
can focus on the proper sparse tensor types rather than the
opague pointer approach of the past. This avoids legacy...
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D102960
This patch is the first in a series of patches fixing markdown links and references inside the mlir documentation. I chose to split it in a few reviews to be able to iterate quicker and to ease review.
This patch addresses all broken references to other markdown files and sections inside the Dialects folder.
One change that was also done was to insert '/' between the markdown files and section:
Example:
Builtin.md#integertype
was changed to:
Builtin.md/#integertype
After compilation, hugo then translates the later to jump directly to the integer type section, but not the former. Not inserting the slash would simply jump to just the Builtin page, instead of the integertype section. I therefore changed occurrences of the former version to the later as well.
Differential Revision: https://reviews.llvm.org/D103011
This exposes the iterations and top-down processing as flags, and also
allows controlling whether region simplification is desirable for a client.
This allows deleting some duplicated entrypoints to
applyPatternsAndFoldGreedily.
This also deletes the Constant Preprocessing pass, which isn't worth it
on balance.
All defaults are all kept the same, so no one should see a behavior change.
Differential Revision: https://reviews.llvm.org/D102988
This is the fourth and final patch in a series of patches fixing markdown links and references inside the mlir documentation. This patch combined with the other three should fix almost every broken link on mlir.llvm.org as far as I can tell.
This patch in particular addresses all Markdown files in the top level docs directory.
Differential Revision: https://reviews.llvm.org/D103032
Deconstrains several TOSA operators to align with the current TOSA spec, including all the elementwise ops.
Note: some more ops are under consideration for further cleanup; they will follow once the spec has been updated.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D102958
Steps for normalizing dynamic memrefs for tiled layout map
1. Check if original map is tiled layout. Only tiled layout is supported.
2. Create normalized memrefType. Dimensions that include dynamic dimensions
in the map output will be dynamic dimensions.
3. Create new maps to calculate each dimension size of new memref.
In tiled layout, the dimension size can be calculated by replacing
"floordiv <tile size>" with "ceildiv <tile size>" and
"mod <tile size>" with "<tile size>".
4. Create AffineApplyOp to apply the new maps. The output of AffineApplyOp is
dynamicSizes for new AllocOp.
5. Add the new dynamic sizes in new AllocOp.
This patch also set MemRefsNormalizable trant in CastOp and DimOp since
they used with dynamic memrefs.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D97655
This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).
This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).
The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode). This is complete for generic operations, but requires manual
adoption for custom ops.
I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor. I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.
This is a reapply of the patch here: https://reviews.llvm.org/D102567
with an additional change: we now never defer block argument locations,
guaranteeing that we can round trip correctly.
This isn't required in all cases, but allows us to hill climb here and
works around unrelated bugs like https://bugs.llvm.org/show_bug.cgi?id=50451
Differential Revision: https://reviews.llvm.org/D102991
All lines after the first are currently indented by one char further to the left than the first line. This leads to the first character of each sentence being cut from the resulting Markdown file after compilation. The text also contains 3 references to sections of other markdown files. One was missing the file, while the other two had outdated files, leading to 404 errors in the documentation.
Differential Revision: https://reviews.llvm.org/D102983
This revision completes the "dimension ordering" feature
of sparse tensor types that enables the programmer to
define a preferred order on dimension access (other than
the default left-to-right order). This enables e.g. selection
of column-major over row-major storage for sparse matrices,
but generalized to any rank, as in:
dimOrdering = affine_map<(i,j,k,l,m,n,o,p) -> (p,o,j,k,i,l,m,n)>
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102856
This pattern inlines operands to a linalg.generic operation that use a constant
index and hence are loop-invariant scalars. This reduces the number of
linalg.generic operands and unlocks some canonicalizations that rely on seeing
an explicit tensor.extract.
Differential Revision: https://reviews.llvm.org/D102682
The current implementation has several key limitations and weirdness, e.g local reproducers don't support dynamic pass pipelines, error messages don't include the passes that failed, etc. This revision refactors the implementation to support more use cases, and also be much cleaner.
The main change in this revision, aside from moving the implementation out of Pass.cpp and into its own file, is the addition of a crash recovery pass instrumentation. For local reproducers, this instrumentation handles setting up the recovery context before executing each pass. For global reproducers, the instrumentation is used to provide a more detailed error message, containing information about which passes are running and on which operations.
Example of new message:
```
error: Failures have been detected while processing an MLIR pass pipeline
note: Pipeline failed while executing [`TestCrashRecoveryPass` on 'module' operation: @foo]: reproducer generated at `crash-recovery.mlir.tmp`
```
Differential Revision: https://reviews.llvm.org/D101854
This flag will print the IR after a pass only in the case where the pass failed. This can be useful to more easily view the invalid IR, without needing to print after every pass in the pipeline.
Differential Revision: https://reviews.llvm.org/D101853
Also, fix a small typo where the "unsigned" splat variants were not
being created with an unsigned type.
Differential Revision: https://reviews.llvm.org/D102797
The patch extends the yaml code generation to support the following new OpDSL constructs:
- captures
- constants
- iteration index accesses
- predefined types
These changes have been introduced by revision
https://reviews.llvm.org/D101364.
Differential Revision: https://reviews.llvm.org/D102075
VectorTransferPermutationMapLoweringPatterns can be enabled via a pass option. These additional patterns lower permutation maps to minor identity maps with broadcasting, if possible, allowing for more efficient vector load/stores. The option is deactivated by default.
Differential Revision: https://reviews.llvm.org/D102593
Original interfaces are not safe to be called during dialect conversion.
This is because some ops (e.g. `dynamic_reshape(input, target_shape)`)
depend on the values of their operands to calculate the output shape.
However the operands may be out of reach during dialect conversion (e.g.
converting from tensor world to buffer world). This patch provides a new
kind of interface which accpets user-provided operands to solve this
problem.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D102317
"[mlir] Speed up Lexer::getEncodedSourceLocation"
This reverts commit 3043be9d2d and commit
861d69a525.
This change resulted in printing textual MLIR that can't be parsed; see
review thread https://reviews.llvm.org/D102567 for details.
At present, a lot of code contains main function bodies like "return failed(mlir::MlirOptMain(...);". This is unfortunate for two reasons: a) it uses ADL, which is maybe not what the free "failed" function was designed for; and b) it is a bit awkward to read, requring the reader to both understand the boolean nature of the value and the semantics of main's return value. (And it's also not portable, since 1 is not a portable success value.)
The replacement code, `return mlir::AsMainReturnCode(mlir::MlirOptMain(...))` is a bit more self-explanatory.
The change applies the new function to a few internal uses of MlirOptMain, too.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102641
This is a hook that allows for providing custom initialization of the pattern, e.g. if it has bounded recursion, setting the debug name, etc., without needing to define a custom constructor. A non-virtual hook was chosen to avoid polluting the vtable with code that we really just want to be inlined when constructing the pattern. The alternative to this would be to just define a constructor for each pattern, this unfortunately creates a lot of otherwise unnecessary boiler plate for a lot of patterns and a hook provides a much simpler/cleaner interface for the very common case.
Differential Revision: https://reviews.llvm.org/D102440
The FIRRTL dialect in CIRCT uses inherently signful types, and APSInt
is the best way to model that. Add a couple of helpers that make it
easier to work with an IntegerAttr that carries a sign.
This follows the example of getZExt() and getSExt() which assert when
the underlying type of the attribute is unexpected. In this case
we assert fail when the underlying type of the attribute is signless.
This is strictly additive, so it is NFC. It is tested in the CIRCT
repo.
Differential Revision: https://reviews.llvm.org/D102701
This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).
This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).
The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode). This is complete for generic operations, but requires manual
adoption for custom ops.
I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor. I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.
Differential Revision: https://reviews.llvm.org/D102567
- Enables inferring return type for ConstShape, takes into account valid return types;
- The compatible return type function could be reused, leaving that for next use refactoring;
Differential Revision: https://reviews.llvm.org/D102182
The experimental flag for "inplace" bufferization in the sparse
compiler can be replaced with the new inplace attribute. This gives
a uniform way of expressing the more efficient way of bufferization.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102538
This change makes the conversion of an mlir::OpState to bool `explicit`. Idiomatic boolean uses continue to work as before, but questionable implicit uses (e.g. accumulating over a range of OpStates to count "true" states) become ill-formed. This makes the class interface a lilttle less error-prone.
I tested this change on our internal (fairly large) codebase, and only one fix was needed, which was ultimately an improvement of the affected code.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D101989
Broadcast dimensions of vector transfer ops are always in-bounds. This is consistent with the fact that the starting position of a transfer is always in-bounds.
Differential Revision: https://reviews.llvm.org/D102566
This brings it in line with the bultin unrealized_conversion_cast,
which memref.buffer_cast is a specialized version of.
Differential Revision: https://reviews.llvm.org/D102608
At the moment `MlirModule`s can be converted to `MlirOperation`s, but not
the other way around (at least not without going around the C API). This
makes it impossible to e.g. run passes over a `ModuleOp` created through
`mlirOperationCreate`.
Reviewed By: nicolasvasilache, mehdi_amini
Differential Revision: https://reviews.llvm.org/D102497
Splitting the memref dialect lead to an introduction of several dependencies
to avoid compilation issues. The canonicalize pass also depends on the
memref dialect, but it shouldn't. This patch resolves the dependencies
and the unintuitive includes are removed. However, the dependency moves
to the constructor of the std dialect.
Differential Revision: https://reviews.llvm.org/D102060
Replace the templated linalgLowerOpToLoops method by three specialized methods linalgOpToLoops, LinalgOpToParallelLoops, and linalgOpToAffineLoops.
Differential Revision: https://reviews.llvm.org/D102324
Provide an option to specify optimization level when creating an
ExecutionEngine via the MLIR JIT Python binding. Not only is the
specified optimization level used for code generation, but all LLVM
optimization passes at the optimization level are also run prior to
machine code generation (akin to the mlir-cpu-runner tool).
Default opt level continues to remain at level two (-O2).
Contributions in part from Prashant Kumar <prashantk@polymagelabs.com>
as well.
Differential Revision: https://reviews.llvm.org/D102551
This change allows the SRC and DST of dma_start operations to be located in the
same memory space. This applies to both the Affine dialect and Memref dialect
versions of these Ops. The documention has been updated to reflect this by
explicitly stating overlapping memory locations are not supported (undefined
behavior).
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D102274
Make "target rank" a pass option of VectorToSCF.
Depends On D102101
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D102123
This covers the extremely common case of replacing all uses of a Value
with a new op that is itself a user of the original Value.
This should also be a little bit more efficient than the
`SmallPtrSet<Operation *, 1>{op}` idiom that was being used before.
Differential Revision: https://reviews.llvm.org/D102373
Support OpImageQuerySize in spirv dialect
co-authored-by: Alan Liu <alanliu.yf@gmail.com>
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D102029
Instead of an SCF for loop, these pattern generate fully unrolled loops with no temporary buffer allocations.
Differential Revision: https://reviews.llvm.org/D101981
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745
First set of "boilerplate" to get sparse tensor
passes available through CAPI and Python.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D102362
This allows for diagnostics emitted during parsing/verification to be surfaced to the user by the language client, as opposed to just being emitted to the logs like they are now.
Differential Revision: https://reviews.llvm.org/D102293
This patch begins to translate acc.enter_data operation to call to tgt runtime call.
It currently only translate create/copyin operands of memref type. This acts as a basis to add support
for FIR types in the Flang/OpenACC support. It follows more or less a similar path than clang
with `omp target enter data map` directives.
This patch is taking a different approach than D100678 and perform a translation to LLVM IR
and make use of the OpenMPIRBuilder instead of doing a conversion to the LLVMIR dialect.
OpenACC support in Flang will rely on the current OpenMP runtime where 1:1 lowering can be
applied. Some extension will be added where features are not available yet.
Big part of this code will be shared for other standalone data operations in the OpenACC
dialect such as acc.exit_data and acc.update.
It is likely that parts of the lowering can also be shared later with the ops for
standalone data directives in the OpenMP dialect when they are introduced.
This is an initial translation and it probably needs more work.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D101504
This factors out the pass timing code into a separate `TimingManager`
that can be plugged into the `PassManager` from the outside. Users are
able to provide their own implementation of this manager, and use it to
time additional code paths outside of the pass manager. Also allows for
multiple `PassManager`s to run and contribute to a single timing report.
More specifically, moves most of the existing infrastructure in
`Pass/PassTiming.cpp` into a new `Support/Timing.cpp` file and adds a
public interface in `Support/Timing.h`. The `PassTiming` instrumentation
becomes a wrapper around the new timing infrastructure which adapts the
instrumentation callbacks to the new timers.
Reviewed By: rriddle, lattner
Differential Revision: https://reviews.llvm.org/D100647
Add a conversion pass to convert higher-level type before translation.
This conversion extract meangingful information and pack it into a struct that
the translation (D101504) will be able to understand.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D102170
DialectAsmParser already allows converting an llvm::SMLoc location to a
mlir::Location location. This commit adds the same functionality to OpAsmParser.
Implementation is copied from DialectAsmParser.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D102165
First step in adding alignment as an attribute to MLIR global definitions. Alignment can be specified for global objects in LLVM IR. It can also be specified as a named attribute in the LLVMIR dialect of MLIR. However, this attribute has no standing and is discarded during translation from MLIR to LLVM IR. This patch does two things: First, it adds the attribute to the syntax of the llvm.mlir.global operation, and by doing this it also adds accessors and verifications. The syntax is "align=XX" (with XX being an integer), placed right after the value of the operation. Second, it allows transforming this operation to and from LLVM IR. It is checked whether the value is an integer power of 2.
Reviewed By: ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D101492
OpAsmParser (and DialectAsmParser) supports a pair of
parseInteger/parseOptionalInteger methods, which allow parsing a bare
integer into a C type of your choice (e.g. int8_t) using templates. It
was implemented in terms of a virtual method call that is hard coded to
int64_t because "that should be big enough".
Change the virtual method hook to return an APInt instead. This allows
asmparsers for custom ops to parse large integers if they want to, without
changing any of the clients of the fixed size C API.
Differential Revision: https://reviews.llvm.org/D102120
All glue and clutter in the linalg ops has been replaced by proper
sparse tensor type encoding. This code is no longer needed. Thanks
to ntv@ for giving us a temporary home in linalg.
So long, and thanks for all the fish.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102098
A very elaborate, but also very fun revision because all
puzzle pieces are finally "falling in place".
1. replaces lingalg annotations + flags with proper sparse tensor types
2. add rigorous verification on sparse tensor type and sparse primitives
3. removes glue and clutter on opaque pointers in favor of sparse tensor types
4. migrates all tests to use sparse tensor types
NOTE: next CL will remove *all* obsoleted sparse code in Linalg
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102095
* The PybindAdaptors.h file has been evolving across different sub-projects (npcomp, circt) and has been successfully used for out of tree python API interop/extensions and defining custom types.
* Since sparse_tensor.encoding is the first in-tree custom attribute we are supporting, it seemed like the right time to upstream this header and use it to define the attribute in a way that we can support for both in-tree and out-of-tree use (prior, I had not wanted to upstream dead code which was not used in-tree).
* Adapted the circt version of `mlir_type_subclass`, also providing an `mlir_attribute_subclass`. As we get a bit of mileage on this, I would like to transition the builtin types/attributes to this mechanism and delete the old in-tree only `PyConcreteType` and `PyConcreteAttribute` template helpers (which cannot work reliably out of tree as they depend on internals).
* Added support for defaulting the MlirContext if none is passed so that we can support the same idioms as in-tree versions.
There is quite a bit going on here and I can split it up if needed, but would prefer to keep the first use and the header together so sending out in one patch.
Differential Revision: https://reviews.llvm.org/D102144
* Adds dialect registration, hand coded 'encoding' attribute and test.
* An MLIR CAPI tablegen backend for attributes does not exist, and this is a relatively complicated case. I opted to hand code it in a canonical way for now, which will provide a reasonable blueprint for building out the tablegen version in the future.
* Also added a (local) CMake function for declaring new CAPI tests, since it was getting repetitive/buggy.
Differential Revision: https://reviews.llvm.org/D102141
In the buffer deallocation pass, unranked memref types are not properly supported.
After investigating this issue, it turns out that the Clone and Dealloc operation
does not support unranked memref types in the current implementation.
This patch adds the missing feature and enables the transformation of any memref
type.
This patch solves this bug: https://bugs.llvm.org/show_bug.cgi?id=48385
Differential Revision: https://reviews.llvm.org/D101760
Previously, the OpenMP to LLVM IR conversion was setting the alloca insertion
point to the same position as the main compuation when converting OpenMP
`parallel` operations. This is problematic if, for example, the `parallel`
operation is placed inside a loop and would keep allocating on stack on each
iteration leading to stack overflow.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D101307
Inside a templated function, other class members need to be called with
this->.
Otherwise we get: explicit qualification required to use member
'setDebugName' from dependent base class.
We are able to bind the result from native function while rewriting
pattern. In matching pattern, if we want to get some values back, we can
do that by passing parameter as return value placeholder. Besides, add
the semantic of '$_self' in NativeCodeCall while matching, it'll be the
operation that defines certain operand.
Differential Revision: https://reviews.llvm.org/D100746
For `AffineLoopFusion` pass, add `memref` dialect as a dependent
dialect. Since the fusion pass can create `memref::AllocOp`s, the
dialect must be registered in its dependent dialects.
The missing dependency was not discovered until now because the above
said op creation happes only when the input already has
`memref::AllocOp`s in it, and all dialects in the input are
automatically added to the context.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D102104
Motivation: we have passes with lot of rewrites and when one one them segfaults or asserts, it is very hard to find waht exactly pattern failed without debug info.
Differential Revision: https://reviews.llvm.org/D101443
The current design uses a unique entry for each argument/result attribute, with the name of the entry being something like "arg0". This provides for a somewhat sparse design, but ends up being much more expensive (from a runtime perspective) in-practice. The design requires building a string every time we lookup the dictionary for a specific arg/result, and also requires N attribute lookups when collecting all of the arg/result attribute dictionaries.
This revision restructures the design to instead have an ArrayAttr that contains all of the attribute dictionaries for arguments and another for results. This design reduces the number of attribute name lookups to 1, and allows for O(1) lookup for individual element dictionaries. The major downside is that we can end up with larger memory usage, as the ArrayAttr contains an entry for each element even if that element has no attributes. If the memory usage becomes too problematic, we can experiment with a more sparse structure that still provides a lot of the wins in this revision.
This dropped the compilation time of a somewhat large TensorFlow model from ~650 seconds to ~400 seconds.
Differential Revision: https://reviews.llvm.org/D102035
This provides information when the user hovers over a part of the source .mlir file. This revision adds the following hover behavior:
* Operation:
- Shows the generic form.
* Operation Result:
- Shows the parent operation name, result number(s), and type(s).
* Block:
- Shows the parent operation name, block number, predecessors, and successors.
* Block Argument:
- Shows the parent operation name, parent block, argument number, and type.
Differential Revision: https://reviews.llvm.org/D101113
This it to make more clear the difference between this and
an AliasAnalysis.
For example, given a sequence of subviews that create values
A -> B -> C -> d:
BufferViewFlowAnalysis::resolve(B) => {B, C, D}
AliasAnalysis::resolve(B) => {A, B, C, D}
Differential Revision: https://reviews.llvm.org/D100838
Replace all `linalg.indexed_generic` ops by `linalg.generic` ops that access the iteration indices using the `linalg.index` op.
Differential Revision: https://reviews.llvm.org/D101612
Nearly complete alignment to spec v0.22
- Adds Div op
- Concat inputs now variadic
- Removes Placeholder op
Note: TF side PR https://github.com/tensorflow/tensorflow/pull/48921 deletes Concat legalizations to avoid breaking TensorFlow CI. This must be merged only after the TF PR has merged.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D101958
It is currently stored in the high bits, which is disallowed on certain
platforms (e.g. android). This revision switches the representation to use
the low bits instead, fixing crashes/breakages on those platforms.
Differential Revision: https://reviews.llvm.org/D101969
This expose a lambda control instead of just a boolean to control unit
dimension folding.
This however gives more control to user to pick a good heuristic.
Folding reshapes helps fusion opportunities but may generate sub-optimal
generic ops.
Differential Revision: https://reviews.llvm.org/D101917
These instructions map to SVE-specific instrinsics that accept a
predicate operand to support control flow in vector code.
Differential Revision: https://reviews.llvm.org/D100982
This patch adds support for vectorizing loops with 'iter_args'
implementing known reductions along the vector dimension. Comparing to
the non-vector-dimension case, two additional things are done during
vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar
using `vector.reduce`.
- In some cases a mask is applied to the vector yielded at the end of
the loop to prevent garbage values from being written to the
accumulator.
Vectorization of reduction loops is disabled by default. To enable it, a
map from loops to array of reduction descriptors should be explicitly passed to
`vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed
to the SuperVectorize pass.
Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100694
While we figure out how to best add Standard support for scalable
vectors, these instructions provide a workaround for basic arithmetic
between scalable vectors.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100837
This revision migrates more code from Linalg into the new permanent home of
SparseTensor. It replaces the test passes with proper compiler passes.
NOTE: the actual removal of the last glue and clutter in Linalg will follow
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D101811
Given the source and destination shapes, if they are static, or if the
expanded/collapsed dimensions are unit-extent, it is possible to
compute the reassociation maps that can be used to reshape one type
into another. Add a utility method to return the reassociation maps
when possible.
This utility function can be used to fuse a sequence of reshape ops,
given the type of the source of the producer and the final result
type. This pattern supercedes a more constrained folding pattern added
to DropUnitDims pass.
Differential Revision: https://reviews.llvm.org/D101343
Convert subtensor and subtensor_insert operations to use their
rank-reduced versions to drop unit dimensions.
Differential Revision: https://reviews.llvm.org/D101495
The current implementation had a bug as it was relying on the target vector
dimension sizes to calculate where to insert broadcast. If several dimensions
have the same size we may insert the broadcast on the wrong dimension. The
correct broadcast cannot be inferred from the type of the source and
destination vector.
Instead when we want to extend transfer ops we calculate an "inverse" map to the
projected permutation and insert broadcast in place of the projected dimensions.
Differential Revision: https://reviews.llvm.org/D101738
Move TransposeOp lowering in its own populate function as in some cases
it is better to keep it during ContractOp lowering to better
canonicalize it rather than emiting scalar insert/extract.
Differential Revision: https://reviews.llvm.org/D101647
Added canonicalization for vector_load and vector_store. An existing
pattern SimplifyAffineOp can be reused to compose maps that supplies
result into them. Added AffineVectorStoreOp and AffineVectorLoadOp
into static_assert of SimplifyAffineOp to allow operation to use it.
This fixes the bug filed: https://bugs.llvm.org/show_bug.cgi?id=50058
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D101691
(1) migrates the encoding from TensorDialect into the new SparseTensorDialect
(2) replaces dictionary-based storage and builders with struct-like data
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D101669
Three patterns are added to convert into vector.multi_reduction into a
sequence of vector.reduction as the following:
- Transpose the inputs so inner most dimensions are always reduction.
- Reduce rank of vector.multi_reduction into 2d with inner most
reduction dim (get the 2d canical form)
- 2D canonical form is converted into a sequence of vector.reduction.
There are two things we might worth in a follow up diff:
- An scf.for (maybe optionally) around vector.reduction instead of unrolling it.
- Breakdown the vector.reduction into a sequence of vector.reduction
(e.g tree-based reduction) instead of relying on how downstream dialects
handle it.
Note: this will requires passing target-vector-length
Differential Revision: https://reviews.llvm.org/D101570
This is the very first step toward removing the glue and clutter from linalg and
replace it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.
NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.
Differential Revision: https://reviews.llvm.org/D101573
This is the very first step toward removing the glue and clutter from linalg and
replace it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.
NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D101488
This enables to express more complex parallel loops in the affine framework,
for example, in cases of tiling by sizes not dividing loop trip counts perfectly
or inner wavefront parallelism, among others. One can't use affine.max/min
and supply values to the nested loop bounds since the results of such
affine.max/min operations aren't valid symbols. Making them valid symbols
isn't an option since they would introduce selection trees into memref
subscript arithmetic as an unintended and undesired consequence. Also
add support for converting such loops to SCF. Drop some API that isn't used in
the core repo from AffineParallelOp since its semantics becomes ambiguous in
presence of max/min bounds. Loop normalization is currently unavailable for
such loops.
Depends On D101171
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D101172
Introduce a basic support for parallelizing affine loops with reductions
expressed using iteration arguments. Affine parallelism detector now has a flag
to assume such reductions are parallel. The transformation handles a subset of
parallel reductions that are can be expressed using affine.parallel:
integer/float addition and multiplication. This requires to detect the
reduction operation since affine.parallel only supports a fixed set of
reduction operators.
Reviewed By: chelini, kumasento, bondhugula
Differential Revision: https://reviews.llvm.org/D101171
FillOp allows complex ops, and filling a properly sized buffer with
a default zero complex number is implemented.
Differential Revision: https://reviews.llvm.org/D99939
This revision adds support for vectorizing more general linalg operations with projected permutation maps.
This is achieved by eagerly broadcasting the intermediate vector to the common size
of the iteration domain of the linalg op. This allows a much more natural expression of
generalized vectorization but may introduce additional computations until all the
proper canonicalizations are implemented.
This generalization modifies the vector.transfer_read/write permutation logic and
exposes the fact that the logic employed in vector.contract was too ad-hoc.
As a consequence, changes occur in the permutation / transposition logic for contraction. In turn this prompts supporting more cases in the lowering of contract
to matrix intrinsics, which is required to make the corresponding tests pass.
Differential revision: https://reviews.llvm.org/D101165
Canonicalizations for subtensor operations defaulted to use the
rank-reduced version of the operation, but the cast inserted to get
back the original type would be illegal if the rank was actually
reduced. Instead make the canonicalization not reduce the rank of the
operation.
Differential Revision: https://reviews.llvm.org/D101258