Reimplement this function in terms of `composeMatchingMap`.
Also fix a bug in `composeMatchingMap` where local dims of `this` could be missing in `localCst`.
Differential Revision: https://reviews.llvm.org/D107813
This function overload is similar to the existing `FlatAffineConstraints::addLowerOrUpperBound`. It constrains a dimension based on an affine map. However, in contrast to the other overloading, it does not attempt to align dimensions/symbols of the affine map with the dimensions/symbols of the constraint set. Instead, dimensions/symbols are expected to already be aligned.
Differential Revision: https://reviews.llvm.org/D107727
This function aligns an affine map (and operands) with given dims and syms SSA values.
This is useful in conjunction with `FlatAffineConstraints::addLowerOrUpperBound`, which requires the `boundMap` to be aligned with the constraint set's dims and syms.
Differential Revision: https://reviews.llvm.org/D107728
Some folding cases are trivial to fold away, specifically no-op cases where
an operation's input and output are the same. Canonicalizing these away
removes unneeded operations.
The current version includes tensor cast operations to resolve shape
discreprencies that occur when an operation's result type differs from the
input type. These are resolved during a tosa shape propagation pass.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D107321
Dilation only requires increasing the padding on the left/right side of the
input, and including dilation in the convolution. This implementation still
lacks support for strided convolutions.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D107680
When using an attribute where a value is expected previously this would fail
complaining about unbound symbol. Instead make error clear and mention common
failure reason.
This patch enables normalizing memrefs with MemRef_ReinterpretCastOp by
adding MemRefsNormalizable trait in the Op definition.
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D107425
This enables querying shapes/values as shapes without mutating the IR
directly (e.g., towards enabling doing inference in analysis &
application steps, inferring function shape with constant from callsite,
...). Add a new ShapeAdaptor that abstracts over whether shape is from
Type or ShapedTypeComponents or DenseIntElementsAttribute. This adds new
accessors to ValueShapeRange to get Shape and value as shape, but
doesn't restrict or remove the previous way of accessing Type via the
Value for now, that does mean a less refined shape could be accidentally
queried and will be restricted in follow up.
Currently restricted Value query to what can be represented as Shape. So
only supports cases where constant subgraph evaluation's output is a
shape. I had considered making it more general, but without TBD extern
attribute concept or some such a user cannot today uniformly avoid
overhead.
Update TOSA ops and also the shape inference pass.
Differential Revision: https://reviews.llvm.org/D107768
Replace some code snippets With scf::ForOp methods. Additionally,
share a listener at one more point (although this pattern is still
not safe to roll back currently)
Differential Revision: https://reviews.llvm.org/D107754
The following constructor call (and others) used to be ambiguous:
```
FlatAffineConstraints constraints(0, 0, 0);
```
Differential Revision: https://reviews.llvm.org/D107726
Looks "under the hood" of the sparse stogage schemes.
Users should typically not be interested in these details
(hey, that is why we have "sparse compilers"!) but this
test makes sure the compact contents are as expected.
Reviewed By: ThomasRaoux, bixia
Differential Revision: https://reviews.llvm.org/D107683
Implements lowering dense to sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107681
These ops were not ported to the nD vector conversion when it was introduced
and nobody needed them so far.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D107750
Instead, include `<cstdlib>` which is the canonical header containing
the declaration of `alloca()`.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D107699
Perform scalar constant propagation for FPTruncOp only if the resulting value can be represented without precision loss or rounding.
Example:
%cst = constant 1.000000e+00 : f32
%0 = fptrunc %cst : f32 to bf16
-->
%cst = constant 1.000000e+00 : bf16
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D107518
Use new return type for `OpAsmDialectInterface::getAlias`:
* `AliasResult::NoAlias` if an alias was not provided.
* `AliasResult::OverridableAlias` if an alias was provided, but it might be overriden by other hook.
* `AliasResult::FinalAlias` if an alias was provided and it should be used (no other hooks will be checked).
In that case `AsmPrinter` will use either the first alias with `FinalAlias` result or
the last alias with `OverridableAlias` result (it depends on dialect array order).
Used `OverridableAlias` result for `BuiltinOpAsmDialectInterface`.
Use case: provide more informative alias for built-in attributes like `AffineMapAttr`
instead of generic "map<N>".
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D107437
Avoiding absolute imports allows the code to be relocatable (which is used for out of tree integrations).
Differential Revision: https://reviews.llvm.org/D107617
CastOp::areCastCompatible does not check whether casts are definitely compatible.
When going from dynamic to static offset or stride, the canonicalization cannot
know whether it is really cast compatible. In that case, it can only canonicalize
to an alloc plus copy.
Differential Revision: https://reviews.llvm.org/D107545
Tested with gcc-10. Other compilers may generate additional warnings. This does not fix all warnings. There are a few extra ones in LLVMCore and MLIR.
* `OpEmitter::getAttrNameIndex`: -Wunused-function (function is private and not used anywhere)
* `PrintOpPass` copy constructor: -Wextra ("Base class should be explicitly initialized in the copy constructor")
* `LegalizeForLLVMExport.cpp`: -Woverflow (overflow is expected, silence warning by making the cast explicit)
Differential Revision: https://reviews.llvm.org/D107525
The file in which `Region::viewGraph` is defined has changed. This should have been updated with D106342.
Differential Revision: https://reviews.llvm.org/D107517
There is a case in EmitDiagnostics where the filter check is bypassed (when locationStack is empty). Filter might also be bypassed when loc instead of showableLoc is added to the locationStack.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D106522
A text file may be comprised of many different "chunks", when
the input file contains the `// -----` split markers. We don't
need to use a unique MLIRContext per chunk, as having
separate contexts is intended to allow for easy unloading of
unused data and all chunks have the same lifetime (tied to the
input file). This commit uses one context for the entire file,
greatly reducing memory consumption in certain situations (up
to 70%).
Differential Revision: https://reviews.llvm.org/D107488
'mlir::DiagnosticEngine::DiagnosticEngine(const mlir::DiagnosticEngine&)' is implicitly deleted because the default definition would be ill-formed.
Reviewed By: rdzhabarov
Differential Revision: https://reviews.llvm.org/D107287
Making sure the AMX dialect webpage reads better with a short introduction on the purpose of this dialect.
Reviewed By: grosul1, bondhugula
Differential Revision: https://reviews.llvm.org/D107419
There are many downstream users of llvm::dbgs, which is defined in Debug.h. Before D106342, many users included that dependency transitively via the now deleted ViewRegionGraph.h. Adding it back to Transforms/Passes.h for convenience.
Differential Revision: https://reviews.llvm.org/D107451
* Add new pass option `print-data-flow-edges`, default value `true`.
* Add new pass option `print-control-flow-edges`, default value `false`.
* Remove `PrintCFGPass`. Same functionality now provided by
`PrintOpPass`.
Differential Revision: https://reviews.llvm.org/D106342
The existing vector transforms reduce the dimension of transfer_read
ops. However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector. This handles this case.
Note that in the longer term, it may be preferaby to support
zero-dimension vectors. see
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.
Differential Revision: https://reviews.llvm.org/D103432
This test ensures that an error is generated from the Python side when running a module pass on a function. The test used to instantiate ViewOpGraph, however, this pass was changed into a general "any op" pass in D106253. Therefore, a different pass must be used in this test.
Differential Revision: https://reviews.llvm.org/D107424
* New pass option `max-label-len`: Truncate attributes/result types that have more #chars.
* New pass option `print-attrs`: Activate/deactivate rendering of attributes.
* New pass option `printResultTypes`: Activate/deactivate rendering of result types.
Differential Revision: https://reviews.llvm.org/D106337
* Visualize blocks and regions as subgraphs.
* Generate DOT file directly instead of using `GraphTraits`. `GraphTraits` does not support subgraphs.
Differential Revision: https://reviews.llvm.org/D106253
Also makes style consistent with the "surrounding"
text that appears on one webpage in MLIR doc
Reviewed By: grosul1
Differential Revision: https://reviews.llvm.org/D107418
We can propagate the shape from tosa.cond_if operands into the true/false
regions then through the connected blocks. Then, using the tosa.yield ops
we can determine what all possible return types are.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105940
Handles shape inference for identity, cast, and rescale. These were missed
during the initialy elementwise work. This includes resize shape propagation
which includes both attribute and input type based propagation.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105845
This prevents an explosion of threads, given that each file gets its own context and thus its own thread pool. We don't really need a thread pool for the LSP contexts anyways, so it's better to just disable threading.
mlir/test/transforms/loop-fusion.mlir is too big and is split into mlir/test/transforms/loop-fusion.mlir, mlir/test/transforms/loop-fusion-2.mlir, mlir/test/transforms/loop-fusion-3.mlir
and mlir/test/transforms/loop-fusion-4.mlir. Further tests can be added in mlir/test/transforms/loop-fusion-4.mlir
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D106473
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in 2.17.1 of the OpenMP 5 standard.
A name and hint can be specified. The name is a global entity or has
external linkage, it is modelled as a FlatSymbolRefAttr. Hint is
modelled as an integer enum attribute.
Also lowering to LLVM IR using the OpenMP IRBuilder.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107135
Store both interfaceID and objectID as key for interface registration callback.
Otherwise the implementation allows to register only one external model per one object in the single dialect.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107274
Silence clang-tidy warning in AffineOps.cpp due to the inability to see
through the typeswitch. NFC.
Differential Revision: https://reviews.llvm.org/D106125
There are cases in which it is not desirable to fully compose the bound map with the operands when adding lower/upper bounds to a `FlatAffineConstraints`.
E.g., this is the case when bounds should be expressed in terms of the operands only (and not the operands' dependencies). This also makes `addLowerOrUpperBound` useable together with operands that are defined through semi-affine expressions.
Differential Revision: https://reviews.llvm.org/D107221
Bounds such as `dim_{pos} <= c_1 * dim_x + ...` where `x == pos` are invalid. `addLowerOrUpperBound` previously added an incorrect inequality to the set. Such cases are now explicitly rejected.
Differential Revision: https://reviews.llvm.org/D107220
Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration.
This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only the in the last iteration (scf.if), slower masked loads/stores are used.
Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling.
Differential Revision: https://reviews.llvm.org/D105804
There was a slightly mismatch between the double COO and actual numerical
type in the final sparse tensor storage (due to external formats always
using double). This minor revision removes that inconsistency by using a
properly typed COO and casting during the "add" method instead. This also
prepares alternative ways of initializing the COO object.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D107310
We had a [bad bug](69655864ee) over in CIRCT
caused by accidentally passing around PatternRewriter
by value. There is no reason to support copy/assignment
of the pattern rewriter, so disable it.
Differential Revision: https://reviews.llvm.org/D107232
This patch fixes a bug in the existing implementation of detectAsFloorDiv,
where floordivs with numerator with non-zero constant term and floordivs with
numerator only consisting of a constant term were not being detected.
Reviewed By: vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D107214
The `DataLayout` class currently contains the member `layoutStack` which is hidden behind a preprocessor region dependant on the NDEBUG macro. Code wise this makes a lot of sense, as the `layoutStack` is used for extra assertions that users will want when compiling a debug build.
It however has the uncomfortable consequence of leading to a different ABI in Debug and Release builds. This I think is a bit annoying for downstream projects and others as they may want to build against a stable Release of MLIR in Release mode, but be able to debug their own project depending on MLIR.
This patch changes the related uses of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS. As the macro is computed at configure time of LLVM, it may not change based on compiler settings of a downstream projects like NDEBUG would.
Differential Revision: https://reviews.llvm.org/D107227
Introduces a conversion from one (sparse) tensor type to another
(sparse) tensor type. See the operation doc for details. Actual
codegen for all cases is still TBD.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107205
NFC. Clean up stale doc comments on memref replacement utility and some
variable renaming in it to avoid confusion.
Differential Revision: https://reviews.llvm.org/D107144
This allows for reusing the same output channel when the extension reloads after updating the server. Currently, whenever the extension restarts a new output channel is created (which can lead to a large number of seemingly dead output channels).
Quite a few things were out-of-date, or just not
organized well. This revision updates the extension
name, repo, icon, and many other components in
preperation for publishing the extension to the
marketplace.
If the source value to load is bool, and we have native storage
capability support for the source bitwidth, we still cannot directly
rewrite uses; we need to perform casting to bool first.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D107119
If the source value to store is bool, and we have native storage
capability support for the target bitwidth, we still cannot directly
store; we need to perform casting to match the target memref
element's bitwidth.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D107114
Rationale:
External file formats always store the values as doubles, so this was
hard coded in the memory resident COO scheme used to pass data into the
final sparse storage scheme during setup. However, with alternative methods
on the horizon of setting up these temporary COO schemes, it is time to
properly template this data structure.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D107001
The effect name is used by tablegen when generating the getEffects method of the SideEffectInterfaces. It is currently unqualified even though the class is contained within the mlir namespace, leading to compiler errors when using namespace mlir; isn't used before including the generated cpp file.
This patch fixes that by simply fully qualifying the class name.
Differential Revision: https://reviews.llvm.org/D107171
The presence of AffineIfOp inside AffineFor prevents fusion of the other loops to happen. For example:
```
affine.for %i0 = 0 to 10 {
affine.store %cf7, %a[%i0] : memref<10xf32>
}
affine.for %i1 = 0 to 10 {
%v0 = affine.load %a[%i1] : memref<10xf32>
affine.store %v0, %b[%i1] : memref<10xf32>
}
affine.for %i2 = 0 to 10 {
affine.if #set(%i2) {
%v0 = affine.load %b[%i2] : memref<10xf32>
}
}
```
The first two loops were not be fused because of `affine.if` inside the last `affine.for`.
The issue seems to come from a conservative constraint that does not allow fusion if there are ops whose number of regions != 0 (affine.if is one of them).
This patch just removes such a constraint when`affine.if` is inside `affine.for`. The existing `canFuseLoops` method is able to handle `affine.if` correctly.
Reviewed By: bondhugula, vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D105963
When we vectorize a scalar constant, the vector constant is inserted before its
first user if the scalar constant is defined outside the loops to be vectorized.
It is possible that the vector constant does not dominate all its users. To fix
the problem, we find the innermost vectorized loop that encloses that first user
and insert the vector constant at the top of the loop body.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106609
Make broadcastable needs the output shape to determine whether the operation
includes additional broadcasting. Include some canonicalizations for TOSA
to remove unneeded reshape.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D106846
`PadTensorOp` has verification logic to make sure
result dim must be static if all the padding values are static.
Cast folding might add more static information for the src operand
of `PadTensorOp` which might change a valid operation to be invalid.
Change the canonicalizing pattern to fix this.
* Adds source targets (not included in the full set that downstreams use by default) to bundle mlir-c/ headers into the mlir/_mlir_libs/include directory.
* Adds a minimal entry point to get include and library directories.
* Used by npcomp to export a full CAPI (which is then used by the Torch extension to link npcomp).
Reviewed By: mikeurbach
Differential Revision: https://reviews.llvm.org/D107090
Currently TFRT does not support top-level coroutines, so this functionality will allow to have a single blocking await at the top level until TFRT implements the necessary functionality.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D106730
Change the formatting of the debug print outs to elide unnecessary information.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106661
This is redundant with the callback variant and untested. Also remove
the callback-less methods for adding a dynamically legal op, as they
are no longer useful.
Differential Revision: https://reviews.llvm.org/D106786
This allows to use OperationEquivalence to track structural comparison for equality
between two operations.
Differential Revision: https://reviews.llvm.org/D106422
* For python projects that don't need JIT/ExecutionEngine, cuts the number of files to compile roughly in half (with similar reduction in end binary size).
Differential Revision: https://reviews.llvm.org/D106992
By making an explicit template specialization for the TypeID provided by these classes,
the compiler will not emit an inline weak definition and rely on the linker to unique it.
Instead a single definition will be emitted in the C++ file alongside the implementation
for these classes. That will turn into a linker error what is now a hard-to-debug runtime
behavior where instances of the same class may be using a different TypeID inside of
different DSOs.
Recommit 660a56956c after fixing gcc5
build.
Differential Revision: https://reviews.llvm.org/D105903
Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though)
Differential Revision: https://reviews.llvm.org/D105149
Interop parallelism requires needs awaiting on results. Blocking awaits are bad for performance. TFRT supports lightweight resumption on threads, and coroutines are an abstraction than can be used to lower the kernels onto TFRT threads.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D106508
By making an explicit template specialization for the TypeID provided by these classes,
the compiler will not emit an inline weak definition and rely on the linker to unique it.
Instead a single definition will be emitted in the C++ file alongside the implementation
for these classes. That will turn into a linker error what is now a hard-to-debug runtime
behavior where instances of the same class may be using a different TypeID inside of
different DSOs.
Differential Revision: https://reviews.llvm.org/D105903
In translation from MLIR to another IR, run the MLIR verifier on the parsed
module to ensure only valid modules are given to the translation. Previously,
we would send any module that could be parsed to the translation, including
semantically invalid modules, leading to surprising errors or lack thereof down
the pipeline.
Depends On D106937
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106938
The verifier of the llvm.call operation was not checking for mismatches between
the number of operation results and the number of results in the signature of
the callee. Furthermore, it was possible to construct an llvm.call operation
producing an SSA value of !llvm.void type, which should not exist. Add the
verification and treat !llvm.void result type as absence of call results.
Update the GPU conversions to LLVM that were mistakenly assuming that it was
fine for llvm.call to produce values of !llvm.void type and ensure these calls
do not produce results.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106937
- Fixed symbol insertion into `symNameToModuleMap`. Insertion
needs to happen whether symbols are renamed or not.
- Added check for the VCE triple and avoid dropping it.
- Disabled function deduplication. It requires more careful
rules. Right now it can remove different functions.
- Added tests for symbol rename listener.
- And some other code/comment cleanups.
Reviewed By: ergawy
Differential Revision: https://reviews.llvm.org/D106886
Specialize the DeduplicateInputs and RemoveIdentityLinalgOps patterns for GenericOp instead of implementing them for the LinalgOp interface.
This revsion is based on https://reviews.llvm.org/D105622 that moves the logic to erase identity CopyOps in a separate pattern.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105291
Split out an EraseIdentityCopyOp from the existing RemoveIdentityLinalgOps pattern. Introduce an additional check to ensure the pattern checks the permutation maps match. This is a preparation step to specialize RemoveIdentityLinalgOps to GenericOp only.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105622
`memref.collapse_shape` has verification logic to make sure
result dim must be static if all the collapsing src dims are static.
Cast folding might add more static information for the src operand
of `memref.collapse_shape` which might change a valid collapsing
operation to be invalid. Add `CollapseShapeOpMemRefCastFolder` pattern
to fix this.
Minor changes to `convertReassociationIndicesToExprs` to use `context`
instead of `builder` to avoid extra steps to construct temporary
builders.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D106670
By making an explicit template specialization for the TypeID provided by these classes,
the compiler will not emit an inline weak definition and rely on the linker to unique it.
Instead a single definition will be emitted in the C++ file alongside the implementation
for these classes. That will turn into a linker error what is now a hard-to-debug runtime
behavior where instances of the same class may be using a different TypeID inside of
different DSOs.
Differential Revision: https://reviews.llvm.org/D105903
RewriteEndOp was a fake terminator operation that is no longer needed now that blocks are not required to have terminators.
Differential Revision: https://reviews.llvm.org/D106911
* Implements all of the discussed features:
- Links against common CAPI libraries that are self contained.
- Stops using the 'python/' directory at the root for everything, opening the namespace up for multiple projects to embed the MLIR python API.
- Separates declaration of sources (py and C++) needed to build the extension from building, allowing external projects to build custom assemblies from core parts of the API.
- Makes the core python API relocatable (i.e. it could be embedded as something like 'npcomp.ir', 'npcomp.dialects', etc). Still a bit more to do to make it truly isolated but the main structural reset is done.
- When building statically, installed python packages are completely self contained, suitable for direct setup and upload to PyPi, et al.
- Lets external projects assemble their own CAPI common runtime library that all extensions use. No more possibilities for TypeID issues.
- Begins modularizing the API so that external projects that just include a piece pay only for what they use.
* I also rolled in a re-organization of the native libraries that matches how I was packaging these out of tree and is a better layering (i.e. all libraries go into a nested _mlir_libs package). There is some further cleanup that I resisted since it would have required source changes that I'd rather do in a followup once everything stabilizes.
* Note that I made a somewhat odd choice in choosing to recompile all extensions for each project they are included into (as opposed to compiling once and just linking). While not leveraged yet, this will let us set definitions controlling the namespacing of the extensions so that they can be made to not conflict across projects (with preprocessor definitions).
* This will be a relatively substantial breaking change for downstreams. I will handle the npcomp migration and will coordinate with the circt folks before landing. We should stage this and make sure it isn't causing problems before landing.
* Fixed a couple of absolute imports that were causing issues.
Differential Revision: https://reviews.llvm.org/D106520
The order of testing in two sparse tensor ops was incorrect,
which could cause an invalid cast (crashing the compiler instead
of reporting the error). This revision fixes that bug.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106841
This aligns the structure of the Affine dialect on all the other dialects.
In particular this makes the ODS C++ generated code independent of the
enclosing namespace.
Retaining old interface and should be constructable as previous, change would have been NFC except it this doesn't implicitly work with OpAdaptor generated in C++14.
Differential Revision: https://reviews.llvm.org/D106772
Tosa shape verification prevent shape propagation when coming from a dialect
of known shape. Relax this constraint to allow ingestion / shape propagation
from these other dialects.
Differential Revision: https://reviews.llvm.org/D106610
Add 'enconding' attribute visitor.
Without it ASM printer doesn't use attribute aliases for 'enconding'.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D105554
This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be
passed to successor regions. Similar to the BranchOpInterface, it allows to freely define the
involved operands. However, in contrast to the BranchOpInterface, it expects an additional region
number to distinguish between various use cases which might require different operands passed to
different regions.
Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and
getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equiped with the
ReturnLike trait and/or implementing the newly added interface. This simplifies reasoning about
terminators in the scope of the nested regions.
We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities.
Differential Revision: https://reviews.llvm.org/D105018
* Get rid of Optional<std::function> as std::function already have a null state
* Add private setLegalityCallback function to set legality callback for unknown ops
* Get rid of unknownOpsDynamicallyLegal flag, use unknownLegalityFn state insted. This causes behavior change when user first calls markUnknownOpDynamicallyLegal with callback and then without but I am not sure is the original behavior was really a 'feature', or just oversignt in the original implementation.
Differential Revision: https://reviews.llvm.org/D105496
- Rename isLastUse to isDeadAfter to reflect what the function does.
- Avoid a second walk over all operations in BlockInfoBuilder constructor.
- use std::move() to save the new in set.
Differential Revision: https://reviews.llvm.org/D106702
When the output indexing map has a permutation we need to consider in
the contraction vector type.
Differential Revision: https://reviews.llvm.org/D106469
When the source tensor of a tensor.insert_slice is not equivalent to an inplace buffer an extra copy is necessary. This revision adds the missing copy.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D106587
If this pass executes without shape inference its possible for unranked tensors
to appear in the IR. This pass should gracefully handle unranked tensors.
Differential Revision: https://reviews.llvm.org/D106617
Includes a version of a quantized conv2D operations with a lowering from TOSA
to linalg with corresponding test. We keep the quantized and quantized variants
as separate named ops to avoid the additional operations for non-quantized
convolutions.
Differential Revision: https://reviews.llvm.org/D106407
- Change findDealloc() to return Optional<Operation *> and return None if > 1
dealloc is associated with the given alloc.
- Add findDeallocs() to return all deallocs associated with the given alloc.
- Fix current uses of findDealloc() to bail out if > 1 dealloc is found.
Differential Revision: https://reviews.llvm.org/D106456
More specifically:
1) Use variable after move.
2) steady_clock needs to be used for measuring time intervals, but not the system_clock.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106513
Fix affine.for empty loop body folder in the presence of yield values.
The existing pattern ignored iter_args/yield values and thus crashed
when yield values had uses.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106121
Restore 499571ea83
reverted by 0082764605.
A compiler slightly older than
"[clang][Sema] removes -Wfree-nonheap-object reference param false positive"
may report the false positive.
We need to retain the workaround a bit longer so that such compilers
can be used to compile MLIR in a warning-free way.
Type conversion and argument materialization are context-free: there is no available information on which op / branch is currently being converted.
As a consequence, bare ptr convention cannot be handled as an argument materialization: it would apply irrespectively of the parent op.
This doesn't typecheck in the case of non-funcOp and we would see cases where a memref descriptor would be inserted in place of the pointer in another memref descriptor.
For now the proper behavior is to revert to a specific BarePtrFunc implementation and drop the blanket argument materialization logic.
This reverts the relevant piece of the conversion to LLVM to what it was before https://reviews.llvm.org/D105880 and adds a relevant test and documentation to avoid the mistake by whomever attempts this again in the future.
Reviewed By: arpith-jacob
Differential Revision: https://reviews.llvm.org/D106495
Range type that allows for wrapping different value & shape ranges with
correspondence to Shape's ValueShape type - initially aliased to
ValueRange (which corresponds to the trivial mapping from a ShapedType's
Value's shape to shape). Just plain alias, before expanding.
Differential Revision: https://reviews.llvm.org/D99133
AffineForOp's folding hook is expected to fold away trivially empty
affine.for. This allows simplification to happen as part of the
canonicalizer and from wherever the folding hook is used. While more
complex analysis based zero trip count detection is available from other
passes in analysis and transforms, simple and inexpensive folding had
been missing.
Also, update/improve affine.for op documentation clarifying semantics of
the result values for zero trip count loops.
Differential Revision: https://reviews.llvm.org/D106123
Introduce a new rewrite driver (MultiOpPatternRewriteDriver) to rewrite
a supplied list of ops and other ops. Provide a knob to restrict
rewrites strictly to those ops or also to affected ops (but still not to
completely related ops).
This rewrite driver is commonly needed to run any simplification and
cleanup at the end of a transforms pass or transforms utility in a way
that only simplifies relevant IR. This makes it easy to write test cases
while not performing unrelated whole IR simplification that may
invalidate other state at the caller.
The introduced utility provides more freedom to developers of transforms
and transform utilities to perform focussed and local simplification. In
several cases, it provides greater efficiency as well as more
simplification when compared to repeatedly calling
`applyOpPatternsAndFold`; in other cases, it avoids the need to
undesirably call `applyPatternsAndFoldGreedily` to do unrelated
simplification in a FuncOp.
Update a few transformations that were earlier using
applyOpPatternsAndFold (SimplifyAffineStructures,
affineDataCopyGenerate, a linalg transform).
TODO:
- OpPatternRewriteDriver can be removed as it's a special case of
MultiOpPatternRewriteDriver, i.e., both can be merged.
Differential Revision: https://reviews.llvm.org/D106232
This patch allows iterating typed enum via the ADT/Sequence utility.
It also changes the original design to better separate concerns:
- `StrongInt` only deals with safe `intmax_t` operations,
- `SafeIntIterator` presents the iterator and reverse iterator
interface but only deals with safe `StrongInt` internally.
- `iota_range` only deals with `SafeIntIterator` internally.
This design ensures that operations are always valid. In particular,
"Out of bounds" assertions fire when:
- the `value_type` is not representable as an `intmax_t`
- iterator operations make internal computation underflow/overflow
- the internal representation cannot be converted back to `value_type`
Differential Revision: https://reviews.llvm.org/D106279
We are able to bind NativeCodeCall result as binding operation. To make
table-gen have better understanding in the form of helper function,
we need to specify the number of return values in the NativeCodeCall
template. A VoidNativeCodeCall is added for void case.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D102160
libMLIRPublicAPI.so came into existence early when the Python and C-API were being co-developed because the Python extensions need a single DSO which exports the C-API to link against. It really should never have been exported as a mondo library in the first place, which has caused no end of problems in different linking modes, etc (i.e. the CAPI tests depended on it).
This patch does a mechanical move that:
* Makes the C-API tests link directly to their respective libraries.
* Creates a libMLIRPythonCAPI as part of the Python bindings which assemble to exact DSO that they need.
This has the effect that the C-API is no longer monolithic and can be subset and used piecemeal in a modular fashion, which is necessary for downstreams to only pay for what they use. There are additional, more fundamental changes planned for how the Python API is assembled which should make it more out of tree friendly, but this minimal first step is necessary to break the fragile dependency between the C-API and Python API.
Downstream actions required:
* If using the C-API and linking against MLIRPublicAPI, you must instead link against its constituent components. As a reference, the Python API dependencies are in lib/Bindings/Python/CMakeLists.txt and approximate the full set of dependencies available.
* If you have a Python API project that was previously linking against MLIRPublicAPI (i.e. to add its own C-API DSO), you will want to `s/MLIRPublicAPI/MLIRPythonCAPI/` and all should be as it was. There are larger changes coming in this area but this part is incremental.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106369
This allows caller to use non-const functions, e.g., `getOperandNumber`, etc. It
is expected that OpOperand is not modified in a callback function.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D106322
The unstrided transposed conv can be represented as a regular convolution.
Lower to this variant to handle the basic case. This includes transitioning from
the TC defined convolution operation and a yaml defined one.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D106389
Added the named op variants for quantized matmul and quantized batch matmul
with the necessary lowerings/tests from tosa's matmul/fully connected ops.
Current version does not use the contraction op interface as its verifiers
are not compatible with scalar operations.
Differential Revision: https://reviews.llvm.org/D105063
Allows for grouping OpTraits with list of OpTrait to make it easier to group OpTraits together without needing to use list concats (e.g., enable using `[Traits, ..., UsefulGroupOfTraits, Others, ...]` instead of `[Traits, ...] # UsefulGroupOfTraits # [Others, ...]`). Flatten in construction of Operation. This recurses here as the expectation is that these aren't expected to be deeply nested (most likely only 1 level of nesting).
Differential Revision: https://reviews.llvm.org/D106223
- Change walkReturnOperations() to be a non-template and look at block terminator
for ReturnLike trait.
- Clarify description of validateSupportedControlFlow
- Eliminate unused argument in Backedges::recurse.
- Eliminate repeated calls to getFunction()
- Fix wording for non-SCF loop failure
Differential Revision: https://reviews.llvm.org/D106373
Bufferization handles all unknown ops conservative. The patch ensures accessing the dimension of an output tensor does not prevent in place bufferization.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106356
For example, we will generate incorrect code for the pattern,
def : Pat<((FooOp (FooOp, $a, $b), $b)), (...)>;
We didn't allow $b to be bond twice with same operand of same op.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105677
The `reifyReturnTypeShapesPerResultDim` method supports shape
inference for rsults that are ranked types. These are used lower in
the codegeneration stack than its counter part `reifyReturnTypeShapes`
which also supports unranked types, and is more suited for use higher
up the compilation stack. To have separation of concerns, this method
is split into its own interface.
See discussion : https://llvm.discourse.group/t/better-layering-for-infershapedtypeopinterface/3823
Differential Revision: https://reviews.llvm.org/D106133
This is the first step to support software pipeline for scf.for loops.
This is only the transformation to create pipelined kernel and
prologue/epilogue.
The scheduling needs to be given by user as many different algorithm
and heuristic could be applied.
This currently doesn't handle loop arguments, this will be added in a
follow up patch.
Differential Revision: https://reviews.llvm.org/D105868
This makes it more explicit what the scope of this pass is. The name
of this pass predates fusion on tensors using tile + fuse, and hence
the confusion.
Differential Revision: https://reviews.llvm.org/D106132
Added shape inference handles cases for convolution operations. This includes
conv2d, conv3d, depthwise_conv2d, and transpose_conv2d. With transpose conv
we use the specified output shape when possible however will shape propagate
if the output shape attribute has dynamic values.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105645
This deletes all the pooling ops in LinalgNamedStructuredOpsSpec.tc. All the
uses are replaced with the yaml pooling ops.
Reviewed By: gysit, rsuderman
Differential Revision: https://reviews.llvm.org/D106181
Add pattern to fold a TensorCast into a PadTensorOp if the cast removes static size information.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106278
Insert ops replacing pad_tensor in front of the associated tansfer_write / insert_slice op. Otherwise we may end up with invalid ir if one of the remaining tansfer_write / insert_slice operands is defined after the pad_tensor op.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106162
Remove uses of to-be-deprecated API. In cases where the correct
element type was not immediately obvious to me, fall back to
explicit getPointerElementType().
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.
* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)
Differential Revision: https://reviews.llvm.org/D106118
Removed inconsistent name prefixes, added consistency checks
on debug strings, added more assertions to verify assumptions
that may be lifted in the future.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106108
The dialect-specific cast between builtin (ex-standard) types and LLVM
dialect types was introduced long time before built-in support for
unrealized_conversion_cast. It has a similar purpose, but is restricted
to compatible builtin and LLVM dialect types, which may hamper
progressive lowering and composition with types from other dialects.
Replace llvm.mlir.cast with unrealized_conversion_cast, and drop the
operation that became unnecessary.
Also make unrealized_conversion_cast legal by default in
LLVMConversionTarget as the majority of convesions using it are partial
conversions that actually want the casts to persist in the IR. The
standard-to-llvm conversion, which is still expected to run last, cleans
up the remaining casts standard-to-llvm conversion, which is still
expected to run last, cleans up the remaining casts
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105880
This may be necessary in partial multi-stage conversion when a container type
from dialect A containing types from dialect B goes through the conversion
where only dialect A is converted to the LLVM dialect. We will need to keep a
pointer-to-non-LLVM type in the IR until a further conversion can convert
dialect B types to LLVM types.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D106076
In cases where an operation has an argument or result named 'property', the
ODS-generated python fails on import because the `@property` resolves to the
`property` operation argument instead of the builtin `@property` decorator. We
should always use the fully qualified decorator name.
Reviewed By: mikeurbach
Differential Revision: https://reviews.llvm.org/D106106
Changes include the following:
1. Single iteration reduction loops being sibling fused at innermost insertion level
are skipped from being considered as sequential loops.
Otherwise, the slice bounds of these loops is reset.
2. Promote loops that are skipped in previous step into outer loops.
3. Two utility function - buildSliceTripCountMap, getSliceIterationCount - are moved from
mlir/lib/Transforms/Utils/LoopFusionUtils.cpp to mlir/lib/Analysis/Utils.cpp
Reviewed By: bondhugula, vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D104249
This format was missing from the support library. Although there are some
subtleties reading in an external format for int64 as double, there is no
good reason to omit support for this data type form the support library.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106016
Arbitrary shifts have some complications, but shift by invariants
(viz. tensor index exp only at left hand side) can be easily
handled with the conjunctive rule.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106002
With the migration from linalg.copy to memref.copy, this pass
(which was there solely to handle the linalg.copy op) is no
longer required for the end-to-end path for sparse compilation.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D106073
Instead of relying on adhoc bounds calculations, use a projection-based
implementation. This simplifies the implementation and finds more static
constant sizes than previously/
Differential Revision: https://reviews.llvm.org/D106054
While replacing linalg.copy with the more desired memref.copy
I found a bug in the support library for rank 0 memref copying.
The code would loop for something like the following, since there
is code for no-rank and rank > 0, but rank == 0 was unexpected.
memref.copy %0, %1: memref<f32> to memref<f32>
Note that a "regression test" for this will follow using the
sparse compiler migration to memref.copy which exercises this
case many times.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D106036
Adds `owner` python call to `mlir.ir.Value`.
Assuming that `PyValue.parentOperation` is intended to be the value's owner, this fixes the construction of it from `PyOpOperandList`.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D103853
Tiling can be enabled with `linalg-tile-pad-tensor-ops`. Only scf::ForOp can be generated at the moment.
Differential Revision: https://reviews.llvm.org/D105460
Factor out the functionality into a new function, so that it can be used for creating PadTensorOp tiles.
Differential Revision: https://reviews.llvm.org/D105458
The documentation on these was out of sync with the implementation. Also
the declaration of inputs was repeated when it is already part of the
ArithmeticCastOp definition.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D105934
Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove "zero" node, and pushed
the irregular logic for negi (not support in std) into one place.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105928
Clean up corner cases related to elemental tensor / buffer type return values that would previously fail.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D105857
Previously, linalg bufferization always had to be conservative at function boundaries and assume the most dynamic strided memref layout.
This revision introduce the mechanism to specify a linalg.buffer_layout function argument attribute that carries an affine map used to set a less pessimistic layout.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D105859
Integral AND and OR follow the simple conjunction and disjuction rules
for lattice building. This revision also completes some of the Merge
refactoring by moving the remainder parts that are merger specific from
sparsification into utils files.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105851
Pool operations perform the same shape propagation. Included the shape
propagation and tests for these avg_pool2d and max_pool2d.
Differential Revision: https://reviews.llvm.org/D105665
Right now, we only accept x/c with nonzero c, since this
conceptually can be treated as a x*(1/c) conjunction for both
FP and INT as far as lattice computations go. The codegen
keeps the division though to preserve precise semantics.
See discussion:
https://llvm.discourse.group/t/sparse-tensors-in-mlir/3389/28
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105731
This is a fix of https://reviews.llvm.org/D104956, which broke the gcc5 build.
We opt to use unit tests rather than check tests as the lattice/merger code is a small C++ component with a well-defined API. Testing this API via check tests would be far less direct and readable. In addition, as the check tests will only be able to test the API indirectly, the tests may break based on unrelated changes; e.g. changes in linalg.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D105828
This is now the same as isIntAttrKind(), so use that instead, as
it does not require manual maintenance. The naming is also more
accurate in that both int and type attributes have an argument,
but this method was only targeting int attributes.
I initially wanted to tighten the AttrBuilder assertion, but we
have some in-tree uses that would violate it.
Annotate LinalgNamedStructuredOps.yaml with a comment stating the file is auto-generated and should not be edited manually.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105809
Previously, comprehensive bufferization of scf.yield did not have enough information
to detect whether an enclosing scf::for bbargs would bufferize to a buffer equivalent
to that of the matching scf::yield operand.
As a consequence a separate sanity check step would be required to determine whether
bufferization occured properly.
This late check would miss the case of calling a function in an loop.
Instead, we now pass and update aliasInfo during bufferization and it is possible to
imrpove bufferization of scf::yield and drop that post-pass check.
Add an example use case that was failing previously.
This slightly modifies the error conditions, which are also updated as part of this
revision.
Differential Revision: https://reviews.llvm.org/D105803
After the Math has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the Math
dialect operations to LLVM into a separate library and a separate
conversion pass.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D105702
The trait was inconsistent with the other broadcasting logic here. And
also fix printing here to use ? rather than -1 in the error.
Differential Revision: https://reviews.llvm.org/D105748
This enables checking the printing flags when formatting names
in SSANameState.
Depends On D105299
Reviewed By: mehdi_amini, bondhugula
Differential Revision: https://reviews.llvm.org/D105300
`interfaces` is passed through to the `numberValuesIn*` functions with exactly
the same value as when SSANameState is constructed. This just seems cleaner.
Also, a dependent PR adds `printerFlags` which follows similar code paths.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D105299
Use a modeling similar to SCF ParallelOp to support arbitrary parallel
reductions. The two main differences are: (1) reductions are named and declared
beforehand similarly to functions using a special op that provides the neutral
element, the reduction code and optionally the atomic reduction code; (2)
reductions go through memory instead because this is closer to the OpenMP
semantics.
See https://llvm.discourse.group/t/rfc-openmp-reduction-support/3367.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D105358
LLVM IR allows globals with external linkage to have initializers, including
undef. The translation was incorrectly using undef as a indicator that the
initializer should be ignored in translation, leading to the impossibility to
create an external global with an explicit undef initializer. Fix this and use
nullptr as a marker instead.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D105631
After the MemRef has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.
Reviewed By: herhut, silvas
Differential Revision: https://reviews.llvm.org/D105625
We opt to use unit tests rather than check tests as the lattice/merger code is a small C++ component with a well-defined API. Testing this API via check tests would be far less direct and readable. In addition, as the check tests will only be able to test the API indirectly, the tests may break based on unrelated changes; e.g. changes in linalg.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D104956
`GeneralizePadTensorOpPattern` might generate `tensor.dim` op so the
TensorDialect should be marked legal. This pattern should also
transform all `linalg.pad_tensor` ops so mark those as illegal. Those
changes are missed from a previous change in
https://reviews.llvm.org/D105293
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D105642
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
For the getters, it is bad practice to keep the reference
around for too long, as explained in the new comment
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105599
This patch adds full qualification to the types and function calls used inside of (Static)InterfaceMethod in OpInterfaces. Without this patch using many of these interfaces in a downstream project yields compiler errors as the types and default implementations are mostly copied verbatim. Without then putting using namespace mlir; in the header file of the implementations of those interfaces, compilation is impossible.
Using fully qualified lookup fixes this issue.
Differential Revision: https://reviews.llvm.org/D105619
This reverts commit 6c0fd4db79.
This simple implementation is unfortunately not extensible and needs to be reverted.
The extensible way should be to extend https://reviews.llvm.org/D104321.
Introduce the exp and log function in OpDSL. Add the soft plus operator to test the emitted IR in Python and C++.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105420
Remove the GenericOpBase class formerly used to factor out common logic shared be GenericOp and IndexedGenericOp. After removing IndexedGenericOp, the base class is not used anymore.
Differential Revision: https://reviews.llvm.org/D105307
This class and classes that extend it are general utilities for any dialect
that is being converted into the LLVM dialect. They are in no way specific to
Standard-to-LLVM conversion and should not make their users depend on it.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105542
Verify the number of results matches exactly the number of output tensors. Simplify the FillOp verification since part of it got redundant.
Differential Revision: https://reviews.llvm.org/D105427
Simplify vector unrolling pattern to be more aligned with rest of the
patterns and be closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the ops based on Tuple don't
have any more used so they can be removed.
This allows removing signifcant amount of dead code and will allow
extending the unrolling code going forward.
Differential Revision: https://reviews.llvm.org/D105381
Add the rewrite of PadTensorOp to InitTensor + InsertSlice before the
bufferization analysis starts.
This is exercised via a more advanced integration test.
Since the new behavior triggers folding, 2 tests need to be updated.
One of those seems to exhibit a folding issue with `switch` and is modified.
Differential Revision: https://reviews.llvm.org/D105549
Refactor the original code to rewrite a PadTensorOp into a
sequence of InitTensorOp, FillOp and InsertSliceOp without
vectorization by default. `GenericPadTensorOpVectorizationPattern`
provides a customized OptimizeCopyFn to vectorize the
copying step.
Reviewed By: silvas, nicolasvasilache, springerm
Differential Revision: https://reviews.llvm.org/D105293
The `bufferizesToMemoryRead` condition was too optimistics in the case
of operands that map to a block argument.
This is the case for ForOp and TiledLoopOp.
For such ops, forward the call to all uses of the matching BBArg.
Differential Revision: https://reviews.llvm.org/D105540
"Standard-to-LLVM" conversion is one of the oldest passes in existence. It has
become quite large due to the size of the Standard dialect itself, which is
being split into multiple smaller dialects. Furthermore, several conversion
features are useful for any dialect that is being converted to the LLVM
dialect, which, without this refactoring, creates a dependency from those
conversions to the "standard-to-llvm" one.
Put several of the reusable utilities from this conversion to a separate
library, namely:
- type converter from builtin to LLVM dialect types;
- utility for building and accessing values of LLVM structure type;
- utility for building and accessing values that represent memref in the LLVM
dialect;
- lowering options applicable everywhere.
Additionally, remove the type wrapping/unwrapping notion from the type
converter that is no longer relevant since LLVM types has been reimplemented as
first-class MLIR types.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D105534
When an affine.if operation is returning/yielding results and has a
trivially true or false condition, then its 'then' or 'else' block,
respectively, is promoted to the affine.if's parent block and then, the
affine.if operation is replaced by the correct results/yield values.
Relevant test cases are also added.
Signed-off-by: Srishti Srivastava <srishti.srivastava@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D105418
Split out GPU ops library from GPU transforms. This allows libraries to
depend on GPU Ops without needing/building its transforms.
Differential Revision: https://reviews.llvm.org/D105472