This revision adds support for passing a functor to SourceMgrDiagnosticHandler for filtering out FileLineColLocs when emitting a diagnostic. More specifically, this can be useful in situations where there may be large CallSiteLocs with locations that aren't necessarily important/useful for users.
For now, the filtering support is limited to FileLineColLocs, but conceptually we could allow filtering for all location types if a need arises in the future.
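A rough sketch of how the filter might be installed (the exact parameter position and the FileLineColLoc accessors used here are assumptions):

```c++
llvm::SourceMgr sourceMgr;
MLIRContext context;
// Only show locations we consider interesting; hide FileLineColLocs that
// point into a hypothetical "third_party" directory.
SourceMgrDiagnosticHandler handler(
    sourceMgr, &context, llvm::errs(),
    [](Location loc) {
      if (auto fileLoc = loc.dyn_cast<FileLineColLoc>())
        return !fileLoc.getFilename().strref().contains("third_party");
      return true;
    });
```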
Differential Revision: https://reviews.llvm.org/D103649
At present, a lot of code contains main function bodies like `return failed(mlir::MlirOptMain(...));`. This is unfortunate for two reasons: a) it uses ADL, which is maybe not what the free `failed` function was designed for; and b) it is a bit awkward to read, requiring the reader to understand both the boolean nature of the value and the semantics of main's return value. (It is also not portable, since 1 is not a portable success value.)
The replacement code, `return mlir::AsMainReturnCode(mlir::MlirOptMain(...))` is a bit more self-explanatory.
The change applies the new function to a few internal uses of MlirOptMain, too.
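For example, a typical opt-style driver would now look roughly like this (a minimal sketch; dialect registration is elided and the header location is assumed):

```c++
#include "mlir/IR/Dialect.h"
#include "mlir/Support/MlirOptMain.h"

int main(int argc, char **argv) {
  mlir::DialectRegistry registry;
  // registry.insert<...>();  // register the dialects the tool needs
  return mlir::AsMainReturnCode(
      mlir::MlirOptMain(argc, argv, "Example optimizer driver\n", registry));
}
```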
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102641
This revision migrates more code from Linalg into the new permanent home of
SparseTensor. It replaces the test passes with proper compiler passes.
NOTE: the actual removal of the last glue and clutter in Linalg will follow.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D101811
Right now, elementwise operation fusion in Linalg fuses everything it
can. This can run up against resource limits of the target hardware
without some checks. This patch adds a callback function that clients
can use to implement a cost function: when two elementwise operations
are deemed structurally fusable, the callback controls whether the
fusion applies.
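A hypothetical sketch of how a client might use the hook (the option and function names below are approximations, not the exact API):

```c++
// Only allow fusion when the producer's result has a single use; a real
// client could plug in an arbitrary cost model here.
linalg::LinalgElementwiseFusionOptions fusionOptions;
fusionOptions.setControlElementwiseOpsFusionFn(
    [](const OpResult &producer, const OpOperand &consumer) {
      return producer.hasOneUse();
    });
linalg::populateElementwiseOpsFusionPatterns(patterns, fusionOptions);
```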
Differential Revision: https://reviews.llvm.org/D99820
This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396.
I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though.
Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D98447
Data layout information makes it possible to answer questions about the size and alignment
properties of a type. It enables, among other things, the generation of various
linear memory addressing schemes for containers of abstract types and deeper
reasoning about vectors. This introduces the subsystem for modeling data
layouts in MLIR.
The data layout subsystem is designed to scale to MLIR's open type and
operation system. At the top level, it consists of attribute interfaces that
can be implemented by concrete data layout specifications; type interfaces that
should be implemented by types subject to data layout; operation interfaces
that must be implemented by operations that can serve as data layout scopes
(e.g., modules); and dialect interfaces for data layout properties unrelated to
specific types. Built-in types are handled specially to decrease the overall
query cost.
A concrete default implementation of these interfaces is provided in the new
Target dialect. Defaults for built-in types that match the current behavior are
also provided.
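A minimal sketch of the query side, assuming a `DataLayout` object is constructed relative to the closest enclosing scope op (e.g. a module); `module` and `type` are placeholders:

```c++
#include "mlir/Interfaces/DataLayoutInterfaces.h"

// Query size/alignment properties of `type` relative to the scope `module`.
mlir::DataLayout layout(module);
unsigned size = layout.getTypeSize(type);                    // size in bytes
unsigned abiAlign = layout.getTypeABIAlignment(type);        // ABI alignment
unsigned prefAlign = layout.getTypePreferredAlignment(type); // preferred alignment
```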
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97067
Clean-up after D98279, remove one call to createConvertGPUKernelToBlobPass().
Depends On D98203
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D98360
This patch is a follow-up on D97217. It adds a new 'Skip' result to the Operation visitor
so that a callback can stop the ongoing visit of an operation/block/region and
continue visiting the next one without fully interrupting the walk. Skipping is
needed to be able to erase an operation/block during a pre-order walk without
continuing to visit the internals of that operation/block.
Related to the skipping mechanism, the patch also introduces the following changes:
* Added new TestIRVisitors pass with basic testing for the IR visitors.
* Fixed missing early increment ranges in visitor implementation.
* Updated documentation of walk methods to include erasure information and walk
order information.
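A small sketch of the new result in use, with a pre-order walk so that skipping prevents the walk from descending into the op that was just erased (`root` and `shouldErase` are placeholders):

```c++
// Erase certain ops during a pre-order walk; returning skip() keeps the walk
// from visiting the internals of the erased op while continuing elsewhere.
root->walk<WalkOrder::PreOrder>([](Operation *op) {
  if (shouldErase(op)) {   // `shouldErase` is a hypothetical predicate
    op->erase();
    return WalkResult::skip();
  }
  return WalkResult::advance();
});
```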
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97820
This revision adds a new `AliasAnalysis` class that represents the main alias analysis interface in MLIR. The purpose of this class is not to hold the aliasing logic itself, but to provide an interface into various different alias analysis implementations. As it evolves this should allow for users to plug in specialized alias analysis implementations for their own needs, and have them immediately usable by other analyses and transformations.
This revision also adds an initial simple generic alias analysis, LocalAliasAnalysis, that provides support for performing stateless local alias queries between values. This class is similar in scope to LLVM's BasicAA.
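A minimal sketch of a query from within a pass (`valueA` and `valueB` are placeholders for the values being compared):

```c++
// Obtain the alias analysis for the current operation and query two values.
AliasAnalysis &aliasAnalysis = getAnalysis<AliasAnalysis>();
AliasResult result = aliasAnalysis.alias(valueA, valueB);
if (result.isNo()) {
  // The two values are known not to alias; accesses may be reordered.
}
```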
Differential Revision: https://reviews.llvm.org/D92343
Adds rewrite patterns to convert select+cmp instructions into clamp
instructions whenever possible. Support is added to convert:
- FOrdLessThan, FOrdLessThanEqual to GLSLFClampOp.
- SLessThan, SLessThanEqual to GLSLSClampOp.
- ULessThan, ULessThanEqual to GLSLUClampOp.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D93618
PDL patterns are now supported via a new `PDLPatternModule` class. This class contains a ModuleOp with the pdl::PatternOp operations representing the patterns, as well as a collection of registered C++ functions for native constraints/creations/rewrites/etc. that may be invoked via the pdl patterns. Instances of this class are added to an OwningRewritePatternList in the same fashion as C++ RewritePatterns, i.e. via the `insert` method.
The PDL bytecode is an in-memory representation of the PDL interpreter dialect that can be efficiently interpreted/executed. The representation of the bytecode boils down to a code array (for opcodes/memory locations/etc.) and a memory buffer (for storing attributes/operations/values/any other data necessary). The bytecode operations are effectively a 1-1 mapping to the PDLInterp dialect operations, with a few exceptions in cases where the in-memory representation of the bytecode can be more efficient than the MLIR representation. For example, a generic `AreEqual` bytecode op can be used to represent AreEqualOp, CheckAttributeOp, and CheckTypeOp.
The execution of the bytecode is split into two phases: matching and rewriting. When matching, all of the matched patterns are collected to avoid the overhead of re-running parts of the matcher. These matched patterns are then considered alongside the native C++ patterns, which rewrite immediately in-place via `RewritePattern::matchAndRewrite`, for the given root operation. When a PDL pattern is matched and has the highest benefit, it is passed back to the bytecode to execute its rewriter.
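A rough sketch of the registration flow described above; only the `insert`-based registration is taken from this revision, while the constraint callback signature and the `single_result` name are assumptions:

```c++
// `pdlModule` holds pdl.pattern operations, e.g. parsed from a .mlir file.
PDLPatternModule pdlPatterns(std::move(pdlModule));

// Native C++ helpers can be registered by name and invoked from the patterns
// (e.g. via pdl.apply_native_constraint). Signature is an assumption here.
pdlPatterns.registerConstraintFunction(
    "single_result",
    [](ArrayRef<PDLValue> args, ArrayAttr constantParams,
       PatternRewriter &rewriter) {
      return success(args[0].cast<Operation *>()->getNumResults() == 1);
    });

// PDL patterns are added to the pattern list just like C++ RewritePatterns.
OwningRewritePatternList patterns;
patterns.insert(std::move(pdlPatterns));
```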
Differential Revision: https://reviews.llvm.org/D89107
Add an op with a mapping from ops to the corresponding shape functions for those
ops in the library, and a mechanism to associate shape functions with functions.
The mapping of op to shape function is kept separate from the shape
functions themselves, as the operation is associated with the shape
function and not vice versa, and one could have a common library of
shape functions that can be used in different contexts.
Use fully qualified names and require a name for shape fn lib ops for
now, with an explicit print/parse (based around the generated one and the GPU
module op ones).
This commit reverts d9da4c3e73 and fixes the missing headers (it is not
clear how that was working locally).
Differential Revision: https://reviews.llvm.org/D91672
Add an op with a mapping from ops to the corresponding shape functions for those
ops in the library, and a mechanism to associate shape functions with functions.
The mapping of op to shape function is kept separate from the shape
functions themselves, as the operation is associated with the shape
function and not vice versa, and one could have a common library of
shape functions that can be used in different contexts.
Use fully qualified names and require a name for shape fn lib ops for
now, with an explicit print/parse (based around the generated one and the GPU
module op ones).
Differential Revision: https://reviews.llvm.org/D91672
Enhance the tile+fuse logic to allow fusing a sequence of operations.
Make sure the value used to obtain the tile shape is a
SubViewOp/SubTensorOp. The current logic used to get the loop bounds
depends on the use of the `getOrCreateRange` method on `SubViewOp` and
`SubTensorOp`. Make sure that the value/dim used to compute the range
is from such ops. This fix is a reasonable workaround, but a better fix would
be to make `getOrCreateRange` a method of `ViewInterface`.
Differential Revision: https://reviews.llvm.org/D90991
This reverts commit f8284d21a8.
Revert "[mlir][Linalg] NFC: Expose some utility functions used for promotion."
This reverts commit 0c59f51592.
Revert "Remove unused isZero function"
This reverts commit 0f9f0a4046.
Change f8284d21 led to multiple failures in IREE compilation.
As discussed in https://llvm.discourse.group/t/mlir-support-for-sparse-tensors/2020,
this CL is the start of sparse tensor compiler support in MLIR. Starting with a
"dense" kernel expressed in the Linalg dialect together with per-dimension
sparsity annotations on the tensors, the compiler automatically lowers the
kernel to sparse code using the methods described in Fredrik Kjolstad's thesis.
Many details are still TBD. For example, the sparse "bufferization" is purely
done locally since we don't have a global solution for propagating sparsity
yet. Furthermore, code to input and output the sparse tensors is missing.
Nevertheless, with some hand modifications, the generated MLIR can be
easily converted into runnable code already.
Reviewed By: nicolasvasilache, ftynse
Differential Revision: https://reviews.llvm.org/D90994
This replaces the old type decomposition logic that was previously mixed
into bufferization, and makes it easily accessible.
This also deletes TestFinalizingBufferize, because after we remove the type
decomposition, it doesn't do anything that is not already provided by
func-bufferize.
Differential Revision: https://reviews.llvm.org/D90899
The pass combines the patterns of the ExpandAtomic, ExpandMemRefReshape, and
StdExpandDivs passes. The pass is meant to legalize STD for conversion to LLVM.
Differential Revision: https://reviews.llvm.org/D91082
* Wires them in the same way that peer-dialect test passes are registered.
* Fixes the build for -DLLVM_INCLUDE_TESTS=OFF.
Differential Revision: https://reviews.llvm.org/D91022
This functionality is superseded by the BufferResultsToOutParams pass (see
https://reviews.llvm.org/D90071) for users that require buffers to be
out-params. That pass should be run immediately after all tensors are gone from
the program (before buffer optimizations and deallocation insertion), such as
immediately after a "finalizing" bufferize pass.
The -test-finalizing-bufferize pass now defaults to what used to be the
`allowMemrefFunctionResults=true` flag, and the
finalizing-bufferize-allowed-memref-results.mlir file is moved
to test/Transforms/finalizing-bufferize.mlir.
Differential Revision: https://reviews.llvm.org/D90778
TestDialect has many operations and they all live in the ::mlir namespace.
Sometimes it is not clear whether the ops used in the code for the test passes
belong to the Standard or the Test dialect.
Also, with this change it is easier to understand which test passes registered
in mlir-opt are actually passes in mlir/test.
Differential Revision: https://reviews.llvm.org/D90794
BufferPlacement is no longer part of bufferization. However, this test
is an important test of "finalizing" bufferize passes.
A "finalizing" bufferize conversion is one that performs a "full"
conversion and expects all tensors to be gone from the program. This in
particular involves rewriting funcs (including block arguments of the
contained region), calls, and returns. The unique property of finalizing
bufferization passes is that they cannot be done via a local
transformation with suitable materializations to ensure composability
(as other bufferization passes do). For example, if a call is
rewritten, the callee needs to be rewritten as well, otherwise the IR will end up
invalid. Thus, finalizing bufferization passes require an atomic change
to the entire program (e.g. the whole module).
This new designation makes it clear also that it shouldn't be testing
bufferization of linalg ops, so the tests have been updated to not use
linalg.generic ops. (linalg.copy is still used as the "copy" op for
copying into out-params.)
Differential Revision: https://reviews.llvm.org/D89979
This commit adds a new library that merges/combines a number of spv
modules into a combined one. The library has a single entry point:
combine(...).
To combine a number of MLIR spv modules, we move all the module-level ops
from all the input modules into one big combined module. To that end, the
combination process can proceed in two phases:
(1) resolving conflicts between pairs of ops from different modules;
(2) deduplicating equivalent ops/sub-ops in the merged module (TODO).
This patch implements only the first phase.
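A rough sketch of invoking the entry point (the exact signature, in particular the rename-listener callback, is an assumption here):

```c++
// `inputModules` is a list of spirv::ModuleOp gathered by the caller, and
// `combinedModuleBuilder` is an OpBuilder pointing at where the combined
// module should be created.
spirv::ModuleOp combined = spirv::combine(
    inputModules, combinedModuleBuilder,
    [](spirv::ModuleOp originalModule, StringRef oldSymbol,
       StringRef newSymbol) {
      // Called whenever a symbol had to be renamed to resolve a conflict.
    });
```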
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D90477
This commit adds a new library that merges/combines a number of spv
modules into a combined one. The library has a single entry point:
combine(...).
To combine a number of MLIR spv modules, we move all the module-level ops
from all the input modules into one big combined module. To that end, the
combination process can proceed in two phases:
(1) resolving conflicts between pairs of ops from different modules;
(2) deduplicating equivalent ops/sub-ops in the merged module (TODO).
This patch implements only the first phase.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D90477
This commit adds a new library that merges/combines a number of spv
modules into a combined one. The library has a single entry point:
combine(...).
To combine a number of MLIR spv modules, we move all the module-level ops
from all the input modules into one big combined module. To that end, the
combination process can proceed in two phases:
(1) resolving conflicts between pairs of ops from different modules;
(2) deduplicating equivalent ops/sub-ops in the merged module (TODO).
This patch implements only the first phase.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D90022
Linalg "tile-and-fuse" is currently exposed as a Linalg pass "-linalg-fusion" but only the mechanics of the transformation are currently relevant.
Instead turn it into a "-test-linalg-greedy-fusion" pass which performs canonicalizations to enable more fusions to compose.
This allows dropping the OperationFolder which is not meant to be used with the pattern rewrite infrastructure.
Differential Revision: https://reviews.llvm.org/D90394
This revision adds a programmable codegen strategy for Linalg based on staged rewrite patterns. Testing is exercised on a simple linalg.matmul op.
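A sketch of the intended usage, assuming the fluent strategy-builder API added here (the tile sizes and `funcOp` are illustrative placeholders):

```c++
// Stage tiling then vectorization patterns for linalg.matmul.
linalg::CodegenStrategy strategy;
strategy
    .tile<linalg::MatmulOp>(
        linalg::LinalgTilingOptions().setTileSizes({8, 16, 32}))
    .vectorize<linalg::MatmulOp>();
// Apply the staged rewrite patterns to the enclosing function.
strategy.transform(funcOp);
```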
Differential Revision: https://reviews.llvm.org/D89374
This is the same diff as https://reviews.llvm.org/D88809/ except that the side-effect-free
check is removed for involution and a FIXME is added until the dependency
is resolved for shared builds. The old diff has more details on possible fixes.
Reviewed By: rriddle, andyly
Differential Revision: https://reviews.llvm.org/D89333
This reverts commit 1ceaffd95a.
The build is broken with -DBUILD_SHARED_LIBS=ON ; seems like a possible
layering issue to investigate:
tools/mlir/lib/IR/CMakeFiles/obj.MLIRIR.dir/Operation.cpp.o: In function `mlir::MemoryEffectOpInterface::hasNoEffect(mlir::Operation*)':
Operation.cpp:(.text._ZN4mlir23MemoryEffectOpInterface11hasNoEffectEPNS_9OperationE[_ZN4mlir23MemoryEffectOpInterface11hasNoEffectEPNS_9OperationE]+0x9c): undefined reference to `mlir::MemoryEffectOpInterface::getEffects(llvm::SmallVectorImpl<mlir::SideEffects::EffectInstance<mlir::MemoryEffects::Effect> >&)'
This change allows folds to be performed via a newly introduced involution trait, rather than having to manually rewrite this optimization for every instance of an involution.
Reviewed By: rriddle, andyly, stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D88809
The pattern is structured similarly to other patterns like
LinalgTilingPattern. The fusion pattern takes options that allow you
to fuse with the producers of multiple operands at once.
- The pattern fuses only at the level that is known to be legal, i.e.,
if a reduction loop in the consumer is tiled, then fusion should
happen "before" this loop. Some refactoring of the fusion code is
needed to fuse only where it is legal.
- Since fusion on buffers uses the LinalgDependenceGraph, which is
not mutable in place, the fusion pattern keeps the original
operations in the IR but tags them with a marker that can later be
used to find the original operations.
This change also fixes an issue with tiling and
distribution/interchange where a tile size of 0 for a loop was not
accounted for.
Differential Revision: https://reviews.llvm.org/D88435
Instead of performing a transformation, such a pass yields a new pass pipeline
to run on the currently visited operation.
This feature can be used, for example, to implement a sub-pipeline that
would run only on an operation with specific attributes. Another example
would be to compute a cost model and dynamically schedule a pipeline based
on the result of this analysis.
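A sketch of what such a pass might look like with the new hook (the pass class, attribute name, and pipeline contents are made up for illustration):

```c++
void DynamicPipelinePass::runOnOperation() {
  ModuleOp module = getOperation();
  // Only run the nested pipeline on modules carrying a specific attribute.
  if (!module->hasAttr("run_inner_pipeline"))
    return;
  // Build a pipeline on the fly and run it on the current operation.
  OpPassManager dynamicPM(module.getOperationName());
  dynamicPM.addPass(createCanonicalizerPass());
  if (failed(runPipeline(dynamicPM, module.getOperation())))
    signalPassFailure();
}
```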
Discussion: https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637
Recommit after fixing an ASAN issue: the callback lambda needs to be
allocated to a temporary to have its lifetime extended to the end of the
current block instead of just the current call expression.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D86392
This reverts commit 385c3f43fc.
Test mlir/test/Pass:dynamic-pipeline-fail-on-parent.mlir.test fails
when run with ASAN:
ERROR: AddressSanitizer: stack-use-after-scope on address ...
Reviewed By: bkramer, pifon2a
Differential Revision: https://reviews.llvm.org/D88079
Instead of performing a transformation, such a pass yields a new pass pipeline
to run on the currently visited operation.
This feature can be used, for example, to implement a sub-pipeline that
would run only on an operation with specific attributes. Another example
would be to compute a cost model and dynamically schedule a pipeline based
on the result of this analysis.
Discussion: https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D86392
Add support for tiling affine.for ops with parametric sizes (i.e., SSA
values). Currently, only hyper-rectangular loop nests with constant
lower bounds are supported. Move the methods
- moveLoopBody(*)
- getTileableBands(*)
- checkTilingLegality(*)
- tilePerfectlyNested(*)
- constructTiledIndexSetHyperRect(*)
to allow reuse with the constant tile size API. Add a test pass,
-test-affine-parametric-tile, to test parametric tiling.
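A sketch of using the utility, assuming the `tilePerfectlyNestedParametric` entry point added here (`rootForOp` and `ssaTileSizes` are placeholders; the exact signature may differ):

```c++
// Collect the perfectly nested band rooted at `rootForOp`, then tile it with
// SSA tile sizes (one Value per loop in the band, computed elsewhere).
SmallVector<AffineForOp, 4> band;
getPerfectlyNestedLoops(band, rootForOp);
SmallVector<AffineForOp, 4> tiledNest;
if (failed(tilePerfectlyNestedParametric(band, /*tileSizes=*/ssaTileSizes,
                                         &tiledNest)))
  return signalPassFailure();
```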
Differential Revision: https://reviews.llvm.org/D87353