Cross function boundary bufferization support is added.
This is enabled by cross-function boundary alias analysis, for which the bufferization process is extended: it can now modify the BufferizationAliasInfo as new ops are introduced.
A number of simplifying assumptions are made:
1. by default we bufferize to the most dynamic strided memref type, further memref::CastOp canonicalizations are expected to clean up the IR.
2. in the current implementation, the stride information is always erased at function boundaries. A subsequent pass will be required to analyze the meet of all call ops to a function and decide whether more static buffer types can be used. This will potentially clone functions when it is deemed profitable to do so (e.g. when the stride-1 dimension may vary).
3. external function always bufferize to the most dynamic strided memref version. This may require special annotations for specifying that particular operands of top-level functions have contiguous buffer layout.
An alternative to point 3. would be to support tensor layout annotations, which is currently not supported in MLIR.
Differential revision: https://reviews.llvm.org/D104873
Move the OpDSL doc to a linalg sub folder and updated the integration in the main linalg documentation.
Differential Revision: https://reviews.llvm.org/D105188
Add helpers to facilitate adding arguments and results to operations
that implement the `FunctionLike` trait. These operations already have a
convenient argument and result *erasure* mechanism, but a corresopnding
utility for insertion is missing. This introduces such a utility.
* Split memref.dim into two operations: memref.dim and tensor.dim. Both ops have the same builder interface and op argument names, so that they can be used with templates in patterns that apply to both tensors and memrefs (e.g., some patterns in Linalg).
* Add constant materializer to TensorDialect (needed for folding in affine.apply etc.).
* Remove some MemRefDialect dependencies, make some explicit.
Differential Revision: https://reviews.llvm.org/D105165
Uses elementwise interface to generalize canonicalization pattern and add a new
pattern for vector.contract case.
Differential Revision: https://reviews.llvm.org/D104343
Similarly to batch_mat vec outer most dim is a batching dim
and this op does |b| matrix-vector-products :
C[b, i] = sum_k(A[b, i, k] * B[b, k])
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D104739
The executeregionop is used to allow multiple blocks within SCF constructs. If the container allows multiple blocks, inline the region
Differential Revision: https://reviews.llvm.org/D104960
Deduce circumstances where an affine load could not possibly be read by an operation (such as an affine load), and if so, eliminate the load
Differential Revision: https://reviews.llvm.org/D105041
Update the OpDSL documentation to reflect recent changes. In particular, the updated documentation discusses:
- Attributes used to parameterize index expressions
- Shape-only tensor support
- Scalar parameters
Differential Revision: https://reviews.llvm.org/D105123
Extend the OpDSL syntax with an optional `domain` function to specify an explicit dimension order. The extension is needed to provide more control over the dimension order instead of deducing it implicitly depending on the formulation of the tensor comprehension. Additionally, the patch also ensures the symbols are ordered according to the operand definitions of the operation.
Differential Revision: https://reviews.llvm.org/D105117
This was missing and also there was a bug in the lowering itself, which went unnoticed due to it.
Differential Revision: https://reviews.llvm.org/D105122
Fix generateCopyForMemRefRegion for a missing check: in some cases, when
the thing to generate copies for itself is empty, no fast buffer/copy
loops would have been allocated/generated. Add an extra assertion there
while at this.
Differential Revision: https://reviews.llvm.org/D105170
This reverts commit 652f4b5140.
Re-enable MLLIR JIT tests.
The MLIR Bot was updated to export LD_LIBRARY_PATH=/usr/lib64, which
seem to fix this issue.
* Previously, we were only generating .h.inc files. We foresee the need to also generate implementations and this is a step towards that.
* Discussed in https://llvm.discourse.group/t/generating-cpp-inc-files-for-dialects/3732/2
* Deviates from the discussion above by generating a default constructor in the .cpp.inc file (and adding a tablegen bit that disables this in case if this is user provided).
* Generating the destructor started as a way to flush out the missing includes (produces a link error), but it is a strict improvement on its own that is worth doing (i.e. by emitting key methods in the .cpp file, we root vtables in one translation unit, which is a non-controversial improvement).
Differential Revision: https://reviews.llvm.org/D105070
Depends On D105037
Avoid creating too many tasks when the number of workers is large.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D105126
Depends On D104999
Automatic reference counting based on the liveness analysis can add a lot of reference counting overhead at runtime. If the IR is known to be constrained to few particular "shapes", it's much more efficient to provide a custom reference counting policy that will specify where it is required to update the async value reference count.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D105037
This revision adds the minimal plumbing to create a simple ComprehensiveModuleBufferizePass that can behave conservatively in the presence of CallOps.
A topological sort of caller/callee is performed and, if the call-graph is cycle-free, analysis can proceed.
Differential revision: https://reviews.llvm.org/D104859
Depends On D104998
Function calls "transfer ownership" to the callee and it puts additional constraints on the reference counting optimization pass
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D104999
Depends On D104891
Outlining scf.parallel body as a function requires async-parallel-for pass to be a ModuleOp pass
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D104998
Depends On D104850
Add a test that verifies that canonicalization removes all async overheads if it is statically known that the scf.parallel operation will be computed using a single block.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D104891
The case where a non-dominating read can be found is captured by slightly generalizing `AliasInfo::wouldCreaateReadAfterWriteInterference`.
This simplification will make it easier to implement bufferization across function call.
APIs are also simplified were possible.
Differential revision: https://reviews.llvm.org/D104845
This patch brings support for setting runtime preemption specifiers of
LLVM's GlobalValues. In LLVM semantics, if the `dso_local` attribute
is not explicitly requested, then it is inferred based on linkage and
visibility. We model this same behavior with a UnitAttribute: if it is
present, then we explicitly request the GlobalValue to marked as
`dso_local`, otherwise we rely on the GlobalValue itself to make this
decision.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D104983
Without it BufferDeallocationPass process only CloneOps created during pass itself and ignore all CloneOps that were already present in IR.
For our specific usecase:
```
func @dealloc_existing_clones(%arg0: memref<?x?xf64>, %arg1: memref<?x?xf64>) -> memref<?x?xf64> {
return %arg0 : memref<?x?xf64>
}
```
Input arguments will be freed immediately after return from function and we want to prolong lifetime for the returned argument.
To achieve this we explicitly add clones to all input memrefs and expect that BufferDeallocationPass will add correct deallocs to them (unnessesary clone+dealloc pairs will be canonicalized away later).
Differential Revision: https://reviews.llvm.org/D104973
Adapt the StructuredOp verifier to ensure all operands are either in the input or the output group. The change is possible after adding support for scalar input operands (https://reviews.llvm.org/D104220).
Differential Revision: https://reviews.llvm.org/D104783
The current code does not preserve the order of the parallel
dimensions when doing multi-reductions and thus we can end
up in scenarios where the result shape does not match the
desired shape after reduction.
This patch fixes that by ensuring that the parallel indices
are in order and then concatenates them to the reduction dimensions
so that the reduction dimensions are innermost.
Differential Revision: https://reviews.llvm.org/D104884
This revision adds detection for changes to either the mlir-lsp-server binary or the setting, and prompts the user to restart the server. Whether the user gets prompted or not is a configurable setting in the extension, and this setting may updated based on the user response to the prompt.
Differential Revision: https://reviews.llvm.org/D104501
This enables creating a replacement rule where range of positional replacements
need not be spelled out, or are not known (e.g., enable having a rewrite that
forward all operands to a call generically).
Differential Revision: https://reviews.llvm.org/D104955
Input/output types can be integers, which represent a quantized convolution.
Update verifier to expect this behavior.
Reviewed By: sjarus
Differential Revision: https://reviews.llvm.org/D104949
The executeregionop is used to allow multiple blocks within SCF constructs. If the container allows multiple blocks, inline the region
Differential Revision: https://reviews.llvm.org/D104960
MemRefDataFlow performs mem2reg style operations for affine load/stores. Unfortunately, it is not presently correct in the presence of external operations such as memref.cast, or function calls. This diff extends the functionality of the pass to remain correct in the presence of such ops.
Differential Revision: https://reviews.llvm.org/D104053
A canonicalization accidentally will remove a memref allocation if it is only stored into. However, this is incorrect if the allocation is the value being stored, not the allocation being stored into.
Differential Revision: https://reviews.llvm.org/D104947
Given a select that returns the logical negation of the condition, replace it with a not of the condition.
Differential Revision: https://reviews.llvm.org/D104966