This commit adds a new hook Pass `bool canScheduleOn(RegisteredOperationName)` that
indicates if the given pass can be scheduled on operations of the given type. This makes it
easier to define constraints on generic passes without a) adding conditional checks to
the beginning of the `runOnOperation`, or b) defining a new pass type that forwards
from `runOnOperation` (after checking the invariants) to a new hook. This new hook is
used to implement an `InterfacePass` pass class, that represents a generic pass that
runs on operations of the given interface type.
The PassManager will also verify that passes added to a pass manager can actually be
scheduled on that pass manager, meaning that we will properly error when an Interface
is scheduled on an operation that doesn't actually implement that interface.
Differential Revision: https://reviews.llvm.org/D120791
RegionBranchOpInterface and BranchOpInterface are allowed to make implicit type conversions along control-flow edges. In effect, this adds an interface method, `areTypesCompatible`, to both interfaces, which should return whether the types of corresponding successor operands and block arguments are compatible. Users of the interfaces, here on forth, must be aware that types may mismatch, although current users (in MLIR core), are not affected by this change. By default, type equality is used.
`async.execute` already has unequal types along control-flow edges (`!async.value<f32>` vs. `f32`), but it opted out of calling `RegionBranchOpInterface::verifyTypes` in its verifier. That method has now been removed and `RegionBranchOpInterface` will verify types along control edges by default in its verifier.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120790
Such initializer functions can be enqueued in `BufferizationOptions`. They can be used to set up dialect-specific bufferization state.
Differential Revision: https://reviews.llvm.org/D120985
This clarifies that these methods only work in append mode, not for general insertions. This is a prospective change towards https://github.com/llvm/llvm-project/issues/51652 which also performs random-access insertions, so we want to avoid confusion.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120929
This patch extends the existing if combining canonicalization to also handle the case where a value returned by the first if is used within the body of the second if.
This patch also extends if combining to support if's whose conditions are logical negations of each other.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120924
This patch makes coalesce skip the comparison of all pairs of IntegerPolyhedrons with LocalIds rather than crash. The heuristics to handle these cases will be upstreamed later on.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120995
We can simplify an extractvalue of an insertvalue to extract out of the base of the insertvalue, if the insert and extract are at distinct and non-prefix'd indices
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120915
This patch introduces the cut case. If one polytope has only cutting and
redundant inequalities for the other and the facet of the cutting
inequalities are contained within the other polytope, then the polytopes are
be combined into a polytope consisting only of their respective
redundant constraints.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120614
Extend isLoopMemoryParallel check to include locally allocated memrefs.
This strengthens and also speeds up the dependence check used by the
utility by excluding locally allocated memrefs where appropriate.
Additional memref dialect ops can be supported exhaustively via proper
interfaces.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D120617
An OpenMP wsloop is simply a regular for loop with the bounds determined by the thread number, and the same justification to allow this for scf.for works for omp.wsloop.
An OpenMP parallel is a parallel for, per thread. Similarly the same justification for scf.parallel having recursive side effects applies here.
In both cases the general justification is that the ops themselves don't have side effects (besides inaccessible runtime-specific memory) and thus the side effects are simply that of the contained ops.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D120853
This commit adds support for processing tablegen include files, and importing
various information from ODS. This includes operations, attribute+type constraints,
attribute/operation/type interfaces, etc. This will allow for much more robust tooling,
and also allows for referencing ODS constructs directly within PDLL (imported interfaces
can be used as constraints, operation result names can be used for member access, etc).
Differential Revision: https://reviews.llvm.org/D119900
The SerializeToHsaco pass does not depend on ROCm being available on
the build system - it only requires ROCm to be present at runtime.
However, the CMake file that built it tested for
MLIR_ENABLE_ROCM_RUNNER , which implies that ROCm is currently
available and is used to control building ROCm integration tests.
Referencing MLIR_ENABLE_ROCM_RUNNER instead of
MLIR_ENABLE_ROCM_CONVERSIONS in the SerializeToHsaco build therefore
causes problems for clients who wish to build projects that depend on
this pass on a system without an AMD GPU present.
Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D120663
Ensure that `Handler` within the class is interpreted as the as the current template instantiation (instead the class template itself).
Fixes#53447
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120852
This ensures that we generate memref types with matching layout maps. (Especially when using partial bufferization passes.)
Differential Revision: https://reviews.llvm.org/D120893
This commit deletes the old dialect conversion-based bufferization patterns, which are now obsolete.
Differential Revision: https://reviews.llvm.org/D120883
The loopInfos gets invalidated after collapsing nested loops. Use the
saved afterIP since the returned afterIP by applyDynamicWorkshareLoop
may be not valid.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D120294
StandardToSPIRV currently contains an assortment of patterns converting from
different dialects to SPIRV. This commit splits up StandardToSPIRV into separate
conversions for each of the dialects involved (some of which already exist).
Differential Revision: https://reviews.llvm.org/D120767
Add support for extensible dialects, which are dialects that can be
extended at runtime with new operations and types.
These operations and types cannot at the moment implement traits
or interfaces.
Differential Revision: https://reviews.llvm.org/D104554
LLVM defines several default datalayouts for integer and floating point types that are not being considered when importing into MLIR. This patch remedies this.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120832
https://reviews.llvm.org/D120423 replaced the use of stacksave/restore with memref.alloca_scope, but kept the save/restore at the same location. This PR places the allocation scope within the wsloop, thus keeping the same allocation scope as the original scf.parallel (e.g. no longer over stack allocating).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120772
This patch moves all functionality from IntegerPolyhedron to IntegerRelation.
IntegerPolyhedron is now implemented as a relation with no domain. All existing
functionality is extended to work on relations.
This patch does not affect external users like FlatAffineConstraints as they
can still continue to use IntegerPolyhedron abstraction.
This patch is part of a series of patches to support relations in Presburger
library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120652
Add support for translating data layout specifications for integer and float
types between MLIR and LLVM IR. This is a first step towards removing the
string-based LLVM dialect data layout attribute on modules. The latter is still
available and will remain so until the first-class MLIR modeling can fully
replace it.
Depends On D120739
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120740
Add support for integer and float types into the data layout subsystem with
default logic similar to LLVM IR. Given the flexibility of the sybsystem, the
logic can be easily overwritten by operations if necessary. This provides the
connection necessary, e.g., for the GPU target where alignment requirements for
integers and floats differ from those provided by default (although still
compatible with the LLVM IR model). Previously, it was impossible to use
non-default alignment requirements for integer and float types, which could
lead to incorrect address and size calculations when targeting GPUs.
Depends On D120737
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120739
This patch adds assemblyFormat for `omp.critical.declare`, `omp.atomic.read`,
`omp.atomic.write`, `omp.atomic.update` and `omp.atomic.capture`.
Also removing those clauses from `parseClauses` that aren't needed
anymore, thanks to the new assemblyFormats.
Reviewed By: NimishMishra, rriddle
Differential Revision: https://reviews.llvm.org/D120248
sparsity values.
Previously, we can't properly handle input tensors with a dimension
ordering that is different from the natural ordering or with a mixed of
compressed and dense dimensions. This change fixes the problems by
passing the dimension ordering and sparsity values to the runtime
routine.
Modify an existing test to test the situation.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120777
This adds an option to configure the CMake python search priming
behaviour that was introduced in D118148. In some environments the
priming would cause the "real" search to fail. The default behaviour is
unchanged, i.e. the search will be primed.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D120765
The remanants of Standard was renamed to Func, but the test directory
remained named as Standard. In adidition to fixing the name, this commit
also moves the tests for operations not in the Func dialect to the proper
parent dialect test directory.
The Func has a large number of legacy dependencies carried over from the old
Standard dialect, which was pervasive and contained a large number of varied
operations. With the split of the standard dialect and its demise, a lot of lingering
dead dependencies have survived to the Func dialect. This commit removes a
large majority of then, greatly reducing the dependence surface area of the
Func dialect.
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:
* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect
See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061
Differential Revision: https://reviews.llvm.org/D120624
Previously, convertToMLIRSparseTensor assumes identity storage ordering and all
compressed dimensions. This change extends the function with two parameters for
users to specify the storage ordering and the sparsity of each dimension.
Modify PyTACO to reflect this change.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120643
As discussed in https://reviews.llvm.org/D119743 scf.parallel would continuously stack allocate since the alloca op was placd in the wsloop rather than the omp.parallel. This PR is the second stage of the fix for that problem. Specifically, we now introduce an alloca scope around the inlined body of the scf.parallel and enable a canonicalization to hoist the allocations to the surrounding allocation scope (e.g. omp.parallel).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120423
The llvm.mlir.global operation accepts a region as initializer. This region
corresponds to an LLVM IR constant expression and therefore should not accept
operations with side effects. Add a corresponding verifier.
Reviewed By: wsmoses, bondhugula
Differential Revision: https://reviews.llvm.org/D120632
The revision renames the following OpDSL functions:
```
TypeFn.cast -> TypeFn.cast_signed
BinaryFn.min -> BinaryFn.min_signed
BinaryFn.max -> BinaryFn.max_signed
```
The corresponding enum values on the C++ side are renamed accordingly:
```
#linalg.type_fn<cast> -> #linalg.type_fn<cast_signed>
#linalg.binary_fn<min> -> #linalg.binary_fn<min_signed>
#linalg.binary_fn<max> -> #linalg.binary_fn<max_signed>
```
Depends On D120110
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120562
The revision extends OpDSL with unary and binary function attributes. A function attribute, makes the operations used in the body of a structured operation configurable. For example, a pooling operation may take an aggregation function attribute that specifies if the op shall implement a min or a max pooling. The goal of this revision is to define less and more flexible operations.
We may thus for example define an element wise op:
```
linalg.elem(lhs, rhs, outs=[out], op=BinaryFn.mul)
```
If the op argument is not set the default operation is used.
Depends On D120109
Reviewed By: nicolasvasilache, aartbik
Differential Revision: https://reviews.llvm.org/D120110
Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.
This patch includes the following related changes:
* Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
* Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder.
* Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
* Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.
Differential Revision: https://reviews.llvm.org/D114413
Add a pattern matcher for ExtractSliceOp when its source is a constant.
The matching heuristics can be governed by the control function since
generating a new constant is not always beneficial.
Differential Revision: https://reviews.llvm.org/D119605
If we have a chain of `tensor.insert_slice` ops inserting some
`tensor.pad` op into a `linalg.fill` and ranges do not overlap,
we can also elide the `tensor.pad` later.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D120446
Fold tensor.insert_slice(tensor.pad(<input>), linalg.fill) into
tensor.insert_slice(<input>, linalg.fill) if the padding value and
the filling value are the same.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D120410
Improve the LinalgOp verification to ensure the iterator types is known. Previously, unknown iterator types have been ignored without warning, which can lead to confusing bugs.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120649
This patch removes the builders for `omp.wsloop` operation that aren't
specifically needed anywhere. We can add them later if the need arises.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D120533
This patch moves IntegerPolyhedron::reset to FlatAffineConstraints::reset. This
function is not required in IntegerPolyhedron and creates ambiguity while
shifting implementations to IntegerRelation.
This patch is part of a series of patches to introduce relations in Presburger
library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120628
PDL currently doesn't support result values from constraints, meaning we need
to error out until this is actually supported to avoid crashes.
Differential Revision: https://reviews.llvm.org/D119782
This commits adds a C++ generator to PDLL that generates wrapper PDL patterns
directly usable in C++ code, and also generates the definitions of native constraints/rewrites
that have code bodies specified in PDLL. This generator is effectively the PDLL equivalent of
the current DRR generator, and will allow easy replacement of DRR patterns with PDLL patterns.
A followup will start to utilize this for end-to-end integration testing and show case how to
use this as a drop-in replacement for DRR tablegen usage.
Differential Revision: https://reviews.llvm.org/D119781
If the operand list or result list of an operation expression is not specified, we interpret
this as meaning that the operands/results are "unconstraint" (i.e. "could be anything").
We currently don't properly handle differentiating this case from the case of
"no operands/results". This commit adds the insertion of implicit value/type range
variables when these lists are unspecified. This allows for adding proper support
for when zero operands or results are expected.
Differential Revision: https://reviews.llvm.org/D119780
This commits starts to plumb PDLL down into MLIR and adds an initial
PDL generator. After this commit, we will have conceptually support
end-to-end execution of PDLL. Followups will add CPP generation to
match the current DRR setup, and begin to add various end-to-end
tests to test PDLL execution.
Differential Revision: https://reviews.llvm.org/D119779
This patch removes a redundant check in hasConsistentState which is always true
after introduction of PresburgerSpace.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120615
This patch moves identifier kind specific insert/append functions like
`insertDimId`, `appendSymbolId`, etc. from IntegerPolyhedron to
FlatAffineConstraints.
This change allows for a smoother transition to IntegerRelation.
This change is part of a series of patches to introduce Relations in Presburger
library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120576
This patch factors out various checks for dimension compatibility to
PresburgerSpace::isEqual and PresburgerLocalSpace::isEqual (for local
identifiers).
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120565
In the main-loop of the current coalesce implementation `i` was incremented
twice for some cases. This patch fixes this bug and adds a regression
testcase.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120613
The PyTACO DSL doesn't support reduction to scalars. This change
enhances the MLIR-PyTACO implementation to support reduction to scalars.
Extend an existing test to show the syntax of reduction to scalars and
two methods to retrieve the scalar values.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120572
The AVX2 lowering for transpose operations is only applicable to f32 vector types.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120427
The existing AVX2 lowering patterns for the transpose op only triggers if the
input vector is 2-D. This patch extends the patterns to trigger for n-D vectors
which are effectively 2-D vectors (e.g., vector<1x4x1x8x1). The main constraint
for the generalized AVX2 patterns to be applicable to these vectors is that the
dimensions that are greater than one must be transposed. Otherwise, the existing
patterns are not applicable.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D119505
This change gives explicit order of verifier execution and adds
`hasRegionVerifier` and `verifyWithRegions` to increase the granularity
of verifier classification. The orders are as below,
1. InternalOpTrait will be verified first, they can be run independently.
2. `verifyInvariants` which is constructed by ODS, it verifies the type,
attributes, .etc.
3. Other Traits/Interfaces that have marked their verifier as
`verifyTrait` or `verifyWithRegions=0`.
4. Custom verifier which is defined in the op and has marked
`hasVerifier=1`
If an operation has regions, then it may have the second phase,
5. Traits/Interfaces that have marked their verifier as
`verifyRegionTrait` or
`verifyWithRegions=1`. This implies the verifier needs to access the
operations in its regions.
6. Custom verifier which is defined in the op and has marked
`hasRegionVerifier=1`
Note that the second phase will be run after the operations in the
region are verified. Based on the verification order, you will be able to
avoid verifying duplicate things.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116789
Fix MLIR-PyTACO and some tests to use np.array_equal to compare integer
values.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120526
Split arithmetic function into unary and binary functions. The revision prepares the introduction of unary and binary function attributes that work similar to type function attributes.
Depends On D120108
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120109
This change allows the use of scalar tensors with index 0 in tensor index
expressions. In this case, the scalar value is broadcast to match the
dimensions of other tensors in the same expression.
Using scalar tensors as a destination in tensor index expressions is not
supported in the PyTACO DSL.
Add a PyTACO test to show the use of scalar tensors.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120524
Prepare the OpDSL function handling to introduce more function classes. A follow up commit will split ArithFn into UnaryFn and BinaryFn. This revision prepares the split by adding a function kind enum to handle different function types using a single class on the various levels of the stack (for example, there is now one TensorFn and one ScalarFn).
Depends On D119718
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120108
This patch refactors the looping strategy of coalesce for future patches. The new strategy works in-place and uses IneqType to organize inequalities into vectors of the same type. Future coalesce cases will pattern match on this organization. E.g. the contained case needs all inequalities and equalities to be redundant, so this case becomes checking whether the respective vectors are empty. For other cases, the patterns consider the types of all inequalities of both sets making it wasteful to only consider whether a can be coalesced with b in one step, as inequalities would need to be typed again for the opposite case. Therefore, the new strategy tries to coalesce a with b and b with a in a single step.
Reviewed By: Groverkss, arjunp
Differential Revision: https://reviews.llvm.org/D120392
This patch replaces various functions over inequalities/equalities in
IntegerPolyhedron with Matrix functions already implementing them or refactors
them to a Matrix function.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120482
This patch moves the Presburger library to a new `presburger` namespace.
This allows to shorten some names, helps to avoid polluting the mlir namespace,
and also provides some structure.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120505
This patch removes redundant code from fourierMotzkinEliminate implementation
using existing functions in IntegerPolyhedron.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120502
Previously, OpDSL operation used hardcoded type conversion operations (cast or cast_unsigned). Supporting signed and unsigned casts thus meant implementing two different operations. Type function attributes allow us to define a single operation that has a cast type function attribute which at operation instantiation time may be set to cast or cast_unsigned. We may for example, defina a matmul operation with a cast argument:
```
@linalg_structured_op
def matmul(A=TensorDef(T1, S.M, S.K), B=TensorDef(T2, S.K, S.N), C=TensorDef(U, S.M, S.N, output=True),
cast=TypeFnAttrDef(default=TypeFn.cast)):
C[D.m, D.n] += cast(U, A[D.m, D.k]) * cast(U, B[D.k, D.n])
```
When instantiating the operation the attribute may be set to the desired cast function:
```
linalg.matmul(lhs, rhs, outs=[out], cast=TypeFn.cast_unsigned)
```
The revsion introduces a enum in the Linalg dialect that maps one-by-one to the type functions defined by OpDSL.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D119718
This patch removes the `const` from `usingBigM` to enable the implicit copy assignment operator for Simplex.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D120542
This transformation is useful to break dependency between consecutive loop
iterations by increasing the size of a temporary buffer. This is usually
combined with heavy software pipelining.
Differential Revision: https://reviews.llvm.org/D119406
This adds a variable op, emitted as C/C++ locale variable, which can be
used if the `emitc.constant` op is not sufficient.
As an example, the canonicalization pass would transform
```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = "emitc.constant"() {value = 0 : i32} : () -> i32
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%3 = emitc.apply "&"(%1) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%2, %3) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```
into
```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%1, %2) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```
resulting in pointer aliasing, as %1 and %2 point to the same address.
In such a case, the `emitc.variable` operation can be used instead.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D120098
This patch removes binary operator enum which was introduced with `omp.atomic.update`. Now the update operation handles update in a region so this is no longer required.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D120458
Documentation exists about the details of the API but is missing a
description of the overall structure per dialect.
Reviewed By: shabalin
Differential Revision: https://reviews.llvm.org/D117002
The current implementation of ShuffleVectorOp assumes all vectors are
scalable. LLVM IR allows shufflevector operations on scalable vectors,
and the current translation between LLVM Dialect and LLVM IR does the
rigth thing when the shuffle mask is all zeroes. This is required to
do a splat operation on a scalable vector, but it doesn't make sense
for scalable vectors outside of that operation, i.e.: with non-all zero
masks.
Differential Revision: https://reviews.llvm.org/D118371
In D115022, we introduced an optimization where OpResults of a `linalg.generic` may bufferize in-place with an "in" OpOperand if the corresponding "out" OpOperand is not used in the computation.
This optimization can lead to unexpected behavior if the newly chosen OpOperand is in the same alias set as another OpOperand (that is used in the computation). In that case, the newly chosen OpOperand must bufferize out-of-place. This can be confusing to users, as always choosing the "out" OpOperand (regardless of whether it is used) would be expected when having the notion of "destination-passing style" in mind.
With this change, we go back to always bufferizing in-place with "out" OpOperands by default, but letting users override the behavior with a bufferization option.
Differential Revision: https://reviews.llvm.org/D120182
Previously only accessing values for `index` and signless int types
would work; signed and unsigned ints would hit an assert in
`IntegerAttr::getInt`. This exposes `IntegerAttr::get{S,U}Int` to the C
API and calls the appropriate function from the python bindings.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120194
Previously, we only support float64. We now support float32 and float64. When
constructing a tensor without providing a data type, the default is float32.
Fix the tests to data type consistency. All PyTACO application tests now use
float32 to match the default data type of TACO. Other tests may use float32 or
float64.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120356
Now that sparse tensor types are first-class citizens and the sparse compiler
is taking shape, it is time to make sure other compiler optimizations compose
well with sparse tensors. Mostly, this should be completely transparent (i.e.,
dense and sparse take the same path). However, in some cases, optimizations
only make sense in the context of sparse tensors. This is a first example of
such an optimization, where fusing a sampled elt-wise multiplication only makes
sense when the resulting kernel has a potential lower asymptotic complexity due
to the sparsity.
As an extreme example, running SDDMM with 1024x1024 matrices and a sparse
sampling matrix with only two elements runs in 463.55ms in the unfused
case but just 0.032ms in the fused case, with a speedup of 14485x that
is only possible in the exciting world of sparse computations!
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D120429
By specifying a sectionMemoryMapper, users can control how
memory for JIT code is allocated.
In particular, I need this in order to use a named memory
region so that profilers such as perf(1) can correctly label
execution cycles coming from JIT'ed code.
Reviewed-by: ezhulenev
Differential Revision: https://reviews.llvm.org/D120415
Given a cmpf of either uitofp or sitofp and a constant, attempt to canonicalize it to a cmpi.
This PR rewrites equivalent code within LLVM to now apply to MLIR arith.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117257
+ compare block size with the unrollable inner dimension
+ reduce nesting in the code and simplify a bit IR building
Reviewed By: cota
Differential Revision: https://reviews.llvm.org/D120075
This eliminates the requirement that pass-related strings outlive pass
instances, which will facilitate future work enabling dynamic passes
written in other languages.
Differential Revision: https://reviews.llvm.org/D120341
Its number of optional parameters has grown too large,
which makes adding new optional parameters quite a chore.
Fix this by using an options struct.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D120380
Use an `MLIRContext` declared in a single place in the `parsePoly` function that almost all Presburger unit tests use for parsing sets. This function is only used in tests.
This saves us from having to declare and pass a new `MLIRContext` in every test.
Reviewed By: bondhugula, mehdi_amini
Differential Revision: https://reviews.llvm.org/D119251
This trait results in PDL ops being erroneously CSE'd. These ops are side-effect free in the rewriter but not in the matcher (where unused values aren't allowed anyways). These ops should have a more nuanced side-effect modeling, this is fixing a bug introduced by a previous change.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120222
Header class in SPIR-V HTML spec has changed. Update script to reflect that.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D120179
The related functionality is moved over to the bufferization dialect. Test cases are cleaned up a bit.
Differential Revision: https://reviews.llvm.org/D120191
This patch adds a class to represent a relation in Presburger Library.
This patch only adds the skeleton class. Functionality from IntegerPolyhedron
will be moved to IntegerRelation in later patches to make it easier to review.
This patch is a part of a series of patches adding support for relations in
Presburger Library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120156
The `.def` and `.def_property_readonly` functions in PybindAdaptors.h should
construct the functions as method of the current class rather than as method of
pybind11:none(), which is an object and not even a class.
Depends On D117658
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D117659
This commit adds canonicalization pattern in `linalg.generic` op
for static shape inference. If any of the inputs or outputs have
static shape or is casted from a tensor of static shape, then
shapes of all the inputs and outputs can be inferred by using the
affine map of the static shape input/output.
Signed-Off-By: Prateek Gupta <prateek@nod-labs.com>
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D118929
This patch adds assemblyFormat for omp.sections operation.
Some existing functions have been altered to fit the custom directive
in assemblyFormat. This has led to their callsites to get modified too,
but those will be removed in later patches, when other operations get
their assemblyFormat. All operations were not changed in one patch for
ease of review.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D120176
This patch adds typing of inequalities to the simplex. This is a cental part of the coalesce algorithm and will be heavily used in later coalesce patches. Currently, only the three most basic types are supported with more to be introduced when they are needed.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D119925
This allows to differentiate between the cases where the optimum does not
exist due to being unbounded and due to the polytope being empty.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D120127
Add `BufferizableOpInterface::verifyAnalysis`. Ops can implement this method to check for expected invariants and limitations.
The purpose of this change is to introduce a modular way of checking assertions such as `assertScfForAliasingProperties`.
Differential Revision: https://reviews.llvm.org/D120189
This patch adds assemblyFormat for omp.parallel operation.
Some existing functions have been altered to fit the custom directive
in assemblyFormat. This has led to their callsites to get modified too,
but those will be removed in later patches, when other operations get
their assemblyFormat. All operations were not changed in one patch for
ease of review.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D120157
These routines will need to be specialized a lot more based on value types,
index types, pointer types, and permutation/dimension ordering. This is a
careful first step, providing some functionality needed in PyTACO bridge.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D120154
This patch removes the following clauses from OpenMP Dialect:
- private
- firstprivate
- lastprivate
- shared
- default
- copyin
- copyprivate
The privatization clauses are being handled in the flang frontend. The
data copying clauses are not being handled anywhere for now. Once
we have a better picture of how to handle these clauses in OpenMP
Dialect, we can add these. For the time being, removing unneeded
clauses.
For detailed discussion about this refer to [[ https://discourse.llvm.org/t/rfc-privatisation-in-openmp-dialect/3526 | Privatisation in OpenMP dialect ]]
Reviewed By: kiranchandramohan, clementval
Differential Revision: https://reviews.llvm.org/D120029
This patch introducing seperating dimensions into two types: Domain and Range.
This allows building relations over PresburgerSpace.
This patch is part of a series of patches to introduce relations in Presburger
library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D119709
MLIR has the notion of allocation scopes which specify that stack allocations (e.g. memref.alloca, llvm.alloca) should be freed or equivalently aren't available at the end of the corresponding region.
Currently neither OpenMP parallel nor SCF parallel regions have the notion of such a scope.
This clearly makes sense for an OpenMP parallel as this is implemented in with a new function which outlines the region, and clearly any allocations in that newly outlined function have a lifetime that ends at the return of the function, by definition.
While SCF.parallel doesn't have a guaranteed runtime which it is implemented with, this similarly makes sense for SCF.parallel since otherwise an allocation within an SCF.parallel will needlessly continue to allocate stack memory that isn't cleaned up until the function (or other allocation scope op) which contains the SCF.parallel returns. This means that it is impossible to represent thread or iteration-local memory without causing a stack blow-up. In the case that this stack-blow-up behavior is intended, this can be equivalently represented with an allocation outside of the SCF.parallel with a size equal to the number of iterations.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119743