This will help refactor some of the tools in preparation for the explicit registration of
Dialects.
Differential Revision: https://reviews.llvm.org/D86023
This changes the behavior of MLIRContext construction: it no longer loads globally registered dialects. Instead, Dialects are only loaded explicitly on demand:
- the Parser lazily loads Dialects in the context as it encounters them during parsing. This is the only reason to register dialects without loading them in the context.
- Passes are expected to declare the dialects from which they will create entities (Operations, Attributes, or Types), and the PassManager loads these Dialects into the Context when starting a pipeline.
This change simplifies the configuration of the registration: a compiler only needs to load the dialect for the IR it will emit, and the optimizer is self-contained and loads the required Dialects. For example, in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context; all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled.
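The split between registering a dialect and loading it can be sketched as follows. This is a minimal model, not the MLIR implementation: `Registry`, `Context`, `Dialect`, and `getOrLoad` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a dialect; not the MLIR class.
struct Dialect {
  std::string name;
};

// Registration only records how to construct a dialect; nothing is built yet.
class Registry {
public:
  void registerDialect(const std::string &name) {
    factories[name] = [name] {
      return std::make_unique<Dialect>(Dialect{name});
    };
  }

  // Returns a fresh dialect, or nullptr if the name was never registered.
  std::unique_ptr<Dialect> create(const std::string &name) const {
    auto it = factories.find(name);
    if (it == factories.end())
      return nullptr;
    return it->second();
  }

private:
  std::map<std::string, std::function<std::unique_ptr<Dialect>()>> factories;
};

// The context starts empty and loads a dialect the first time it is requested,
// e.g. by a parser that meets an unknown prefix or by a pass pipeline at startup.
class Context {
public:
  explicit Context(const Registry &registry) : registry(registry) {}

  Dialect *getOrLoad(const std::string &name) {
    auto it = loaded.find(name);
    if (it != loaded.end())
      return it->second.get();
    std::unique_ptr<Dialect> dialect = registry.create(name);
    if (!dialect)
      return nullptr;
    Dialect *result = dialect.get();
    loaded[name] = std::move(dialect);
    return result;
  }

  bool isLoaded(const std::string &name) const {
    return loaded.count(name) != 0;
  }

private:
  const Registry &registry;
  std::map<std::string, std::unique_ptr<Dialect>> loaded;
};
```

In this sketch, constructing a `Context` loads nothing; only the dialects that are actually requested end up in the context.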
Differential Revision: https://reviews.llvm.org/D85622
It appears in this case that an implicit cast from StringRef to std::string
doesn't happen. Fixed with an explicit cast.
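A self-contained analogue of this kind of fix, using `std::string_view` as a stand-in for StringRef (an assumption made only to keep the example free of LLVM headers; `std::string_view` likewise has no implicit conversion to `std::string`). The function names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Takes an owning std::string by value.
inline std::string makeLabel(std::string name) { return "dialect:" + name; }

inline std::string labelFor(std::string_view ref) {
  // makeLabel(ref) would not compile here: the conversion must be spelled out.
  return makeLabel(std::string(ref));
}
```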
Differential Revision: https://reviews.llvm.org/D85986
This library does not depend on all the dialects, conceptually. This
changes the recently introduced `mlirContextLoadAllDialects()` function
so that it no longer calls `registerAllDialects()` itself, which aligns it better with
the C++ code anyway (and this is deprecated and will be removed soon).
The conversion of memref cast operations from the Standard dialect to the LLVM
dialect has been emitting bitcasts from a struct type to itself. Beyond being
useless, such casts are invalid as bitcast does not operate on aggregate types.
This kept working by accident because the LLVM IR bitcast construction API skips
the construction if types are equal before it verifies that the types are
acceptable in a bitcast. Do not emit such bitcasts: a memref cast that only
adds/erases size information is in fact a noop on the current descriptor as it
always contains dynamic values for all sizes.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D85899
Masked loading/storing in various forms can be optimized
into simpler memory operations when the mask is all true
or all false. Note that the backend does similar optimizations
but doing this early may expose more opportunities for further
optimizations. This further prepares progressively lowering
transfer read and write into 1-D memory operations.
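The folding rule can be illustrated with a small runtime model. This is only a sketch: the real patterns operate on IR and inspect how the mask is defined, whereas here the mask is a plain `std::vector<bool>`, and `classifyMask` and `maskedLoad` are hypothetical names. It assumes `passThru` has as many elements as the mask and that memory holds at least that many.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class MaskKind { AllTrue, AllFalse, Mixed };

// Classifies a mask; an empty mask is reported as AllTrue.
inline MaskKind classifyMask(const std::vector<bool> &mask) {
  auto isSet = [](bool bit) { return bit; };
  if (std::all_of(mask.begin(), mask.end(), isSet))
    return MaskKind::AllTrue;
  if (std::none_of(mask.begin(), mask.end(), isSet))
    return MaskKind::AllFalse;
  return MaskKind::Mixed;
}

// Models a masked load: lanes with a set bit read memory, the others keep
// the pass-through value.
inline std::vector<float> maskedLoad(const std::vector<float> &memory,
                                     const std::vector<bool> &mask,
                                     const std::vector<float> &passThru) {
  switch (classifyMask(mask)) {
  case MaskKind::AllTrue:
    // Equivalent to a plain, unmasked load of mask.size() elements.
    return std::vector<float>(memory.begin(), memory.begin() + mask.size());
  case MaskKind::AllFalse:
    // No memory access at all: the result is the pass-through value.
    return passThru;
  case MaskKind::Mixed:
    break;
  }
  std::vector<float> result = passThru;
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      result[i] = memory[i];
  return result;
}
```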
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D85769
-- This commit handles the returnOp in memref map layout normalization.
-- An initial filter is applied to FuncOps to identify the functions that are
suitable candidates for memref normalization, i.e. those whose normalization does not lead to invalid IR.
-- Handles memref map normalization for external functions, assuming the external function
is normalizable.
Differential Revision: https://reviews.llvm.org/D85226
This revision removes all of the lingering usages of Type::getKind. A consequence of this is that FloatType is now split into 4 derived types that represent each of the possible float types (BFloat16Type, Float16Type, Float32Type, and Float64Type). Other than this split, this revision is NFC.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D85566
These hooks were introduced before the Interfaces mechanism was available.
DialectExtractElementHook is unused and entirely removed. The
DialectConstantFoldHook is used as a fallback in the
operation fold() method, and is replaced by a DialectInterface.
The DialectConstantDecodeHook is used for interpreting OpaqueAttribute
and should be revamped, but is replaced with an interface in 1:1 fashion
for now.
Differential Revision: https://reviews.llvm.org/D85595
- Add "using namespace mlir::tblgen" in several of the TableGen/*.cpp files and
eliminate the `tblgen::` prefix to reduce code clutter.
Differential Revision: https://reviews.llvm.org/D85800
Provide printing functions for most IR objects in C API (except Region that
does not have a `print` function, and Module that is expected to be printed as
Operation instead). The printing is based on a callback that is called with
chunks of the string representation and forwarded user-defined data.
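A callback-based printer of this shape might look like the sketch below. The names `StringCallback`, `Printable`, `printObject`, and `appendToString` are hypothetical and are not the C API declarations; the point is only the pattern of receiving chunks together with forwarded user data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback type: a chunk pointer, its length, and the user data
// pointer that the caller passed to the print function.
using StringCallback = void (*)(const char *chunk, std::size_t length,
                                void *userData);

// Hypothetical printable object whose textual form is produced in pieces.
struct Printable {
  std::vector<std::string> pieces;
};

// Emits the textual form chunk by chunk; the printer never needs to know
// where the text ends up.
inline void printObject(const Printable &object, StringCallback callback,
                        void *userData) {
  for (const std::string &piece : object.pieces)
    callback(piece.data(), piece.size(), userData);
}

// Example callback: userData is expected to point to a std::string buffer.
inline void appendToString(const char *chunk, std::size_t length,
                           void *userData) {
  static_cast<std::string *>(userData)->append(chunk, length);
}
```

Chunks are passed with an explicit length, so the callback does not have to assume null termination.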
Reviewed By: stellaraccident, Jing, mehdi_amini
Differential Revision: https://reviews.llvm.org/D85748
Using intptr_t is a consensus for MLIR C API, but the change was missing
from 75f239e975 (that was using unsigned initially) due to a
misrebase.
Reviewed By: stellaraccident, mehdi_amini
Differential Revision: https://reviews.llvm.org/D85751
This patch adds the translation of the proc_bind clause in a
parallel operation.
The values that can be specified for the proc_bind clause are
specified in the OMP.td tablegen file in the llvm/Frontend/OpenMP
directory. From this single source of truth, an enumeration for
proc_bind is generated in llvm and mlir (used in the specification of
the parallel Operation in the OpenMP dialect). A function to return
the enum value from the string representation is also generated.
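For illustration, a hand-written equivalent of such an enumeration and string lookup could look as follows. The identifiers `ProcBindKind` and `parseProcBind` are hypothetical and do not match the generated names; the three values are the proc_bind kinds of the OpenMP specification at the time.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical enumeration of proc_bind kinds.
enum class ProcBindKind { Master, Close, Spread };

// Returns the enum value for a clause string, or std::nullopt if unknown.
inline std::optional<ProcBindKind> parseProcBind(std::string_view text) {
  if (text == "master")
    return ProcBindKind::Master;
  if (text == "close")
    return ProcBindKind::Close;
  if (text == "spread")
    return ProcBindKind::Spread;
  return std::nullopt;
}
```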
A new header file (DirectiveEmitter.h) containing definitions of
classes directive, clause, clauseval etc is created so that it can
be used in mlir as well.
Reviewers: clementval, jdoerfert, DavidTruby
Differential Revision: https://reviews.llvm.org/D84347
Initial conversion of `spv._address_of` and `spv.globalVariable`.
In SPIR-V, the global returns a pointer, whereas in LLVM dialect
the global holds an actual value. This difference is handled by
`spv._address_of` and `llvm.mlir.addressof` ops that both return
a pointer. Moreover, only the current invocation is in the conversion's
scope.
Reviewed By: antiagainst, mravishankar
Differential Revision: https://reviews.llvm.org/D84626
Now that LLVM dialect types are implemented directly in the dialect, we can use
MLIR hooks for verifying type construction invariants. Implement the verifiers
and use them in the parser.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D85663
Linalg to processors.
This change adds infrastructure to distribute the loops generated in
Linalg to processors at the time of generation. This addresses the use
case where loops are instantiated just to be distributed.
The option to distribute is added to TilingOptions for now and
will allow specifying the distribution as a transformation option,
just like tiling and promotion are specified as options.
Differential Revision: https://reviews.llvm.org/D85147
- Fix ODS framework to suppress build methods that infer result types and are
ambiguous with collective variants. This applies to operations with a single variadic
input whose result types can be inferred.
- Extended OpBuildGenTest to test these kinds of ops.
Differential Revision: https://reviews.llvm.org/D85060
Previously, the memory was leaked on the heap. Since the MlirOperationState is not intended to be used again after mlirOperationCreate, the patch simply frees the memory in mlirOperationCreate instead of creating any new API.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D85629
This diff attempts to resolve the TODO in `getOpIndexSet` (formerly
known as `getInstIndexSet`), which states "Add support to handle IfInsts
surrounding `op`".
Major changes in this diff:
1. Overload `getIndexSet`. The overloaded version considers both
`AffineForOp` and `AffineIfOp`.
2. The `getInstIndexSet` is updated accordingly: its name is changed to
`getOpIndexSet` and its implementation is based on a new API `getIVs`
instead of `getLoopIVs`.
3. Add `addAffineIfOpDomain` to `FlatAffineConstraints`, which extracts
new constraints from the integer set of `AffineIfOp` and merges them into
the current constraint system.
4. Update how a `Value` is determined as dim or symbol for
`ValuePositionMap` in `buildDimAndSymbolPositionMaps`.
Differential Revision: https://reviews.llvm.org/D84698
This patch also fixes a minor issue that shape.rank should allow
returning !shape.size. The dialect doc has such an example for
shape.rank.
Differential Revision: https://reviews.llvm.org/D85556
This reverts commit 9f24640b7e.
We hit some deadlocks on thread exit in some configurations: the TLS exit handler is taking a lock.
Temporarily reverting this change while we debug what is going on.
This revision aims to provide a new API, `checkTilingLegality`, to
verify that the loop tiling result still satisfies the dependence
constraints of the original loop nest.
Previously, there was no check for the validity of tiling. For instance:
```
func @diagonal_dependence() {
  %A = alloc() : memref<64x64xf32>
  affine.for %i = 0 to 64 {
    affine.for %j = 0 to 64 {
      %0 = affine.load %A[%j, %i] : memref<64x64xf32>
      %1 = affine.load %A[%i, %j - 1] : memref<64x64xf32>
      %2 = addf %0, %1 : f32
      affine.store %2, %A[%i, %j] : memref<64x64xf32>
    }
  }
  return
}
```
You can find more information about this example in Section 3.11
of [1].
In general, there are three types of dependences here: two flow
dependences, one in direction `(i, j) = (0, 1)` (notation that depicts a
vector in the 2D iteration space), one in `(i, j) = (1, -1)`; and one
anti dependence in the direction `(-1, 1)`.
Since two of them are along the diagonal in opposite directions, the
default tiling method in `affine`, which tiles the iteration space into
rectangles, will violate the legality condition proposed by Irigoin and
Triolet [2]. [2] implies two tiles cannot depend on each other, while in
the `affine` tiling case, two rectangles along the same diagonal are
indeed dependent, which simply violates the rule.
This diff attempts to put together a validator that checks whether the
rule from [2] is violated or not when applying the default tiling method
in `affine`.
The canonical way to perform such validation is by examining the effect
from adding the constraint from Irigoin and Triolet to the existing
dependence constraints.
Since we already have the prior knowledge that `affine` tiles in a
hyper-rectangular way, and the resulting tiles will be scheduled in the
same order as their respective loop indices, we can simplify the
solution to just checking whether all dependence components are
non-negative along the tiling dimensions.
We put this algorithm into a new API called `checkTilingLegality` under
`LoopTiling.cpp`. This function iterates over every `load`/`store` pair, and
if there is any dependence between them, we get the dependence component
and check whether it has any negative component. This function returns
`failure` if the legality condition is violated.
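The simplified check described above can be sketched on its own, outside of MLIR. This is a minimal model under stated assumptions: each dependence is given as a vector of exact integer distances, one per tiled loop, whereas the real dependence components may be ranges. `Dependence` and `isTilingLegal` are hypothetical names, not the API added by this revision.

```cpp
#include <cassert>
#include <vector>

// One dependence between two accesses; entry k is the distance along loop k.
using Dependence = std::vector<long>;

// Rectangular tiling is accepted only if no dependence has a negative
// component along any tiled dimension.
inline bool isTilingLegal(const std::vector<Dependence> &dependences) {
  for (const Dependence &dependence : dependences)
    for (long component : dependence)
      if (component < 0)
        return false;
  return true;
}
```

For the example above, the dependence in direction `(1, -1)` has a negative component along `j`, so the check rejects the tiling.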
[1]. Bondhugula, Uday. Effective Automatic parallelization and locality optimization using the Polyhedral model. https://dl.acm.org/doi/book/10.5555/1559029
[2]. Irigoin, F. and Triolet, R. Supernode Partitioning. https://dl.acm.org/doi/10.1145/73560.73588
Differential Revision: https://reviews.llvm.org/D84882
Implement the Reduction Tree Pass framework as part of the MLIR Reduce tool. This is a parameterizable pass that allows for the implementation of custom reduction passes in the tool.
Implement the FunctionReducer class as an example of a Reducer class parameter for the instantiation of a Reduction Tree Pass.
Create a pass pipeline with a Reduction Tree Pass with the FunctionReducer class specified as parameter.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D83969
This also beefs up the test coverage:
- Make unranked memref testing consistent with ranked memrefs.
- Add testing for the invalid element type cases.
This is not quite NFC: index types are now allowed in unranked memrefs.
Differential Revision: https://reviews.llvm.org/D85541
This simple patch translates the num_threads and if clauses of the parallel
operation. Also includes test cases.
A minor change was made to the parsing of the if clause to parse AnyType and
return the parsed type. Test cases are updated accordingly.
Reviewed by: SouraVX
Differential Revision: https://reviews.llvm.org/D84798
This revision refactors the default definition of the attribute and type `classof` methods to use the TypeID of the concrete class instead of invoking the `kindof` method. The TypeID is already used as part of uniquing, and this allows for removing the need for users to define any of the type casting utilities themselves.
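A minimal sketch of a `classof` that compares type identifiers instead of kinds is given below. All classes here (`TypeID`, `Storage`, `Type`, `TypeBase`, `IndexType`, `NoneType`) are simplified, hypothetical stand-ins and not the MLIR definitions.

```cpp
#include <cassert>

// Hypothetical TypeID: the address of a per-type static serves as identity.
class TypeID {
public:
  template <typename T> static TypeID get() {
    static char id;
    return TypeID(&id);
  }
  bool operator==(const TypeID &other) const { return ptr == other.ptr; }

private:
  explicit TypeID(const void *ptr) : ptr(ptr) {}
  const void *ptr;
};

// Uniqued storage records the TypeID of the concrete class that created it.
struct Storage {
  TypeID id;
};

class Type {
public:
  explicit Type(const Storage *impl) : impl(impl) {}
  TypeID getTypeID() const { return impl->id; }
  template <typename U> bool isa() const { return U::classof(*this); }

private:
  const Storage *impl;
};

// The CRTP base supplies a default classof, so concrete types define nothing.
template <typename ConcreteT> class TypeBase : public Type {
public:
  using Type::Type;
  static bool classof(Type type) {
    return type.getTypeID() == TypeID::get<ConcreteT>();
  }
  static Storage makeStorage() { return Storage{TypeID::get<ConcreteT>()}; }
};

class IndexType : public TypeBase<IndexType> {
public:
  using TypeBase::TypeBase;
};

class NoneType : public TypeBase<NoneType> {
public:
  using TypeBase::TypeBase;
};
```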
Differential Revision: https://reviews.llvm.org/D85356
Subclass data is useful when a certain amount of memory is allocated, but not all of it is used. In the case of Type, that hasn't been the case for a while and the subclass data is just taking up a full `unsigned`. Removing this frees up ~8 bytes for almost every type instance.
Differential Revision: https://reviews.llvm.org/D85348
This class allows for defining thread local objects that have a set non-static lifetime. The internals of the cache use a static thread_local map between the various different non-static objects and the desired value type. When a non-static object destructs, it simply nulls out the entry in the static map. This will leave an entry in the map, but erase any of the data for the associated value. The current use cases for this are in the MLIRContext, meaning that the number of items in the static map is ~1-2, which isn't costly enough to warrant the complexity of pruning. If a use case arises that requires pruning of the map, the functionality can be added.
This is especially useful in the context of MLIR for implementing thread-local caching of context level objects that would otherwise have very high lock contention. This revision adds a thread local cache in the MLIRContext for attributes, identifiers, and types to reduce some of the locking burden. This led to a speedup of several hundred milliseconds when compiling a conversion pass on a very large mlir module (>300K operations).
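One way to obtain this behavior is sketched below. It is an assumption about a possible design, not a copy of the class added here: the static thread_local map holds `std::weak_ptr` entries keyed by the owning object, and the owner keeps the values alive through `std::shared_ptr`, so destroying the owner empties the data while the map entries remain.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <vector>

template <typename ValueT> class ThreadLocalCache {
public:
  ThreadLocalCache() = default;
  ThreadLocalCache(const ThreadLocalCache &) = delete;
  ThreadLocalCache &operator=(const ThreadLocalCache &) = delete;

  // Returns the calling thread's value for this cache object, creating a
  // value-initialized one on first use.
  ValueT &get() {
    std::weak_ptr<ValueT> &slot = perThreadMap()[this];
    if (std::shared_ptr<ValueT> live = slot.lock())
      return *live;
    auto value = std::make_shared<ValueT>();
    slot = value;
    // The owner keeps the value alive; the lock is only taken on creation.
    std::lock_guard<std::mutex> lock(instanceMutex);
    instances.push_back(value);
    return *value;
  }

  // Number of entries (live or expired) in the calling thread's map.
  static std::size_t entriesOnThisThread() { return perThreadMap().size(); }

private:
  static std::map<const ThreadLocalCache *, std::weak_ptr<ValueT>> &
  perThreadMap() {
    static thread_local std::map<const ThreadLocalCache *,
                                 std::weak_ptr<ValueT>>
        cache;
    return cache;
  }

  std::mutex instanceMutex;
  // Destroying the owner releases these, which expires the weak_ptr entries.
  std::vector<std::shared_ptr<ValueT>> instances;
};
```

If a later owner happens to be allocated at the same address, it finds an expired entry and simply creates a fresh value, so stale entries are harmless in this sketch.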
Differential Revision: https://reviews.llvm.org/D82597
This allows for bucketing the different possible storage types, with each bucket having its own allocator/mutex/instance map. This greatly reduces the amount of lock contention when multi-threading is enabled. On some non-trivial .mlir modules (>300K operations), this led to a compile time decrease of a single conversion pass by around half a second (>25%).
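The bucketing idea can be sketched with a small uniquer in which every kind of storage has its own mutex and instance set. The class `ShardedUniquer` and its string keys are hypothetical simplifications; the per-bucket allocator mentioned above is omitted.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <unordered_set>

class ShardedUniquer {
public:
  // Buckets are registered up front, so the bucket table itself is only read
  // once uniquing starts.
  void registerKind(const std::string &kind) {
    buckets[kind] = std::make_unique<Bucket>();
  }

  // Returns a stable pointer to the unique instance for `key` within `kind`.
  // Only the mutex of that bucket is taken, so lookups for different kinds do
  // not contend with each other. Throws std::out_of_range for unknown kinds.
  const std::string *get(const std::string &kind, const std::string &key) {
    Bucket &bucket = *buckets.at(kind);
    std::lock_guard<std::mutex> lock(bucket.mutex);
    return &*bucket.instances.insert(key).first;
  }

private:
  struct Bucket {
    std::mutex mutex;
    std::unordered_set<std::string> instances;
  };
  std::unordered_map<std::string, std::unique_ptr<Bucket>> buckets;
};
```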
Differential Revision: https://reviews.llvm.org/D82596
This change adds the initial support needed to generate OpenCL-compliant SPIR-V.
If the Kernel capability is declared, the memory model becomes OpenCL.
If the Addresses capability is declared, the addressing model becomes Physical64.
Additionally, for the Kernel capability, interface variable ABI attributes are not
generated, as the entry point function is expected to have normal arguments.
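The selection rule can be written down as a small function. This is a sketch: `Models` and `selectModels` are hypothetical names, and the fallback values GLSL450 and Logical used when neither capability is declared are an assumption not stated in this message.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class MemoryModel { GLSL450, OpenCL };
enum class AddressingModel { Logical, Physical64 };

struct Models {
  MemoryModel memory;
  AddressingModel addressing;
};

// Picks the memory and addressing models from the declared capabilities.
inline Models selectModels(const std::set<std::string> &capabilities) {
  // Assumed defaults when neither capability is present.
  Models models{MemoryModel::GLSL450, AddressingModel::Logical};
  if (capabilities.count("Kernel"))
    models.memory = MemoryModel::OpenCL;
  if (capabilities.count("Addresses"))
    models.addressing = AddressingModel::Physical64;
  return models;
}
```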
Differential Revision: https://reviews.llvm.org/D85196