Conversions from dialect A to dialect B depend on both A and B. It is therefore
reasonable for them to live in a separate library that depends on both the
DialectA and DialectB libraries and does not force dependents of DialectA or
DialectB to also link in the conversion. Create the directory layout for the
conversions and move the Standard to LLVM dialect conversion as the first
example.
PiperOrigin-RevId: 253312252
This CL adds a generic CopyOp to Linalg and its lowering to loops.
The CopyOp supports input and output permutation maps.
When combined with tiling and allocating a new local buffer, this should provide basic support for implementing simple memory transfers with coalescing.
At the moment, lowering copies to a library call is not supported.
PiperOrigin-RevId: 253250497
Some compilers find initializer list constructors from boolean literals
ambiguous between ArrayRef<bool> and ArrayRef<Attribute>. Call the
ArrayRef<bool> constructor explicitly to disambiguate.
PiperOrigin-RevId: 253224859
This converts entire loops into threads/blocks. No check on the size of the
block or grid, or on the validity of parallelization, is performed. It is the
responsibility of the caller to strip-mine the loops and to perform the
dependence analysis before calling the conversion.
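As a rough illustration of the strip-mining the caller is expected to do, here is a minimal plain C++ sketch. It is not MLIR code: `stripMinedOrder` and its parameters are hypothetical names, and the two nested loops only stand in for the block and thread dimensions that the conversion would produce.

```cpp
#include <cassert>
#include <vector>

// Visits iterations [0, n) as a (block, thread) grid, the shape a caller
// would arrange by strip-mining a single loop by `blockSize`.
// Hypothetical helper for illustration only.
inline std::vector<unsigned> stripMinedOrder(unsigned n, unsigned blockSize) {
  assert(blockSize > 0 && "block size must be positive");
  // Ceiling division: number of blocks needed to cover n iterations.
  const unsigned numBlocks = (n + blockSize - 1) / blockSize;
  std::vector<unsigned> visited;
  for (unsigned block = 0; block < numBlocks; ++block) {      // block id
    for (unsigned thread = 0; thread < blockSize; ++thread) { // thread id
      const unsigned i = block * blockSize + thread;
      if (i < n) // guard the partial last block
        visited.push_back(i);
    }
  }
  return visited;
}
```

With `n = 10` and `blockSize = 4`, the outer loop runs three times and the guard drops the two out-of-range iterations of the last block.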
PiperOrigin-RevId: 253189268
`affine.apply` is supposed to operate on values of index types in the context
of affine loops. It is possible to programmatically construct an `affine.apply`
that takes values of other types as operands or returns them, but it would not
be parseable. Disallow such cases in the verifier.
PiperOrigin-RevId: 253021704
This terminator operation should appear at the end of the blocks in the body
region of `gpu.launch` when the control flow needs to be returned from the
kernel. Using `std.return` in this place is ambiguous: it may exit the body
region or the enclosing function. Furthermore, this allows the GPU dialect to
impose the absence of return values as required by the underlying kernel
execution models.
Update outlining transformation from `gpu.launch` to `gpu.launch_func` so that
it replaces `gpu.return` with `std.return`.
PiperOrigin-RevId: 252985992
Pointer types need to specify the storage class. We use the utility functions
generated from SPV_StorageClassAttr to parse and print the storage classes.
Also improve the handling of the case where no element type is provided for a (runtime) array.
PiperOrigin-RevId: 252935599
This CL adds a generic FillOp to Linalg and its lowering to loops.
This is achieved by not specifying the static NLoopTypes and ViewRanks type traits and instead defining the relevant methods as `extraClassDeclaration`.
The relevant AffineMap and scalar emission code are added, with relevant tests.
This gives us a first rank-agnostic Linalg op with its generic lowering to loops that should compose with view-based tiling and fusion.
PiperOrigin-RevId: 252869205
llvm::maskTrailingOnes<char> runs into a static assertion on the type not being
unsigned. Use `unsigned char` instead of `char`.
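For illustration, a minimal standalone sketch of a helper with the same shape as `llvm::maskTrailingOnes`. The name `maskTrailingOnesSketch` is hypothetical and the body is not copied from LLVM; it only shows why the template argument has to be an unsigned type.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Hypothetical stand-in written for this note, not the LLVM source.
template <typename T>
T maskTrailingOnesSketch(unsigned n) {
  // Plain `char` is signed on many platforms, where this assertion fires.
  static_assert(std::is_unsigned<T>::value, "mask type must be unsigned");
  const unsigned bits = CHAR_BIT * sizeof(T);
  assert(n <= bits && "bit count exceeds type width");
  // Shift an all-ones value right so that only the low n bits remain set.
  return n == 0 ? T(0) : static_cast<T>(T(-1) >> (bits - n));
}
```

Calling `maskTrailingOnesSketch<unsigned char>(3)` compiles everywhere, while `maskTrailingOnesSketch<char>(3)` is rejected wherever `char` is signed.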
PiperOrigin-RevId: 252827214
Missing a spot with std::make_pair causes a compiler error in OSS.
Also fixes the warning:
```
warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
it->getSecond()->getType().isa<BufferType>() &&
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
"Buffer or block argument expected");
```
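A minimal sketch of the kind of fix this warning calls for, using hypothetical names: the `||` operands are grouped explicitly so that the trailing `&& "message"` applies to the whole condition.

```cpp
#include <cassert>

// Hypothetical predicate standing in for the real check. The message string
// always converts to true, so it only documents the condition.
inline bool isBufferOrBlockArgument(bool isBuffer, bool isBlockArgument) {
  // Without the inner parentheses, `a || b && "msg"` parses as
  // `a || (b && "msg")`, which is what -Wparentheses points out.
  return (isBuffer || isBlockArgument) && "Buffer or block argument expected";
}

// The same grouping inside an assertion.
inline void checkOperand(bool isBuffer, bool isBlockArgument) {
  assert((isBuffer || isBlockArgument) && "Buffer or block argument expected");
  (void)isBuffer;        // silence unused-parameter warnings when
  (void)isBlockArgument; // assertions are compiled out
}
```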
PiperOrigin-RevId: 252738323
1) Lowest minimum pattern stack depth when legalizing.
- This leads the system to favor patterns that have lower legalization stacks, i.e. represent a more direct mapping to the target.
2) Pattern benefit.
- When considering multiple patterns with the same legalization depth, this favors patterns with a larger specified benefit.
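The two criteria above can be sketched as a single comparison. This is a standalone illustration only: `PatternInfo`, its fields, and `orderPatterns` are hypothetical and do not reflect the framework's actual data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of a candidate pattern, assumed for this sketch.
struct PatternInfo {
  std::string name;
  unsigned legalizationDepth; // minimum pattern stack depth when legalizing
  unsigned benefit;           // specified pattern benefit
};

// Orders candidates: smaller legalization depth first, then larger benefit.
inline void orderPatterns(std::vector<PatternInfo> &patterns) {
  std::stable_sort(patterns.begin(), patterns.end(),
                   [](const PatternInfo &lhs, const PatternInfo &rhs) {
                     if (lhs.legalizationDepth != rhs.legalizationDepth)
                       return lhs.legalizationDepth < rhs.legalizationDepth;
                     return lhs.benefit > rhs.benefit;
                   });
}
```

A stable sort keeps the original relative order of patterns that tie on both criteria.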
PiperOrigin-RevId: 252713470
This CL adds a fusion pass for the Linalg dialect.
Fusion is backed by a simple analysis on SSA values and proceeds as follows:
1. Dependence and alias analyses are performed on views.
2. A Linalg op is tiled by a particular tile size. This creates a new Linalg op operating on tiled loops and tiled views.
3. The dependence analysis is used to obtain ops that produce views that are consumed by the original Linalg op.
4. Dependence analysis is used to determine whether op-level fusion would violate any dependence.
5. If fusion is safe, matching tiled views are sliced for the producing op.
6. A tiled clone of the producer op is written before the tiled consumer op.
If a producer is fused, its entire output view has been computed in tiled form.
The original producer op is then erased.
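The effect of steps 2, 5 and 6 can be sketched on plain arrays. This is only an analogy under simplifying assumptions: the producer and consumer are elementwise, so fusion is trivially safe, and none of the names below come from Linalg.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused reference: the producer fills all of `tmp`, then the consumer runs.
inline std::vector<int> runUnfused(const std::vector<int> &in) {
  std::vector<int> tmp(in.size()), out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    tmp[i] = 2 * in[i]; // producer
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = tmp[i] + 1; // consumer
  return out;
}

// Tiled and fused: for each consumer tile, the matching producer tile is
// computed immediately before it.
inline std::vector<int> runTiledFused(const std::vector<int> &in,
                                      std::size_t tileSize) {
  assert(tileSize > 0 && "tile size must be positive");
  std::vector<int> tmp(in.size()), out(in.size());
  for (std::size_t begin = 0; begin < in.size(); begin += tileSize) {
    const std::size_t end = std::min(begin + tileSize, in.size());
    for (std::size_t i = begin; i < end; ++i)
      tmp[i] = 2 * in[i]; // tiled clone of the producer
    for (std::size_t i = begin; i < end; ++i)
      out[i] = tmp[i] + 1; // tiled consumer
  }
  return out;
}
```

Once every tile has run, the whole of `tmp` has been produced in tiled form, which is why the untiled producer loop is no longer needed.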
PiperOrigin-RevId: 252695194
This CL adds a lowering to LLVM for MatmulOp and a corresponding integration test.
View descriptor manipulation is moved from MLIR's LLVM dialect to C++ code compiled on the side. To this end a separation is introduced between `cblas.cpp` and `cblas_interface.cpp`, the latter operating on view types whose ABI corresponds to the LLVM signature generated by MLIR.
An intermediary step is introduced that allocates a new descriptor on the MLIR side for the purpose of passing it to LLVM. The reason for this extra step is that the ABI for by-value ViewType objects wants aligned descriptors, e.g.:
```
extern "C" void linalg_dot_impl(ViewType<float, 1> X, ViewType<float, 1> Y,
BaseViewType<float> Z) {
...
}
```
produces LLVM IR with the signature:
```
%struct.ViewType = type { %struct.BaseViewType, [1 x i64], [1 x i64] }
%struct.BaseViewType = type { float*, i64 }
define void @linalg_dot_impl(%struct.ViewType* byval align 8, %struct.ViewType* byval align 8, float*, i64) #0 {
...
}
```
We do not currently seem to be able to make such aligned allocations in the MLIR -> LLVM converter.
Going through a level of indirection allows the test to pass.
The temporary tradeoff is that the MLIR shims have to be written by hand.
They will disappear in the future.
PiperOrigin-RevId: 252670672
This introduces the support for region-containing operations to the dialect
conversion framework in order to support the conversion of affine control-flow
operations into the standard control flow with branches. Regions that belong
to an operation are converted before the operation itself. The
DialectConversionPattern can therefore access the converted regions of the
original operation and process them further if necessary. In particular, the
conversion is allowed to move the blocks from the original region to other
regions and to split blocks into multiple blocks. All block manipulations must
be performed through the PatternRewriter to ensure they will be undone if the
conversion fails.
Port the pass converting from the affine dialect (loops and ifs with bodies as
regions) to the standard dialect (branch-based cfg) to use DialectConversion in
order to exercise this new functionality. The modifications to the lowering
functions are minor and are focused on using the PatternRewriter instead of
directly modifying the IR.
PiperOrigin-RevId: 252625169
This allows us to have SPIRVOps.td as the single entry point for
all SPIR-V ops, which simplifies downstream users and build rules.
PiperOrigin-RevId: 252609258
This CL exposes a parseType method which allows standalone reuse of the MLIR type parsing mechanism. This is a free function for now because the underlying MLIR parser is not guaranteed to receive a StringRef which lives in the proper MemBuffer. This requires building a new MemBuffer/SourceMgr and modifying the Parser constructor to not require an mlir::Module.
The error diagnostic emitted by parseType has context limited to the local string.
For now the dialect has the additional option to emit its own extra error that has the FileLineColLoc context.
In the future, both error messages should be combined into a single error.
PiperOrigin-RevId: 252468911
This CL enables verification code generation for variadic operands and results.
In verify(), we use fallback getter methods to access all the dynamic values
belonging to one static variadic operand/result to reuse the value range
calculation there.
PiperOrigin-RevId: 252288219
This CL added getODSOperands() and getODSResults() as fallback getter methods for
getting all the dynamic values corresponding to a static operand/result (which
can be variadic). It should provide a uniform way of calculating the value ranges.
All named getter methods are layered on top of these methods now.
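A minimal sketch of such a value range calculation, under an assumption the text above does not state: every variadic operand holds the same number of dynamic values. `valueRange` and its parameters are hypothetical names, not the generated API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns {start, length} of the dynamic values that belong to static
// operand `index`. `isVariadic[i]` says whether static operand i is variadic.
// Assumption for this sketch: all variadic operands have equal size.
inline std::pair<unsigned, unsigned>
valueRange(const std::vector<bool> &isVariadic, unsigned numDynamicValues,
           unsigned index) {
  assert(index < isVariadic.size() && "static operand index out of range");
  unsigned numVariadic = 0, variadicBefore = 0;
  for (unsigned i = 0; i < isVariadic.size(); ++i) {
    if (!isVariadic[i])
      continue;
    ++numVariadic;
    if (i < index)
      ++variadicBefore;
  }
  const unsigned numFixed =
      static_cast<unsigned>(isVariadic.size()) - numVariadic;
  assert(numDynamicValues >= numFixed && "too few dynamic values");
  // Each variadic operand gets an equal share of the non-fixed values.
  const unsigned variadicSize =
      numVariadic == 0 ? 0 : (numDynamicValues - numFixed) / numVariadic;
  // Fixed operands before `index` take one slot each; variadic ones take
  // `variadicSize` slots each.
  const unsigned start =
      (index - variadicBefore) + variadicBefore * variadicSize;
  const unsigned length = isVariadic[index] ? variadicSize : 1;
  return {start, length};
}
```

For a layout of one fixed operand, one variadic operand and another fixed operand with five dynamic values, the variadic operand covers values 1 through 3 and the last fixed operand is value 4.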
PiperOrigin-RevId: 252284270
Enum attributes can be defined using `EnumAttr`, which requires all its cases
to be defined with `EnumAttrCase`. To facilitate the interaction between
`EnumAttr`s and their C++ consumers, add a new EnumsGen TableGen backend
to generate a few common utilities, including an enum class, `llvm::DenseMapInfo`
for the enum class, and conversion functions from/to strings.
This is controlled via the `-gen-enum-decls` and `-gen-enum-defs` command-line
options of `mlir-tblgen`.
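To give a feel for the kind of utilities described, here is a hand-written sketch for a hypothetical `Color` enum. It is not output of `mlir-tblgen`: the function names and signatures are assumptions, and the `llvm::DenseMapInfo` specialization is left out so that the example has no LLVM dependency.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum class; real cases would come from `EnumAttrCase`s.
enum class Color { Red, Green, Blue };

// Enum to string conversion.
inline std::string stringifyColor(Color value) {
  switch (value) {
  case Color::Red:
    return "Red";
  case Color::Green:
    return "Green";
  case Color::Blue:
    return "Blue";
  }
  return "";
}

// String to enum conversion. Returns false and leaves `result` untouched
// when the string does not name a case.
inline bool symbolizeColor(const std::string &str, Color &result) {
  if (str == "Red") {
    result = Color::Red;
    return true;
  }
  if (str == "Green") {
    result = Color::Green;
    return true;
  }
  if (str == "Blue") {
    result = Color::Blue;
    return true;
  }
  return false;
}
```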
PiperOrigin-RevId: 252209623