This CL also updates to use containing region as a fallback way to find
context since functions will eventually become ops with regions.
PiperOrigin-RevId: 253627322
Index types integers of platform-specific bit width. They are used to index
memrefs and as loop induction variables, however they could not be obtained
from an integer until now, making it virtually impossible to express indirect
accesses (given that memrefs of indices are not allowed) or data-dependent
loops. Introduce `std.index_cast` to transform indices into integers and vice
versa. The semantics of this cast is to sign-extend when casting to a wider
integer, and to truncate when casting to a narrower integer. It belongs to
StandardOps because both types it operates on are standard types, and because
its results are likely to be used in std.load and std.store.
Introduce llvm.sext, llvm.zext and llvm.trunc operations to the LLVM dialect.
Provide the conversion of `std.index_cast` to llvm.sext or llvm.trunc,
depending on the actual bitwidth of `index` known during the conversion.
PiperOrigin-RevId: 253624100
Arguably, this function is only useful for transformations and should not
pollute the main IR. Also make sure it accepts a the resulting container
by-reference instead of returning it.
PiperOrigin-RevId: 253622981
Conversions from dialect A to dialect B depend on both A and B. Therefore, it
is reasonable for them to live in a separate library that depends on both
DialectA and DialectB library, and does not forces dependees of DialectA or
DialectB to also link in the conversion. Create the directory layout for the
conversions and move the Standard to LLVM dialect conversion as the first
example.
PiperOrigin-RevId: 253312252
This CL adds a generic CopyOp to Linalg and its lowering to loops.
The CopyOp supports input and output permutation maps.
When combined with tiling and allocating a new local buffer, this should provide basic support for implementing simple memory transfers with coalescing.
At the moment, lowering copies to a library call is not supported.
PiperOrigin-RevId: 253250497
Some compilers find initializer list constructors from boolean literals
ambiguous between ArrayRef<bool> and ArrayRef<Attribute>. Call the
ArrayRef<bool> constructor explicitly to disambiguate.
PiperOrigin-RevId: 253224859
This converts entire loops into threads/blocks. No check on the size of the
block or grid, or on the validity of parallelization is performed, it is under
the responsibility of the caller to strip-mine the loops and to perform the
dependence analysis before calling the conversion.
PiperOrigin-RevId: 253189268
`affine.apply` is supposed to operate on values of index types in context of
affine loops. It is possible to programmatically constuct an `affine.apply`
that takes values of other types as operands or returns them, but it would not
be parseable. Disallow such cases in the verifier.
PiperOrigin-RevId: 253021704
This terminator operation should appear at the end of the blocks in the body
region of `gpu.launch` when the control flow needs to be returned from the
kernel. Using `std.return` in this place is ambiguous: it may exit the body
region or the enclosing function. Furthermore, this allows the GPU dialect to
impose the absence of return values as required by the underlying kernel
execution models.
Update outlining transformation from `gpu.launch` to `gpu.launch_func` so that
it replaces `gpu.return` with `std.return`.
PiperOrigin-RevId: 252985992
Pointer types need to specify the storage class. We use the utility functions
generated from SPV_StorageClassAttr to parse and print the storage classes.
Also improved the case that no element type is provided for (runtime) array.
PiperOrigin-RevId: 252935599
This CL adds a generic FillOp to Linalg and its lowering to loops.
This is achieved by avoiding to specify the static NLoopTypes and ViewRanks type traits but instead defines the relevant methods as `extraClassDeclaration`.
The relevant AffineMap and scalar emission code are added, with relevant tests.
This gives us a first rank-agnostic Linalg op with its generic lowering to loops that should compose with view-based tiling and fusion.
PiperOrigin-RevId: 252869205
llvm::maskTrailingOnes<char> runs into a static assertion on the type not being
unsigned. Use `unsigned char` instead of `char`.
PiperOrigin-RevId: 252827214
Missing a spot with std::make_pair causes a compiler error in OSS.
Also fixes the warning:
```
warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
it->getSecond()->getType().isa<BufferType>() &&
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
"Buffer or block argument expected");
```
PiperOrigin-RevId: 252738323
1) Lowest minimum pattern stack depth when legalizing.
- This leads the system to favor patterns that have lower legalization stacks, i.e. represent a more direct mapping to the target.
2) Pattern benefit.
- When considering multiple patterns with the same legalization depth, this favors patterns with a larger specified benefit.
PiperOrigin-RevId: 252713470
This CL adds a fusion pass for the Linalg dialect.
Fusion is backed by a simple analysis on SSA values and proceeds as follows:
1. A dependence and alias analyses are performed on views.
2. A Linalg op is tiled by a particular tile size. This creates a new Linalg op operating on tiled loops and tiled views.
3. The dependence analysis is used to obtain ops that produce views that are consumed by the original Linalg op.
4. Dependence analysis is used to determine whether op-level fusion would violate any dependence.
5. If fusion is safe, matching tiled views are sliced for the producing op.
6. A tiled clone of the producer op is written before the tiled consumer op.
If a producer is fused, its entire output view has been computed in tiled form.
The original producer op is then erased.
PiperOrigin-RevId: 252695194
This CL adds a lowering to LLVM for MamulOp and a corresponding integration test.
View descriptor manipulation is moved from MLIR's LLVM dialect to C++ code compiled on the side. To this end a separation is introduced between `cblas.cpp` and `cblas_interface.cpp`, the latter operating on view types whose ABI correspond to the LLVM signature generated by MLIR.
An intermediary step is introduced that allocates a new descriptor on the MLIR side for the purpose of passing it to LLVM. The reason for this extra step is that the ABI for by-value ViewType objects wants aligned descriptors, e.g.:
```
extern "C" void linalg_dot_impl(ViewType<float, 1> X, ViewType<float, 1> Y,
BaseViewType<float> Z) {
...
}
```
produces LLVM IR with the signature:
```
%struct.ViewType = type { %struct.BaseViewType, [1 x i64], [1 x i64] }
%struct.BaseViewType = type { float*, i64 }
define void @linalg_dot_impl(%struct.ViewType* byval align 8, %struct.ViewType* byval align 8, float*, i64) tensorflow/mlir#0 {
...
}
```
We don't seem to be able to make such aligned allocations in the MLIR -> LLVM converter atm.
Going through a level of indirection allows the test to pass.
The temporary tradeoff is that the MLIR shims have to be written by hand.
They will disappear in the future.
PiperOrigin-RevId: 252670672
This introduces the support for region-containing operations to the dialect
conversion framework in order to support the conversion of affine control-flow
operations into the standard control flow with branches. Regions that belong
to an operation are converted before the operation itself. The
DialectConversionPattern can therefore access the converted regions of the
original operation and process them further if necessary. In particular, the
conversion is allowed to move the blocks from the original region to other
regions and to split blocks into multiple blocks. All block manipulations must
be performed through the PatternRewriter to ensure they will be undone if the
conversion fails.
Port the pass converting from the affine dialect (loops and ifs with bodies as
regions) to the standard dialect (branch-based cfg) to use DialectConversion in
order to exercise this new functionality. The modification to the lowering
functions are minor and are focused on using the PatterRewriter instead of
directly modifying the IR.
PiperOrigin-RevId: 252625169
This allows us to have SPIRVOps.td as the single entry point for
all SPIR-V ops, which simplifies downstream users and build rules.
PiperOrigin-RevId: 252609258