Summary:
This diff adds a new operation to linalg to allow reshaping of an
existing view into a new view in the same buffer at the same offset.
More specifically:
The `linalg.reshape` op produces a new view whose sizes are a reassociation
of the original `view`. Depending on whether or not the reassociated
MemRefType is contiguous, the resulting memref may require explicit alloc
and copies.
A reassociation is defined as a continous grouping of dimensions and is
represented with a affine map array attribute. In the future, non-continous
groupings may be allowed (i.e. permutations, reindexings etc).
For now, it is assumed that either:
1. a reassociation produces and consumes contiguous MemRefType or,
2. the reshape op will be folded into its consumers (by changing the shape
of the computations).
All other cases are undefined behavior and a reshape op may not lower to
LLVM if it cannot be proven statically that it does not require alloc+copy.
A reshape may either collapse or expand dimensions, depending on the
relationship between source and target memref ranks. The verification rule
is that the reassociation maps are applied to the memref with the larger
rank to obtain the memref with the smaller rank. In the case of a dimension
expansion, the reassociation maps can be interpreted as inverse maps.
Examples:
```mlir
// Dimension collapse (i, j) -> i' and k -> k'
%1 = linalg.reshape %0 [(i, j, k) -> (i, j),
(i, j, k) -> (k)] :
memref<?x?x?xf32, stride_spec> into memref<?x?xf32, stride_spec_2>
```
```mlir
// Dimension expansion i -> (i', j') and (k) -> (k')
%1 = linalg.reshape %0 [(i, j, k) -> (i, j),
(i, j, k) -> (k)] :
memref<?x?xf32, stride_spec> into memref<?x?x?xf32, stride_spec_2>
```
The relevant invalid and roundtripping tests are added.
Reviewers: AlexEichenberger, ftynse, rriddle, asaadaldien, yangjunpro
Subscribers: kiszk, merge_guards_bot, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72168
Summary: This diff reimplements getStridesAndOffset in a significantly simpler way by operating on the AffineExpr and calling into simplifyAffineExpr instead of rolling its own saturating arithmetic.
As a consequence it becomes quite simple to extend the behavior of getStridesAndOffset to encompass more cases by manipulating the AffineExpr more directly.
The divisions are still filtered out and continue to yield fully dynamic strides.
Simplifying the divisions is left for a later time if compelling use cases arise.
Relevant tests are added.
Reviewers: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72098
Summary: This fixes the return value of helper methods on the base range class.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D72127
This fixes the error:
```
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h:72:3: error: from definition of 'template<class LoopTy> mlir::edsc::GenericLoopNestRangeBuilder<LoopTy>::GenericLoopNestRangeBuilder(llvm::ArrayRef<mlir::edsc::ValueHandle*>, llvm::ArrayRef<mlir::Value>)' [-fpermissive]
GenericLoopNestRangeBuilder(ArrayRef<edsc::ValueHandle *> ivs,
```
This was tested independently on a Docker image with gcc-5 by jpienaar@
This should fix the error:
```
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h:72:3: error: from definition of 'template<class LoopTy> mlir::edsc::GenericLoopNestRangeBuilder<LoopTy>::GenericLoopNestRangeBuilder(llvm::ArrayRef<mlir::edsc::ValueHandle*>, llvm::ArrayRef<mlir::Value>)' [-fpermissive]
GenericLoopNestRangeBuilder(ArrayRef<edsc::ValueHandle *> ivs,
```
This commit fixes shader ABI attributes to use `spv.` as the prefix
so that they match the dialect's namespace. This enables us to add
verification hooks in the SPIR-V dialect to verify them.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D72062
Summary:
This changes the implementation of OpResult to have some of the results be represented inline in Value, via a pointer int pair of Operation*+result number, and the rest being trailing objects on the main operation. The full details of the new representation is detailed in the proposal here:
https://groups.google.com/a/tensorflow.org/g/mlir/c/XXzzKhqqF_0/m/v6bKb08WCgAJ
The only difference between here and the above proposal is that we only steal 2-bits for the Value kind instead of 3. This means that we can only fit 2-results inline instead of 6. This allows for other users to steal the final bit for PointerUnion/etc. If necessary, we can always steal this bit back in the future to save more space if 3-6 results are common enough.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D72020
Summary:
This diff adds support to allow `linalg.generic` and
`linalg.indexed_generic` to take tensor input and output
arguments.
The subset of output tensor operand types must appear
verbatim in the result types after an arrow. The parser,
printer and verifier are extended to accomodate this
behavior.
The Linalg operations now support variadic ranked tensor
return values. This extension exhibited issues with the
current handling of NativeCall in RewriterGen.cpp. As a
consequence, an explicit cast to `SmallVector<Value, 4>`
is added in the proper place to support the new behavior
(better suggestions are welcome).
Relevant cleanups and name uniformization are applied.
Relevant invalid and roundtrip test are added.
Reviewers: mehdi_amini, rriddle, jpienaar, antiagainst, ftynse
Subscribers: burmako, shauheen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72022
Lots of SPIR-V ops take enum attributes and certain enum cases
need extra capabilities or extensions to be available. This commit
extends to allow specifying availability spec on enum cases.
Extra utility functions are generated for the corresponding enum
classes to return the availability requirement. The availability
interface implemention for a SPIR-V op now goes over all enum
attributes to collect the availability requirements.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D71947
* replaceAllUsesWith may be supplied with a null value.
* some compilers fail to implicitly convert single result operations to
OpaqueValue, so add an explicit OpOperand::set(Value) method.
Summary: This is part of an ongoing cleanup and uniformization work.
Reviewers: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72084
Summary:
This is part of an ongoing cleanup and uniformization work.
This diff performs 3 types of cleanups:
1. Uniformize transformation names.
2. Replace all pattern operands that need not be captured by `$_`
3. Replace all usage of pattern captured op by the normalized `op` name (instead of positional parameters such as `$0`)
Reviewers: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72081
Summary: This is part of an ongoing cleanup and uniformization work.
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72078
This allows us to include the definitions of these attributes in
other files without pulling in all dependencies for lowering.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D72054
for (const auto &x : llvm::zip(..., ...))
->
for (auto x : llvm::zip(..., ...))
The return type of zip() is a wrapper that wraps a tuple of references.
> warning: loop variable 'p' is always a copy because the range of type 'detail::zippy<detail::zip_shortest, ArrayRef<long> &, ArrayRef<long> &>' does not return a reference [-Wrange-loop-analysis]
Fixes: warning: comparison of integers of different signs: 'const unsigned int' and '(anonymous namespace)::OperationPrinter::(anonymous enum at F:\llvm-project\mlir\lib\IR\AsmPrinter.cpp:1444:3)' [-Wsign-compare]
Summary: A new class is added, IRMultiObjectWithUseList, that allows for representing an IR use list that holds multiple sub values(used in this case for OpResults). This class provides all of the same functionality as the base IRObjectWithUseList, but for specific sub-values. This saves a word per operation result and is a necessary step in optimizing the layout of operation results. For now the use list is placed on the operation itself, so zero-result operations grow by a word. When the work for optimizing layout is finished, this can be moved back to being a trailing object based on memory/runtime benchmarking.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D71955
Summary: The successor operand counts are directly tied to block operands anyways, and this simplifies the trailing objects of Operation(i.e. one less computation to perform).
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D71949
SPIR-V has a few mechanisms to control op availability: version,
extension, and capabilities. These mechanisms are considered as
different availability classes.
This commit introduces basic definitions for modelling SPIR-V
availability classes. Specifically, an `Availability` class is
added to SPIRVBase.td, along with two subclasses: MinVersion
and MaxVersion for versioning. SPV_Op is extended to take a
list of `Availability`. Each `Availability` instance carries
information for generating op interfaces for the corresponding
availability class and also the concrete availability
requirements.
With the availability spec on ops, we can now auto-generate the
op interfaces of all SPIR-V availability classes and also
synthesize the op's implementations of these interfaces. The
interface generation is done via new TableGen backends
-gen-avail-interface-{decls|defs}. The op's implementation is
done via -gen-spirv-avail-impls.
Differential Revision: https://reviews.llvm.org/D71930
The conversion from std.and/std.or to spv.LogicalAnd/spv.LogicalOr is
only valid for boolean (i1) types. Modify BinaryOpPattern in
StandardToSPIRV.td to allow limiting the type of the operands for
which the pattern is applied.
Differential Revision: https://reviews.llvm.org/D71881
Summary:
`mlir-translate -import-llvm test.ll` was going into segmentation fault if `test.ll` had `float` or `double` constants.
For example,
```
%3 = fadd double 3.030000e+01, %0
```
Now, it is handled in `Importer::getConstantAsAttr` (similar behaviour as normal integers)
Added tests for FP arithmetic
Reviewers: ftynse, mehdi_amini
Reviewed By: ftynse, mehdi_amini
Subscribers: shauheen, mehdi_amini, rriddle, jpienaar, burmako, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71912
This change refactors pass options to be more similar to how statistics are modeled. More specifically, the options are specified directly on the pass instead of in a separate options class. (Note that the behavior and specification for pass pipelines remains the same.) This brings about several benefits:
* The specification of options is much simpler
* The round-trip format of a pass can be generated automatically
* This gives a somewhat deeper integration with "configuring" a pass, which we could potentially expose to users in the future.
PiperOrigin-RevId: 286953824
This means that in-place, or root, updates need to use explicit calls to `startRootUpdate`, `finalizeRootUpdate`, and `cancelRootUpdate`. The major benefit of this change is that it enables in-place updates in DialectConversion, which simplifies the FuncOp pattern for example. The major downside to this is that the cases that *may* modify an operation in-place will need an explicit cancel on the failure branches(assuming that they started an update before attempting the transformation).
PiperOrigin-RevId: 286933674
This will enable future commits to reimplement the internal implementation of OpResult without needing to change all of the existing users. This is part of a chain of commits optimizing the size of operation results.
PiperOrigin-RevId: 286930047
This will enable future commits to reimplement the internal implementation of OpResult without needing to change all of the existing users. This is part of a chain of commits optimizing the size of operation results.
PiperOrigin-RevId: 286919966
This is an initial step to refactoring the representation of OpResult as proposed in: https://groups.google.com/a/tensorflow.org/g/mlir/c/XXzzKhqqF_0/m/v6bKb08WCgAJ
This change will make it much simpler to incrementally transition all of the existing code to use value-typed semantics.
PiperOrigin-RevId: 286844725
Rename the 'shlis' operation in the standard dialect to 'shift_left'. Add tests
for this operation (these have been missing so far) and add a lowering to the
'shl' operation in the LLVM dialect.
Add also 'shift_right_signed' (lowered to LLVM's 'ashr') and 'shift_right_unsigned'
(lowered to 'lshr').
The original plan was to name these operations 'shift.left', 'shift.right.signed'
and 'shift.right.unsigned'. This works if the operations are prefixed with 'std.'
in MLIR assembly. Unfortunately during import the short form is ambigous with
operations from a hypothetical 'shift' dialect. The best solution seems to omit
dots in standard operations for now.
Closestensorflow/mlir#226
PiperOrigin-RevId: 286803388
- a block argument associated with an arbitrary op can't be a valid
dimensional identifier; it has to be the block argument of either
a function op or an affine.for.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#331
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/331 from bondhugula:valid_dim 3273b4fcbaa31fb7b6671d93c9e42a6b2a6a4e4c
PiperOrigin-RevId: 286593693
This will allow us to lower most of gpu.all_reduce (when all_reduce
doesn't exist in the target dialect) within the GPU dialect, and only do
target-specific lowering for the shuffle op.
PiperOrigin-RevId: 286548256
This is the block argument equivalent of the existing `getAsmResultNames` hook.
Closestensorflow/mlir#329
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/329 from plaidml:flaub-region-arg-names fc7876f2d1335024e441083cd25263fd6247eb7d
PiperOrigin-RevId: 286523299
Concatting lists in TableGen is easy, creating unique lists less so. There is no reason for duplicated op traits so we could throw an error instead but duplicates could occur due to concatting different list of traits in ODS (e.g., for convenience reasons), so just dedup them during Operator trait construction instead.
PiperOrigin-RevId: 286488423
Update vector transfer_read/write ops to operatate on memrefs with vector element type.
This handle cases where the memref vector element type represents the minimal memory transfer unit (or multiple of the minimal memory transfer unit).
PiperOrigin-RevId: 286482115
This CL allows specifying an additional name for specifying the .td file that is used to generate the doc for a dialect. This is necessary for a dialect like Linalg which has different "types" of ops that are used in different contexts.
This CL also restructures the Linalg documentation and renames LinalgLibraryOps -> LinalgStructuredOps but is otherwise NFC.
PiperOrigin-RevId: 286450414
Adds vector ReshapeOp to the VectorOps dialect. An aggregate vector reshape operation, which aggregates multiple hardware vectors, can enable optimizations during decomposition (e.g. loading one input hardware vector and performing multiple rotate and scatter store operations to the vector output).
PiperOrigin-RevId: 286440658
Introduces some centralized methods to move towards
consistent use of i32 as vector subscripts.
Note: sizes/strides/offsets attributes are still i64
PiperOrigin-RevId: 286434133
This function template has been introduced in the early days of MLIR to work
around the absence of common type for ranges of values (operands, block
argumeents, vectors, etc). Core IR now provides ValueRange for exactly this
purpose. Use it instead of the template parameter.
PiperOrigin-RevId: 286431338
Added test cases for the newly added LLVM operations and lowering features.
Closestensorflow/mlir#300
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/300 from dfki-jugr:std_to_llvm da6168bbc1a369ae2e99ad3881fdddd82f075dd4
PiperOrigin-RevId: 286231169
This enables providing a default implementation of an interface method. This method is defined on the Trait that is attached to the operation, and thus has all of the same constraints and properties as any other interface method. This allows for interface authors to provide a conservative default implementation for certain methods, without requiring that all users explicitly define it. The default implementation can be specified via the argument directly after the interface method body:
StaticInterfaceMethod<
/*desc=*/"Returns whether two array of types are compatible result types for an op.",
/*retTy=*/"bool",
/*methodName=*/"isCompatibleReturnTypes",
/*args=*/(ins "ArrayRef<Type>":$lhs, "ArrayRef<Type>":$rhs),
/*methodBody=*/[{
return ConcreteOp::isCompatibleReturnTypes(lhs, rhs);
}],
/*defaultImplementation=*/[{
/// Returns whether two arrays are equal as strongest check for
/// compatibility by default.
return lhs == rhs;
}]
PiperOrigin-RevId: 286226054
'```mlir' is used to indicate the code block is MLIR code/should use MLIR syntax
highlighting, while '{.mlir}' was a markdown extension that used a style file
to color the background differently of the code block. The background color
extension was a custom one that we can retire given we have syntax
highlighting.
Also change '```td' to '```tablegen' to match chroma syntax highlighting
designation.
PiperOrigin-RevId: 286222976
* Fixes use of anonymous namespace for static methods.
* Uses explicit qualifiers(mlir::) instead of wrapping the definition with the namespace.
PiperOrigin-RevId: 286222654
Introduce affine.prefetch: op to prefetch using a multi-dimensional
subscript on a memref; similar to affine.load but has no effect on
semantics, but only on performance.
Provide lowering through std.prefetch, llvm.prefetch and map to llvm's
prefetch instrinsic. All attributes reflected through the lowering -
locality hint, rw, and instr/data cache.
affine.prefetch %0[%i, %j + 5], false, 3, true : memref<400x400xi32>
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#225
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/225 from bondhugula:prefetch 4c3b4e93bc64d9a5719504e6d6e1657818a2ead0
PiperOrigin-RevId: 286212997
The definition of the function template LLVM::ModuleTranslation::lookupValues
has been located in a source file. As long as it has been the only file that
actually called into the function, this did not cause any problem. However, it
creates linking issues if the function is used from other translation units.
PiperOrigin-RevId: 286203078
When memory attributions are present in `gpu.func`, require that they are of
memref type and live in memoryspaces 3 and 5 for workgroup and private memory
attributions, respectively. Adapt the conversion from the GPU dialect to the
NVVM dialect to drop the private memory space from attributions as NVVM is able
to model them as local `llvm.alloca`s in the default memory space.
PiperOrigin-RevId: 286161763
The inline interface uses two methods to check legality of inling:
1) Can a region be inlined into another.
2) Can an operation be inlined into another.
Setting the former to true, allows the inliner to use the second for
legality checks. Add this method to the SPIR-V dialect inlining
interface.
PiperOrigin-RevId: 286041734
The lowering of MemRef types to the LLVM dialect is connected to the underlying
runtime representation of structured memory buffers. It has changed several
times in the past and reached the current state of a LLVM structured-typed
descriptor containing two pointers and all sizes. In several reported use
cases, a different, often simpler, lowering scheme is required. For example,
lowering statically-shaped memrefs to bare LLVM pointers to simplify aliasing
annotation. Split the pattern population functions into those include
memref-related operations and the remaining ones. Users are expected to extend
TypeConverter::convertType to handle the memref types differently.
PiperOrigin-RevId: 286030610
This function has become redundant with MemRefDescriptor::getElementType and is
no longer necessary. Use the MemRefDescriptor pervasively to concentrate
descriptor-related logic in one place and drop the utility function.
PiperOrigin-RevId: 286024168
This CL adds more Linalg EDSC ops and tests to support building pointwise operations along with conv and dilated_conv.
This also fixes a bug in the existing linalg_matmul EDSC and beefs up the test.
The current set of ops is already enough to build an interesting, albeit simple, model used internally.
PiperOrigin-RevId: 285838012
This updates the lowering pipelines from the GPU dialect to lower-level
dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation
instead of a standard function annotated with an attribute. In particular, the
kernel outlining is updated to produce gpu.func instead of std.func and the
individual conversions are updated to consume gpu.funcs and disallow standard
funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved
in the generic syntax, but can also be used with the custom syntax on
gpu.funcs. The special kind of function for GPU allows one to use additional
features such as memory attribution.
PiperOrigin-RevId: 285822272
This keeps the IR valid and consistent as it is expected that each block should have a valid parent region/operation. Previously, converted blocks were kept floating without a valid parent region.
PiperOrigin-RevId: 285821687
The conversion from the Loops dialect to the Standard dialect, also known as
loop-to-cfg lowering, has been historically a function pass. It can be required
on non-Standard function Ops, in particular the recently introduced GPU
functions. Make the conversion an operation pass instead of a function pass.
PiperOrigin-RevId: 285814560
This PR targest issue tensorflow/mlir#295. It exposes the already existing
subiew promotion pass as a declarative pattern
Change-Id: If901ebef9fb53fcd0b12ecc536f6b174ce320b92
Closestensorflow/mlir#315
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/315 from tetuante:issue295 8e5f268b6d85f31015c33505329dbd7a4db97ac5
PiperOrigin-RevId: 285801463
Similar to insert/extract vector instructions but
(1) work on 1-D vectors only
(2) allow for a dynamic index
%c3 = constant 3 : index
%0 = vector.insertelement %arg0, %arg1[%c : index] : vector<4xf32>
%1 = vector.extractelement %arg0[%c3 : index] : vector<4xf32>
PiperOrigin-RevId: 285792205
ExtractSlicesOp extracts slices of its vector operand and with a specified tiling scheme.
This operation centralizes the tiling scheme around a single op, which simplifies vector op unrolling and subsequent pattern rewrite transformations.
PiperOrigin-RevId: 285761129
During the conversion from the standard dialect to the LLVM dialect,
memref-typed arguments are promoted from registers to memory and passed into
functions by pointer. This had been introduced into the lowering to work around
the abesnce of calling convention modeling in MLIR to enable better
interoperability with LLVM IR generated from C, and has been exerciced for
several months. Make this promotion the default calling covention when
converting to the LLVM dialect. This adds the documentation, simplifies the
code and makes the conversion consistent across function operations and
function types used in other places, e.g. in high-order functions or
attributes, which would not follow the same rule previously.
PiperOrigin-RevId: 285751280
This will be evolved into a simple programming model for custom ops and custom layers in followup CLs.
This CL also deletes the obsolete tablegen's reference-impl.td that was using EDSCs.
PiperOrigin-RevId: 285459545
This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter.
PiperOrigin-RevId: 285448440
The clamp value determines the returned predicate. Previously, the clamp value was fixed to 31 and the predicate was therefore always true. This is incorrect for partial warp reductions, but went unnoticed because the returned values happened to be zero (but it could be anything).
PiperOrigin-RevId: 285343160
This cleans up the implementation of the various operation print methods. This is done via a combination of code cleanup, adding new streaming methods to the printer(e.g. operand ranges), etc.
PiperOrigin-RevId: 285285181
This type is not used anymore now that Linalg view and subview have graduated to std and that alignment is supported on alloc.
PiperOrigin-RevId: 285213424