with the objective to reduce large test cases into smaller ones while
preserving their interesting behavior.
Implement the framework to parse the command line arguments, parse the
input MLIR test case into a module and call reduction passes on the MLIR module.
Implement the Tester class which allows the different reduction passes to test the
interesting behavior of the generated reduced variants of the test case and keep track
of the most reduced generated variant.
The UnrollVectorPattern is can be used in a programmable fashion by:
```
OwningRewritePatternList patterns;
patterns.insert<UnrollVectorPattern<AddFOp>>(ArrayRef<int64_t>{2, 2}, ctx);
patterns.insert<UnrollVectorPattern<vector::ContractionOp>>(
ArrayRef<int64_t>{2, 2, 2}, ctx);
...
applyPatternsAndFoldGreedily(getFunction(), patterns);
```
Differential revision: https://reviews.llvm.org/D83064
Introduce pass to convert parallel affine.for op into 1-D
affine.parallel op. Run using --affine-parallelize. Removes
test-detect-parallel: pass for checking parallel affine.for ops.
Differential Revision: https://reviews.llvm.org/D82672
This pass removes redundant dialect-independent Copy operations in different
situations like the following:
%from = ...
%to = ...
... (no user/alias for %to)
copy(%from, %to)
... (no user/alias for %from)
dealloc %from
use(%to)
Differential Revision: https://reviews.llvm.org/D82757
Default vector.contract lowering essentially yields a series of sdot/ddot
operations. However, for some layouts a series of saxpy/daxpy operations,
chained through fma are more efficient. This CL introduces a choice between
the two lowering paths. A default heuristic is to follow.
Some preliminary avx2 performance numbers for matrix-times-vector.
Here, dot performs best for 64x64 A x b and saxpy for 64x64 A^T x b.
```
------------------------------------------------------------
A x b A^T x b
------------------------------------------------------------
GFLOPS sdot (reassoc) saxpy sdot (reassoc) saxpy
------------------------------------------------------------
1x1 0.6 0.9 0.6 0.9
2x2 2.5 3.2 2.4 3.5
4x4 6.4 8.4 4.9 11.8
8x8 11.7 6.1 5.0 29.6
16x16 20.7 10.8 7.3 43.3
32x32 29.3 7.9 6.4 51.8
64x64 38.9 79.3
128x128 32.4 40.7
------------------------------------------------------------
```
Reviewed By: nicolasvasilache, ftynse
Differential Revision: https://reviews.llvm.org/D83012
This commit augments spv.CopyMemory's implementation to support 2 memory
access operands. Hence, more closely following the spec. The following
changes are introduces:
- Customize logic for spv.CopyMemory serialization and deserialization.
- Add 2 additional attributes for source memory access operand.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D82710
This patch adds the capability to perform exact integer emptiness checks for FlatAffineConstraints using the General Basis Reduction algorithm (GBR). Previously, only a heuristic was available for emptiness checks, which was not guaranteed to always give a conclusive result.
This patch adds a `Simplex` class, which can be constructed using a `FlatAffineConstraints`, and can find an integer sample point (if one exists) using the GBR algorithm. Additionally, it adds two classes `Matrix` and `Fraction`, which are used by `Simplex`.
The integer emptiness check functionality can be accessed through the new `FlatAffineConstraints::isIntegerEmpty()` function, which runs the existing heuristic first and, if that proves to be inconclusive, runs the GBR algorithm to produce a conclusive result.
Differential Revision: https://reviews.llvm.org/D80860
This allow lowering to support scf.for and scf.if with results. As right now
spv region operations don't have return value the results are demoted to
Function memory. We create one allocation per result right before the region
and store the yield values in it. Then we can load back the value from
allocation to be able to use the results.
Differential Revision: https://reviews.llvm.org/D82246
MSVC 2017 doesn't support the case where a trailing variadic template list comes after template types with default parameters. Until we upgrade to VS 2019, we can't use the simplified definitions.
Moving forward dialects should only be registered in a thread safe context. This matches the existing usage we have today, but it allows for removing quite a bit of expensive locking from the context.
This led to ~.5 a second compile time improvement when running one conversion pass on a very large .mlir file(hundreds of thousands of operations).
Differential Revision: https://reviews.llvm.org/D82595
This revision adds support to ODS for generating interfaces for attributes and types, in addition to operations. These interfaces can be specified using `AttrInterface` and `TypeInterface` in place of `OpInterface`. All of the features of `OpInterface` are supported except for the `verify` method, which does not have a matching representation in the Attribute/Type world. Generating these interface can be done using `gen-(attr|type)-interface-(defs|decls|docs)`.
Differential Revision: https://reviews.llvm.org/D81884
This revisions add mechanisms to Attribute/Type for attaching traits and interfaces. The mechanisms are modeled 1-1 after those for operations to keep the system consistent. AttrBase and TypeBase now accepts a trailing list of `Trait` types that will be attached to the object. These traits should inherit from AttributeTrait::TraitBase and TypeTrait::TraitBase respectively as necessary. A followup commit will refactor the interface gen mechanisms in ODS to support Attribute/Type interface generation and add tests for the mechanisms.
Differential Revision: https://reviews.llvm.org/D81883
Also fixed bug in type inferface generator to address bug where operands and
attributes are interleaved.
Differential Revision: https://reviews.llvm.org/D82819
Current Affine comparison builders, which use operator overload, default to signed comparison. This creates the possibility of misuse of these builders and potential correctness issues when dealing with unsigned integers. This change makes the distinction between signed and unsigned comparison builders and forces the caller to make a choice between the two.
Differential Revision: https://reviews.llvm.org/D82323
Summary:
The patch makes the index type lowering of the GPU to NVVM/ROCDL conversion configurable. It introduces a pass option that controls the bitwidth used when lowering index computations and uses the LowerToLLVMOptions structure to control the Standard to LLVM lowering.
This commit fixes a use-after-free bug introduced by the reverted commit d10b1a3. It implements the following changes:
- Added a getDefaultOptions method to the LowerToLLVMOptions struct that returns a reference to statically allocated default options.
- Use the getDefaultOptions method to provide default LowerToLLVMOptions (instead of an initializer list).
- Added comments to clarify the required lifetime of the LowerToLLVMOptions
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D82475
`llvm.mlir.constant` was originally introduced as an LLVM dialect counterpart
to `std.constant`. As such, it was supporting "function pointer" constants
derived from the symbol name. This is different from `std.constant` that allows
for creation of a "function" constant since MLIR, unlike LLVM IR, supports
this. Later, `llvm.mlir.addressof` was introduced as an Op that obtains a
constant pointer to a global in the LLVM dialect. It naturally extends to
functions (in LLVM IR, functions are globals) and should be used for defining
"function pointer" values instead.
Fixes PR46344.
Differential Revision: https://reviews.llvm.org/D82667
Rationale:
In general, passing "fastmath" from MLIR to LLVM backend is not supported, and even just providing such a feature for experimentation is under debate. However, passing fine-grained fastmath related attributes on individual operations is generally accepted. This CL introduces an option to instruct the vector-to-llvm lowering phase to annotate floating-point reductions with the "reassociate" fastmath attribute, which allows the LLVM backend to use SIMD implementations for such constructs. Oher lowering passes can start using this mechanism right away in cases where reassociation is allowed.
Benefit:
For some microbenchmarks on x86-avx2, speedups over 20 were observed for longer vector (due to cleaner, spill-free and SIMD exploiting code).
Usage:
mlir-opt --convert-vector-to-llvm="reassociate-fp-reductions"
Reviewed By: ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D82624
Add a pass to rewrite sequential chains of `spirv::CompositeInsert`
operations into `spirv::CompositeConstruct` operations.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D82198
This patch add support for 'spv.CopyMemory'. The following changes are
introduced:
- 'CopyMemory' op is added to SPIRVOps.td.
- Custom parse and print methods are introduced.
- A few Roundtripping tests are added.
Differential Revision: https://reviews.llvm.org/D82384
Initially, unranked memref descriptors in the LLVM dialect were designed only
to be passed into functions. An assertion was guarding against returning
unranked memrefs from functions in the standard-to-LLVM conversion. This is
insufficient for functions that wish to return an unranked memref such that the
caller does not know the rank in advance, and hence cannot allocate the
descriptor and pass it in as an argument.
Introduce a calling convention for returning unranked memref descriptors as
follows. An unranked memref descriptor always points to a ranked memref
descriptor stored on stack of the current function. When an unranked memref
descriptor is returned from a function, the ranked memref descriptor it points
to is copied to dynamically allocated memory, the ownership of which is
transferred to the caller. The caller is responsible for deallocating the
dynamically allocated memory and for copying the pointed-to ranked memref
descriptor onto its stack.
Provide default lowerings for std.return, std.call and std.indirect_call that
maintain the conversion defined above.
This convention is additionally exercised by a runtime test to guard against
memory errors.
Differential Revision: https://reviews.llvm.org/D82647
Using fully qualified names wherever possible avoids ambiguous class and function names. This is a follow-up to D82371.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D82471
Replace any `rank(shape_of(tensor))` that relies on a ranked tensor with the
corresponding constant `const_size`.
Differential Revision: https://reviews.llvm.org/D82077
This patch introduces conversion patterns for `spv.module` and `spv._module_end`.
SPIR-V module is converted into `ModuleOp`. This will play a role of enclosing
scope to LLVM ops. At the moment, SPIR-V module attributes (such as memory model,
etc) are ignored.
Differential Revision: https://reviews.llvm.org/D82468
This revision adds a new support header, InterfaceSupport, to contain various generic bits of functionality for implementing "Interfaces". Interfaces embody a mechanism for attaching concept-based polymorphism to a type system. With this refactoring a new InterfaceMap type is added to allow for efficient interface lookups without going through an indirect call. This should provide a decent performance speedup without changing the size of AbstractOperation.
In a future revision, this functionality will also be used to bring Interface like functionality to Attributes and Types.
Differential Revision: https://reviews.llvm.org/D81882
Introduced `llvm.intr.bitreverse` and `llvm.intr.ctpop` LLVM bit
intrinsics to LLVM dialect. These intrinsics help with SPIR-V to
LLVM conversion, allowing a direct mapping from `spv.BitReverse`
and `spv.BitCount` respectively. Tests are added to `roundtrip.mlir`
and `llvm-intrinsics.mlir`.
Differential Revision: https://reviews.llvm.org/D82285
This patch provides an implementation for `spv.func` conversion. The pattern
is populated in a separate method added to the pass. At the moment, the type
signature conversion only includes the supported types. The conversion pattern
also matches SPIR-V function control attributes to LLVM function attributes.
Those are modelled as `passthrough` attributes in LLVM dialect. The following
mapping are used:
- None: no attributes passed
- Inline: `alwaysinline` seems to be the right equivalent (`inlinehint` is
semantically weaker in my opinion)
- DontInline: `noinline`
- Pure and Const: I think those can be modelled as `readonly` and `readnone`
attributes respectively.
Also, 2 patterns added for return ops conversion (`spv.Return` for void return
and `spv.ReturnValue` for a single value return).
Differential Revision: https://reviews.llvm.org/D81931
Example of Matmul implementation in linalg.generic operation contained few mistakes that can puzzle new startes when trying to run the example.
Differential Revision: https://reviews.llvm.org/D82289
The patch makes the index type lowering of the GPU to NVVM/ROCDL
conversion configurable. It introduces a pass option that controls the
bitwidth used when lowering index computations.
Differential Revision: https://reviews.llvm.org/D80285
Summary:
We already had a parallel loop specialization pass that is used to
enable unrolling and consecutive vectorization by rewriting loops
whose bound is defined as a min of a constant and a dynamic value
into a loop with static bound (the constant) and the minimum as
bound, wrapped into a conditional to dispatch between the two.
This adds the same rewriting for for loops.
Differential Revision: https://reviews.llvm.org/D82189
Avoid using max on unsigned constants, in case the caller is using 0 we
end up with:
warning: taking the max of unsigned zero and a value is always equal to the other value [-Wmax-unsigned-zero]
Instead we can just use native TableGen to fold the comparison here.
Allow lhs and rhs to have different type than accumulator/destination. Some
hardware like GPUs support natively operations like uint8xuint8xuint32.
Differential Revision: https://reviews.llvm.org/D82069
Use direct vector constants for the 1-D case. This approach
scales much better than generating elaborate insertion operations
that are eventually folded into a constant. We could of course
generalize the 1-D case to higher ranks, but this simplification
already helps in scaling some microbenchmarks that would formerly
crash on the intermediate IR length.
Reviewed By: reidtatge
Differential Revision: https://reviews.llvm.org/D82144
All class derived from `edsc::NestedBuilder` in core MLIR have been replaced
with alternatives based on OpBuilder+callbacks. The *Builder EDSC
infrastructure has been deprecated. Remove edsc::NestedBuilder.
This completes the "structured builders" refactoring.
Differential Revision: https://reviews.llvm.org/D82128
Callback-based constructions of blocks where the body is populated in the same
function as the block creation is a natural extension of callback-based loop
construction. They provide more concise and simple APIs than EDSC BlockBuilder
at less than 20% infrastructural code cost, and are compatible with
ScopedContext. BlockBuilder, Blockhandle and related functionality has been
deprecated, remove them.
Differential Revision: https://reviews.llvm.org/D82015
Callback-based loop construction, with loop bodies being constructed during the
construction of the parent op using a function, is now fully supported by the
core infrastructure. This provides almost the same level of brevity as EDSC
LoopBuilder at less than 30% infrastructural code cost. Functional equivalents
compatible with EDSC ScopedContext are implemented on top of the main builders.
LoopBuilder and related functionality has been deprecated, remove it.
Differential Revision: https://reviews.llvm.org/D81874
This revision removes the TypeConverter parameter passed to the apply* methods, and instead moves the responsibility of region type conversion to patterns. The types of a region can be converted using the 'convertRegionTypes' method, which acts similarly to the existing 'applySignatureConversion'. This method ensures that all blocks within, and including those moved into, a region will have the block argument types converted using the provided converter.
This has the benefit of making more of the legalization logic controlled by patterns, instead of being handled explicitly by the driver. It also opens up the possibility to support multiple type conversions at some point in the future.
This revision also adds a new utility class `FailureOr<T>` that provides a LogicalResult friendly facility for returning a failure or a valid result value.
Differential Revision: https://reviews.llvm.org/D81681
Existing implementation of affine loop nest builders relies on EDSC
ScopedContext, which is not used pervasively. Provide a common OpBuilder-based
helper function to construct a perfect nest of affine loops with the body of
the innermost loop populated by a callback. Use this function to implement the
EDSC version.
Affine "for" loops differ from SCF "for" loops by (1) not allowing to yield
values and (2) supporting short-hand form for constant bounds, which justifies
a separate implementation of the loop nest builder for the same of simplicity.
Differential Revision: https://reviews.llvm.org/D81955
Traditionally patterns have always had the root operation kind hardcoded to a specific operation name. This has worked well for quite some time, but it has certain limitations that make it undesirable. For example, some lowering have the same implementation for many different operations types with a few lowering entire dialects using the same pattern implementation. This problem has led to several "solutions":
a) Provide a template implementation to the user so that they can instantiate it for each operation combination, generally requiring the inclusion of the auto-generated operation definition file.
b) Use a non-templated pattern that allows for providing the name of the operation to match
- No one ever does this, because enumerating operation names can be cumbersome and so this quickly devolves into solution a.
This revision removes the restriction that patterns have a hardcoded root type, and allows for a class patterns that could match "any" operation type. The major downside of root-agnostic patterns is that they make certain pattern analyses more difficult, so it is still very highly encouraged that an operation specific pattern be used whenever possible.
Differential Revision: https://reviews.llvm.org/D82066
This class enables for abstracting more of the details for the rewrite process, and will allow for clients to apply specific cost models to the pattern list. This allows for DialectConversion and the GreedyPatternRewriter to share the same underlying matcher implementation. This also simplifies the plumbing necessary to support dynamic patterns.
Differential Revision: https://reviews.llvm.org/D81985
Summary:
The "i1" (viz. bool) type does not have a proper equivalent on the "C"
size. So, to avoid any ABIs issues, we simply use print_i32 on an i32
value of one or zero for true and false. This has the added advantage
that one less function needs to be implemented when porting the runtime
support library.
Reviewers: ftynse, bkramer, nicolasvasilache
Reviewed By: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D82048
Summary:
Fixed build of D81618
Add a pattern for expanding tanh op into exp form.
A `tanh` is expanded into:
1) 1-exp^{-2x} / 1+exp^{-2x}, if x => 0
2) exp^{2x}-1 / exp^{2x}+1 , if x < 0.
Differential Revision: https://reviews.llvm.org/D82040
Summary:
This is to provide a utility to remove unsupported constraints or for
pipelines that happen to receive these but cannot lower them due to not
supporting assertions.
Differential Revision: https://reviews.llvm.org/D81560
The ScopedBuilder class in EDSC is being gradually phased out in favor of core
OpBuilder-based helpers with callbacks. Provide helper functions that are
compatible with `edsc::ScopedContext` and can be used to create and populate
blocks using callbacks that take block arguments as callback arguments. This
removes the need for `edsc::BlockHandle`, forward-declaration of `Value`s used
for block arguments and the tag `edsc::Append` class, leading to noticable
reduction in the verbosity of the code using helper functions.
Remove "eager mode" construction tests that are only relevant to the
`BlockBuilder`-based approach.
`edsc::BlockHandle` and `edsc::BlockBuilder` are now deprecated and will be
removed soon.
Differential Revision: https://reviews.llvm.org/D82008
This patch adjust the load/store matrix intrinsics, formerly known as
llvm.matrix.columnwise.load/store, to improve the naming and allow
passing of extra information (volatile).
The patch performs the following changes:
* Rename columnwise.load/store to column.major.load/store. This is more
expressive and also more in line with the naming in Clang.
* Changes the stride arguments from i32 to i64. The stride can be
larger than i32 and this makes things more uniform with the way
things are handled in Clang.
* A new boolean argument is added to indicate whether the load/store
is volatile. The lowering respects that when emitting vector
load/store instructions
* MatrixBuilder is updated to require both Alignment and IsVolatile
arguments, which are passed through to the generated intrinsic. The
alignment is set using the `align` attribute.
The changes are grouped together in a single patch, to have a single
commit that breaks the compatibility. We probably should be fine with
updating the intrinsics, as we did not yet officially support them in
the last stable release. If there are any concerns, we can add
auto-upgrade rules for the columnwise intrinsics though.
Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse
Reviewed By: anemet, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D81472
- This will allow calling these functions from Op's that support this interface (like FuncOp) directly:
```
FuncOp func = ...
func.isPrivate()
```
Differential Revision: https://reviews.llvm.org/D82060
Summary:
- Define the MatrixTimesScalar operation and add roundtrip tests.
- Added a new base class for matrix-specific operations to avoid invalid operands type mismatch check.
- Created a separate Matrix arithmetic operations td file to add more operations in the future.
- Augmented the automatically generated verify method to print more fine-grained error messages.
- Made minor Updates to the matrix type tests.
Reviewers: antiagainst, rriddle, mravishankar
Reviewed By: antiagainst
Subscribers: mehdi_amini, jpienaar, shauheen, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, bader, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D81677
Summary:
Two integration tests focused on i1 vectors, which exposed omissions
in the llvm backend which have since then been fixed. Note that this also
exposed an inaccuracy for print_i1 which has been fixed in this CL:
for a pure C ABI, int should be used rather than bool.
Reviewers: nicolasvasilache, ftynse, reidtatge, andydavis1, bkramer
Reviewed By: bkramer
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D81957
Implement the missing lowering from `std.dim` to the LLVM dialect in case of a
dynamic dimension.
Differential Revision: https://reviews.llvm.org/D81834
Recent work has introduced support for constructing loops via `::build` with
callbacks that construct loop bodies using only the core OpBuilder. This is now
supported on all loop types that Linalg lowers to. Refactor LoopNestBuilder in
Linalg to rely on this functionality instead of using a custom EDSC-based
approach to creating loop nests.
The specialization targeting parallel loops is also simplified by factoring out
the recursive call into a separate static function and considering only two
alternatives: top-level loop is parallel or sequential.
This removes the last remaining in-tree use of edsc::LoopBuilder, which is now
deprecated and will be removed soon.
Differential Revision: https://reviews.llvm.org/D81873
Similarly to `scf::ForOp`, introduce additional `function_ref` arguments to
`::build` functions of SCF `ParallelOp` and `ReduceOp`. The provided functions
will be called to construct the body of the respective operations while
constructing the operation itself. Exercise them in LoopUtils.
Differential Revision: https://reviews.llvm.org/D81872
Summary:
This revision replaces MatmulOp, now that DRR rules have been dropped.
This revision also fixes minor parsing bugs and a plugs a few holes to get e2e paths working (e.g. library call emission).
During the replacement the i32 version had to be dropped because only the EDSC operators +, *, etc support type inference.
Deciding on a type-polymorphic behavior, and implementing it, is left for future work.
Reviewers: aartbik
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D81935
- Modify HasParent trait to allow one of several op's as a parent -
- Expose this trait in the ODS framework using the ParentOneOf<> trait.
Differential Revision: https://reviews.llvm.org/D81880
This allows for passing a lambda to addDynamicallyLegalDialect without needing to explicit wrap with Optional<DynamicLegalityCallbackFn>.
Differential Revision: https://reviews.llvm.org/D81680
It is quite common for the same type to be converted many types throughout the conversion process, and there isn't any good reason why we aren't caching that result. Especially given that we currently use identity conversion to signify legality. This revision also adds a few additional helpers to TypeConverter.
Differential Revision: https://reviews.llvm.org/D81679
This revision replaces MatmulOp, now that DRR rules have been dropped.
This revision also fixes minor parsing bugs and a plugs a few holes to get e2e paths working (e.g. library call emission).
During the replacement the i32 version had to be dropped because only the EDSC operators +, *, etc support type inference.
Deciding on a type-polymorphic behavior, and implementing it, is left for future work.
Differential Revision: https://reviews.llvm.org/D79762
This is intended to avoid programming mistake where a temporary OpOperand is
created, for example:
for (OpOperand user : result.getUsers()) {
It can be confusing for the user, in particular since in MLIR most classes are intended to
be copied around by value while they have reference semantics.
Differential Revision: https://reviews.llvm.org/D81815
This reverts commit 32c757e4f8.
Broke the build bot:
******************** TEST 'MLIR :: Examples/standalone/test.toy' FAILED ********************
[...]
/tmp/ci-KIMiRFcVZt/lib/libMLIRLinalgToLLVM.a(LinalgToLLVM.cpp.o): In function `(anonymous namespace)::ConvertLinalgToLLVMPass::runOnOperation()':
LinalgToLLVM.cpp:(.text._ZN12_GLOBAL__N_123ConvertLinalgToLLVMPass14runOnOperationEv+0x100): undefined reference to `mlir::populateExpandTanhPattern(mlir::OwningRewritePatternList&, mlir::MLIRContext*)'
Summary:
Add a pattern for expanding tanh op into exp form.
A `tanh` is expanded into:
1) 1-exp^{-2x} / 1+exp^{-2x}, if x => 0
2) exp^{2x}-1 / exp^{2x}+1 , if x < 0.
Differential Revision: https://reviews.llvm.org/D81618
Similarly to `scf::ForOp`, introduce additional `function_ref` arguments to
`AffineForOp::build` that can be used to populate the body of the loop during
its construction. Provide compatibility functions for constructing affine loop
nests using `edsc::ScopedContext`.
`edsc::AffineLoopNestBuilder` and reletad functionality is now deprecated and
will be removed soon, users are expected to switch to `affineLoopNestBuilder`
that provides similar functionality with a simpler OpBuilder-based
implementation.
Differential Revision: https://reviews.llvm.org/D81754
Use ::Adaptor alias instead uniformly. Makes the naming more consistent as
adaptor can refer to attributes now too.
Differential Revision: https://reviews.llvm.org/D81789
Modify structure type in SPIR-V dialect to support:
1) Multiple decorations per structure member
2) Key-value based decorations (e.g., MatrixStride)
This commit kept the Offset decoration separate from members'
decorations container for easier implementation and logical clarity.
As such, all references to Structure layoutinfo are now offsetinfo,
and any member layout defining decoration (e.g., RowMajor for Matrix)
will be add to the members' decorations container along with its
value if any.
Differential Revision: https://reviews.llvm.org/D81426
Add `NoSideEffect` trait to `shape.to_extent_tensor` and
`shape.from_extent_tensor` and defined custom assembly format for the
operations.
Differential Revision: https://reviews.llvm.org/D81158
Modify structure type in SPIR-V dialect to support:
1) Multiple decorations per structure member
2) Key-value based decorations (e.g., MatrixStride)
This commit kept the Offset decoration separate from members'
decorations container for easier implementation and logical clarity.
As such, all references to Structure layoutinfo are now offsetinfo,
and any member layout defining decoration (e.g., RowMajor for Matrix)
will be add to the members' decorations container along with its
value if any.
Differential Revision: https://reviews.llvm.org/D81426
Modify structure type in SPIR-V dialect to support:
1) Multiple decorations per structure member
2) Key-value based decorations (e.g., MatrixStride)
This commit kept the Offset decoration separate from members'
decorations container for easier implementation and logical clarity.
As such, all references to Structure layoutinfo are now offsetinfo,
and any member layout defining decoration (e.g., RowMajor for Matrix)
will be add to the members' decorations container along with its
value if any.
Differential Revision: https://reviews.llvm.org/D81426
Following the previous revision `D81100`, this commit implements a templated class
that would provide conversion patterns for “straightforward” SPIR-V ops into
LLVM dialect. Templating allows to abstract away from concrete implementation
for each specific op. Those are mainly binary operations. Currently supported
and tested ops are:
- Arithmetic ops: `IAdd`, `ISub`, `IMul`, `FAdd`, `FSub`, `FMul`, `FDiv`, `FNegate`,
`SDiv`, `SRem` and `UDiv`
- Bitwise ops: `BitwiseAnd`, `BitwiseOr`, `BitwiseXor`
The implementation relies on `SPIRVToLLVMConversion` class that makes use of
`OpConversionPattern`.
Differential Revision: https://reviews.llvm.org/D81305
Allow for dynamic indices in the `dim` operation.
Rather than an attribute, the index is now an operand of type `index`.
This allows to apply the operation to dynamically ranked tensors.
The correct lowering of dynamic indices remains to be implemented.
Differential Revision: https://reviews.llvm.org/D81551
The operation `get_extent` now accepts the dimension as an operand and is no
longer limited to constant dimensions.
A helper function facilitates the common constant use case.
Differential Revision: https://reviews.llvm.org/D81248
This is useful for manipulating the standard dialect from transformations
outside of the standard dialect.
Differential Revision: https://reviews.llvm.org/D80609
These commits set up the skeleton for SPIR-V to LLVM dialect conversion.
I created SPIR-V to LLVM pass, registered it in Passes.td, InitAllPasses.h.
Added a pattern for `spv.BitwiseAndOp` and tests for it. Integer, float
and vector types are converted through LLVMTypeConverter.
Differential Revision: https://reviews.llvm.org/D81100
The SSA values created with `shape.const_size` are now named depending on the
value.
A constant size of 3, e.g., is now automatically named `%c3`.
Differential Revision: https://reviews.llvm.org/D81249
This parameter gives the developers the freedom to choose their desired function
signature conversion for preparing their functions for buffer placement. It is
introduced for BufferAssignmentFuncOpConverter, and also for
BufferAssignmentReturnOpConverter, and BufferAssignmentCallOpConverter to adapt
the return and call operations with the selected function signature conversion.
If the parameter is set, buffer placement won't also deallocate the returned
buffers.
Differential Revision: https://reviews.llvm.org/D81137
Summary:
This revision adds a common folding pattern that starts appearing on
vector_transfer ops.
Differential Revision: https://reviews.llvm.org/D81281
Summary:
`mlir-rocm-runner` is introduced in this commit to execute GPU modules on ROCm
platform. A small wrapper to encapsulate ROCm's HIP runtime API is also inside
the commit.
Due to behavior of ROCm, raw pointers inside memrefs passed to `gpu.launch`
must be modified on the host side to properly capture the pointer values
addressable on the GPU.
LLVM MC is used to assemble AMD GCN ISA coming out from
`ConvertGPUKernelToBlobPass` to binary form, and LLD is used to produce a shared
ELF object which could be loaded by ROCm HIP runtime.
gfx900 is the default target be used right now, although it could be altered via
an option in `mlir-rocm-runner`. Future revisions may consider using ROCm Agent
Enumerator to detect the right target on the system.
Notice AMDGPU Code Object V2 is used in this revision. Future enhancements may
upgrade to AMDGPU Code Object V3.
Bitcode libraries in ROCm-Device-Libs, which implements math routines exposed in
`rocdl` dialect are not yet linked, and is left as a TODO in the logic.
Reviewers: herhut
Subscribers: mgorny, tpr, dexonsmith, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #mlir, #llvm
Differential Revision: https://reviews.llvm.org/D80676
This revision adds a helper function to hoist vector.transfer_read /
vector.transfer_write pairs out of immediately enclosing scf::ForOp
iteratively, if the following conditions are true:
1. The 2 ops access the same memref with the same indices.
2. All operands are invariant under the enclosing scf::ForOp.
3. No uses of the memref either dominate the transfer_read or are
dominated by the transfer_write (i.e. no aliasing between the write and
the read across the loop)
To improve hoisting opportunities, call the `moveLoopInvariantCode` helper
function on the candidate loop above which to hoist. Hoisting the transfers
results in scf::ForOp yielding the value that originally transited through
memory.
This revision additionally exposes `moveLoopInvariantCode` as a helper in
LoopUtils.h and updates SliceAnalysis to support return scf::For values and
allow hoisting across multiple scf::ForOps.
Differential Revision: https://reviews.llvm.org/D81199
Forward declaring llvm::errs is not enough, as it is used as a default
parameter with a type that references the base class. So the class
hierarchy must be visible.
Summary:
This will inline the region to a shape.assuming in the case that the
input witness is found to be statically true.
Differential Revision: https://reviews.llvm.org/D80302
In the case of all inputs being constant and equal, cstr_eq will be
replaced with a true_witness.
Differential Revision: https://reviews.llvm.org/D80303
This allows replacing of this op with a true witness in the case of both
inputs being const_shapes and being found to be broadcastable.
Differential Revision: https://reviews.llvm.org/D80304
This allows assuming_all to be replaced when all inputs are known to be
statically passing witnesses.
Differential Revision: https://reviews.llvm.org/D80306
This will later be used during canonicalization and folding steps to replace
statically known passing constraints.
Differential Revision: https://reviews.llvm.org/D80307
Update linalg to affine lowering for convop to use affine load for input
whenever there is no padding. It had always been using std.loads because
max in index functions (needed for non-zero padding if not materializing
zeros) couldn't be represented in the non-zero padding cases.
In the future, the non-zero padding case could also be made to use
affine - either by materializing or using affine.execute_region. The
latter approach will not impact the scf/std output obtained after
lowering out affine.
Differential Revision: https://reviews.llvm.org/D81191
This simplifies a lot of handling of BoolAttr/IntegerAttr. For example, a lot of places currently have to handle both IntegerAttr and BoolAttr. In other places, a decision is made to pick one which can lead to surprising results for users. For example, DenseElementsAttr currently uses BoolAttr for i1 even if the user initialized it with an Array of i1 IntegerAttrs.
Differential Revision: https://reviews.llvm.org/D81047
This revision adds a helper function to hoist alloc/dealloc pairs and
alloca op out of immediately enclosing scf::ForOp if both conditions are true:
1. all operands are defined outside the loop.
2. all uses are ViewLikeOp or DeallocOp.
This is now considered Linalg-specific and will be generalized on a per-need basis.
Differential Revision: https://reviews.llvm.org/D81152
Add SubgroupId, SubgroupSize and NumSubgroups to GPU dialect ops and add the
lowering of those ops to SPIRV.
Differential Revision: https://reviews.llvm.org/D81042
Summary:
If the output filename was specified as "-", the ToolOutputFile class
would create a brand new raw_ostream object referring to the stdout.
This patch changes it to reuse the llvm::outs() singleton.
At the moment, this change should be "NFC", but it does enable other
enhancements, like the automatic stdout/stderr synchronization as
discussed on D80803.
I've checked the history, and I did not find any indication that this
class *has* to use a brand new stream object instead of outs() --
indeed, it is special-casing "-" in a number of places already, so this
change fits the pattern pretty well. I suspect the main reason for the
current state of affairs is that the class was originally introduced
(r111595, in 2010) as a raw_fd_ostream subclass, which made any other
solution impossible.
Another potential benefit of this patch is that it makes it possible to
move the raw_ostream class out of the business of special-casing "-" for
stdout handling. That state of affairs does not seem appropriate because
"-" is a valid filename (albeit hard to access with a lot of command
line tools) on most systems. Handling "-" in ToolOutputFile seems more
appropriate.
To make this possible, this patch changes the return type of
llvm::outs() and errs() to raw_fd_ostream&. Previously the functions
were constructing objects of that type, but returning a generic
raw_ostream reference. This makes it possible for new ToolOutputFile and
other code to use raw_fd_ostream methods like error() on the outs()
object. This does not seem like a bad thing (since stdout is a file
descriptor which can be redirected to anywhere, it makes sense to ask it
whether the writing was successful or if it supports seeking), and
indeed a lot of code was already depending on this fact via the
ToolOutputFile "back door".
Reviewers: dblaikie, JDevlieghere, MaskRay, jhenderson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81078
Summary:
Progressive lowering of vector.transpose into an operation that
is closer to an intrinsic, and thus the hardware ISA. Currently
under the common vector transform testing flag, as we prepare
deploying this transformation in the LLVM lowering pipeline.
Reviewers: nicolasvasilache, reidtatge, andydavis1, ftynse
Reviewed By: nicolasvasilache, ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #llvm, #mlir
Differential Revision: https://reviews.llvm.org/D80772
Add a new pass to lower operations from the `shape` to the `std` dialect.
The conversion applies only to the `size_to_index` and `index_to_size`
operations and affected types.
Other patterns will be added as needed.
Differential Revision: https://reviews.llvm.org/D81091
The original design of TypeConverter expected specific converters to derive the
class and override virtual functions for conversions and materializations. This
did not scale well to multi-dialect conversions, so the design was changed to
register a list of converter and materializer functions, removing the need for
virtual functions. The only remaining virtual function, `convertSignatureArg`
is never overridden in-tree. Make it non-virtual, drop the virtual destructor
and thus remove vtable from TypeConverter.
If there exist TypeConverter users that need custom `convertSignatureArg`
behavior, it should be implemented using the callback registration mechanism
similar to that of conversions and materializations.
Differential Revision: https://reviews.llvm.org/D80993