Instead of recomputing memref types from tensor types, try to infer them when possible. This results in more precise layout maps.
Differential Revision: https://reviews.llvm.org/D125614
Erase gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.
Reviewed By: bondhugula, csigg
Differential Revision: https://reviews.llvm.org/D124257
The warning caused build errors on a couple flang testers that are
building with -Werror. The diagnostic change makes the generated
error correct.
This is a followup to https://reviews.llvm.org/D125549
Differential Revision: https://reviews.llvm.org/D125587
There are a lot of cases where we accidentally ignored the result of some
parsing hook. Mark ParseResult as LLVM_NODISCARD just like ParseResult is.
This exposed some stuff to clean up, so do.
Differential Revision: https://reviews.llvm.org/D125549
In d4555698f8, the name of nano precision timer function has changed from `nano_time` to `nanoTime`, but benchmarks were not updated to reflect that. This change addresses the discrepancy.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D125217
This patch fixes a bug in areIdsUnique where it ignores the [start, end] range.
No test case is added since there are no use cases through IR from where it
can be tested, and it is hard to create a unittest since we do not currently
have Values in unittests.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D124735
This pass is to handle computationally complex operations like
tensor.pad which are not simply lowered to the exact same operation in
the memref dialect.
Differential Revision: https://reviews.llvm.org/D125384
Due to an apparent bug in the Doxygen version <1.8.16 used to generate
documentation for MLIR, parts of the navigation (specifically, the lists
of inherited methods for classes) are unusable due to dynsections.js
missing from the output generated by Doxygen. Setting this flag makes
Doxygen always produce the file.
Implements a floating-point sign operator (using the new semi-ring ops)
that accomodates +/-Inf and +/-NaN in consistent way.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D125494
The current grammar is really crusty, only supports a handful of
cases, and is also out-of-date after various refactorings. This commit
refactors the textmate grammar to handle significantly more cases,
and now provides proper coloring for a majority of cases (including
dialect attributes, operations, types, etc.)
Differential Revision: https://reviews.llvm.org/D125458
We shouldn't be making assumptions about the result of llvm::getTypeName,
which may have different results for anonymous namespaces depending
on the platform.
This commit refactors the current pass manager support to allow for
operation agnostic pass managers. This allows for a series of passes
to be executed on any viable pass manager root operation, instead
of one specific operation type. Op-agnostic/generic pass managers
only allow for adding op-agnostic passes.
These types of pass managers are extremely useful when constructing
pass pipelines that can apply to many different types of operations,
e.g., the default inliner simplification pipeline. With the advent of
interface/trait passes, this support can be used to define FunctionOpInterface
pass managers, or other pass managers that effectively operate on
specific interfaces/traits/etc (see #52916 for an example).
Differential Revision: https://reviews.llvm.org/D123536
This patch references code for translating memref.reinterpret_cast ops
to add translation rules for memref.reshape ops that have a static shape
argument. Since reshape ops don't have offsets, sizes, or strides, this
patch simply sets the allocated and aligned pointers of the MemRef
descriptor.
Reviewed By: ftynse, cathyzhyi
Differential Revision: https://reviews.llvm.org/D125039
Instead of requiring the client to compute the "isSplat" bit,
compute it internally. This makes the logic more consistent
and defines away a lot of "elements.size()==1" in the clients.
This addresses Issue #55185
Differential Revision: https://reviews.llvm.org/D125447
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.
This also add an integration test running on GPU warp. The same tests can be
later re-used with different comment lines to tests distribution
transformations.
This is mostly from @springerm contribution.
Differential Revision: https://reviews.llvm.org/D125430
Complex nested in other types is perfectly fine, just nested structs
aren't supported. Instead of checking whether there's nesting just check
whether the struct we're dealing with is a complex number.
Differential Revision: https://reviews.llvm.org/D125381
1. Call copy constructor of the base class
2. Assign value of the option directly
Reviewed By: dcaballe, rriddle
Differential Revision: https://reviews.llvm.org/D125101
D125214 split off a MLIRExecutionEngineUtils library that is used
by MLIRGPUTransforms. However, currently the entire ExecutionEngine
directory is skipped if the LLVM_NATIVE_ARCH target is not available.
Move the check for LLVM_NATIVE_ARCH, such that MLIRExecutionEngineUtils
always gets built, and only the JIT-related libraries are omitted
without native arch.
Differential Revision: https://reviews.llvm.org/D125357
This change integrates the BufferResultsToOutParamsPass into One-Shot Module Bufferization. This improves memory management (deallocation) when buffers are returned from a function.
Note: This currently only works with statically-sized tensors. The generated code is not very efficient yet and there are opportunities for improvment (fewer copies). By default, this new functionality is deactivated.
Differential Revision: https://reviews.llvm.org/D125376
Bufferization has an optional filter to exclude certain ops from analysis+bufferization. There were a few remaining places in the codebase where the filter was not checked.
Differential Revision: https://reviews.llvm.org/D125356
When a custom operation is unknown and does not have a dialect prefix, we currently
emit an error using the name of the operation with the default dialect prefix. This
leads to a confusing error message, especially when operations get moved between dialects.
For example, `func` was recently moved out of `builtin` and to the `func` dialect. The current
error message we get is:
```
func @foo()
^ custom op 'builtin.func' is unknown
```
This could lead users to believe that there is supposed to be a `builtin.func`,
because there used to be. This commit adds a better error message that does
not assume that the operation is supposed to be in the default dialect:
```
func @foo()
^ custom op 'func' is unknown (tried 'builtin.func' as well)
```
Differential Revision: https://reviews.llvm.org/D125351
`linalg.generic` ops have canonicalizers that either remove arguments
not used in the payload, or redundant arguments. Combine these and
enhance the canonicalization to also remove results that have no use.
This is effectively dead code elimination for Linalg ops.
Differential Revision: https://reviews.llvm.org/D123632
We can simplify the code needed to implement dyn_cast/cast/isa support for MLIR operations with documented interfaces via the CastInfo structures. This will also provide an example of how to use CastInfo.
Depends on D123901
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124963
Using "replaceUsesOfWith" is incorrect because the same initializer value may appear multiple times.
For example, if the epilogue is needed when this loop is unrolled
```
%x:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
```
then both epilogue's arguments will be incorrectly renamed to use the same result index (note #1 in both cases):
```
%x_unrolled:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
%x_epilogue:2 = scf.for ... iter_args(%arg1 = %x_unrolled#1, %arg2 = %x_unrolled#1) {
...
}
```
This is a full audit of emitError calls, I took the opportunity
to remove extranous parens and fix a couple cases where we'd
generate multiple diagnostics for the same error.
Differential Revision: https://reviews.llvm.org/D125355
Now that TableGen no longer relies on global Record state, we can allow
for the client to own the RecordKeeper and SourceMgr. Given that TableGen
internally still relies on the global llvm::SrcMgr, this method unfortunately
still isn't thread-safe.
Differential Revision: https://reviews.llvm.org/D125277
Change the parsing logic to use StringRef instead of lower level
char* logic. Also, if emitting a diagnostic on the first token
in the file, we make sure to use that position instead of the
very start of the file.
Differential Revision: https://reviews.llvm.org/D125353
Before dump, Insetad of switching to generic form silently after
verification failure. Print some debug logs to help identify why an op
may be printed in a different way.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125136
Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.
Differential Revision: https://reviews.llvm.org/D125244
The current implementation of `cloneWithNewYields` has a few issues
- It clones the loop body of the original loop to create a new
loop. This is very expensive.
- It performs `erase` operations which are incompatible when this
method is called from within a pattern rewrite. All erases need to
go through `PatternRewriter`.
To address these a new utility method `replaceLoopWithNewYields` is added
which
- moves the operations from the original loop into the new loop.
- replaces all uses of the original loop with the corresponding
results of the new loop
- use a call back to allow caller to generate the new yield values.
- the original loop is modified to just yield the basic block
arguments corresponding to the iter_args of the loop. This
represents a no-op loop. The loop itself is dead (since all its uses
are replaced), but is not removed. The caller is expected to erase
the op. Consequently, this method can be called from within a
`matchAndRewrite` method of a `PatternRewriter`.
The `cloneWithNewYields` could be replaces with
`replaceLoopWithNewYields`, but that seems to trigger a failure during
walks, potentially due to the operations being moved. That is left as
a TODO.
Differential Revision: https://reviews.llvm.org/D125147
This ensures that attributes such as the index bitwidth propagate
correctly to the AMDGPUToROCDL patterns.
Differential Revision: https://reviews.llvm.org/D125320