This patch changes names of identifiers and their corresponding getters in
PresburgerSet to match those of IntegerPolyhedron.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D117998
When the coefficients of dividend are negative, the gcd may be negative
which will change the sign of dividend and overflow denominator.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117911
When 2 clamp ops are in a row, they can be canonicalized into a single clamp
that uses the most constrained range
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D117934
Rationale:
Although file I/O is a bit alien to MLIR itself, we provide two convenient ways
for sparse tensor I/O. The input part was already there (behind the swiss army
knife sparse_tensor.new). Now we have a sparse_tensor.out to write out data. As
before, the ops are kept vague and may change in the future. For now this
allows us to compare TACO vs MLIR very easily.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D117850
Implement a taylor series approximation for atan and add an atan2 lowering
that uses atan's appromation. This includes tests for edge cases and tests
for each quadrant.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D115682
The cleanup was manual, but assisted by "include-what-you-use". It consists in
1. Removing unused forward declaration. No impact expected.
2. Removing unused headers in .cpp files. No impact expected.
3. Removing unused headers in .h files. This removes implicit dependencies and
is generally considered a good thing, but this may break downstream builds.
I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the
modification in the second part of the commit.
4. Replacing header inclusion by forward declaration. This has the same impact
as 3.
Notable changes:
- llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h
- llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h
- llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h
- llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h
You may need to add some of these headers in your compilation units, if needs be.
As an hint to the impact of the cleanup, running
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 8000919 lines
after: 7917500 lines
Reduced dependencies also helps incremental rebuilds and is more ccache
friendly, something not shown by the above metric :-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
In some cases, fusion can produce illegal operations if after fusion
the range of some of the loops cannot be computed from shapes of its
operands. Check for this case and abort the fusion if this happens.
Differential Revision: https://reviews.llvm.org/D117602
This is superseded by the same method on OpAsmOpInterface, which is
available on the Dialect through the Fallback mechanism,
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117750
This extends dense attribute element access to support 8b and 16b ints.
Also extends the corresponding parts of the C api.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D117731
This allows to pipe sequences of `mlir-opt -split-input-file | mlir-opt -split-input-file`.
Depends On D117750
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117756
Right shift can occur that is a 32-bit right shift. This is undefined behavior.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117732
- Set the DEBUG_TYPE of SerializeToBlob to serialize-to-blob
- Add debug output to print the assembly or PTX for GPU modules before
they are assembled and linked
Note that, as SerializeToBlob is a superclass of SerializeToCubin and
SerializeToHsaco, --debug-only=serialize-to-blom will dump the
intermediate compiler result for both of these passes.
In addition, if LLVM options such as --stop-after are used to control
the GPU kernel compilation process, the debug output will contain the
appropriate intermediate IR.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D117519
PDLDialect being a somewhat user-facing dialect and whose ops contain exclusively other PDL ops in their regions can take advantage of `OpAsmOpInterface` to provide nicer IR.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117828
This commits explicitly states that negative values and values exceeding
vector dimensions are allowed in vector.create_mask (but not in
vector.constant_mask). These values are now truncated when
canonicalizing vector.create_mask to vector.constant_mask.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D116069
When a Builder methods accepts multiple InsertPoints, when both point to
the same position, inserting instructions at one position will "move" the
other after the inserted position since the InsertPoint is pegged to the
instruction following the intended InsertPoint. For instance, when
creating a parallel region at Loc and passing the same position as AllocaIP,
creating instructions at Loc will "move" the AllocIP behind the Loc
position.
To avoid this ambiguity, add an assertion checking this condition and
fix the unittests.
In case of AllocaIP, an alternative solution could be to implicitly
split BasicBlock at InsertPoint, using the first as AllocaIP, the second
for inserting the instructions themselves. However, this solution is
specific to AllocaIP since AllocaIP will always have to be first. Hence,
this is an argument to generally handling ambiguous InsertPoints as API
sage error.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117226
When computing the new type of a collapse_shape operation, we need to at least
take into account whether the type has an identity layout, in which case we can
easily support dynamic strides. Otherwise, the canonicalizer creates invalid
IR.
Longer term, both the verifier and the canoncializer need to be extended to
support the general case.
Differential Revision: https://reviews.llvm.org/D117772
Earlier `computeSingleVarRepr` was returning a pair of upper bound and
lower bound indices of the inequality contraints that can be expressed
as a floordiv of an affine function. The equality expression can also be
expressed as a floordiv but contains only one index and hence the `LocalRepr`
class is introduced to facilitate this.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117430
This commit is the first step towards unifying core bufferization and One-Shot Bufferize.
This commit does not move over the implementations of BufferizableOpInterface yet. This will be done in separate commits. This change does also not move the unit tests yet. The tests will be moved together with op interface implementations and split into separate files.
Differential Revision: https://reviews.llvm.org/D117641
BlockArguments gained the ability to have locations attached a while ago, but they
have always been optional. This goes against the core tenant of MLIR where location
information is a requirement, so this commit updates the API to require locations.
Fixes#53279
Differential Revision: https://reviews.llvm.org/D117633
Previously the optional locations of function arguments were dropped in
`parseFunctionArgumentList`. This CL adds another output argument to the
function through which they are now returned. The values are then plumbed
through as an array of optional locations in the various places.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117604
When the printer is requested to elide large constant, we emit an opaque
attribute instead. This patch fills the dialect name with
"elided_large_const" instead of "_" to remove some user confusion when
they later try to consume it.
Differential Revision: https://reviews.llvm.org/D117711
Benchmarks show that memref::CopyOp is curently up to 200x slower than
tiled and vectorized versions of linalg::Copy.
Add a temporary flag to allow comprehensive bufferize to generate a
linalg::GenericOp that implements a copy until this performance bug is
resolved.
Differential Revision: https://reviews.llvm.org/D117696
The code in `BufferizableOpInterface`'s header/source no longer contains any analysis code. This makes it easier to run the bufferization with a different analysis or without any analysis.
Differential Revision: https://reviews.llvm.org/D117478
This separates the analysis (and its helpers/data structures) more clearly from the rest of the bufferization.
Differential Revision: https://reviews.llvm.org/D117477
This change adds full python bindings for PDL, including types and operations
with additional mixins to make operation construction more similar to the PDL
syntax.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D117458
Also move `createAlloc` and related helper functions out of BufferizationState. The goal is to make BufferizationState as small as possible. (Code cleanup)
Differential Revision: https://reviews.llvm.org/D117476
If not allow-return-memref, raise an error if a new memory allocation is returned/yielded from a block. We do not check for new allocations directly, but for ops that yield/return values that are not equivalent to values that are defined outside of the current of the block.
Note: We still need to check that scf.for yield values and bbArgs are aliasing to ensure that getAliasingOpOperand/getAliasingOpResult is correct.
Differential Revision: https://reviews.llvm.org/D116687
This op is needed for unit testing in a subsequent revision. (This is the first op that has a block that yields equivalent values via the op's results.)
Note: Bufferization of scf.execute_region ops with multiple blocks is not yet supported.
Differential Revision: https://reviews.llvm.org/D117424
When the min/max are the total range of the value, it is a no-op as the values
are already restricted to that range.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D117625
Try to fix the error:
mlir/lib/Parser/Parser.cpp:553:14: error: cannot call member function 'mlir::InFlightDiagnostic mlir::detail::Parser::emitError(llvm::SMLoc, const llvm::Twine&)' without object
<< "operation location alias was never defined";
This commit refactors the FunctionLike trait into an interface (FunctionOpInterface).
FunctionLike as it is today is already a pseudo-interface, with many users checking the
presence of the trait and then manually into functionality implemented in the
function_like_impl namespace. By transitioning to an interface, these accesses are much
cleaner (ideally with no direct calls to the impl namespace outside of the implementation
of the derived function operations, e.g. for parsing/printing utilities).
I've tried to maintain as much compatability with the current state as possible, while
also trying to clean up as much of the cruft as possible. The general migration plan for
current users of FunctionLike is as follows:
* function_like_impl -> function_interface_impl
Realistically most user calls should remove references to functions within this namespace
outside of a vary narrow set (e.g. parsing/printing utilities). Calls to the attribute name
accessors should be migrated to the `FunctionOpInterface::` equivalent, most everything
else should be updated to be driven through an instance of the interface.
* OpTrait::FunctionLike -> FunctionOpInterface
`hasTrait` checks will need to be moved to isa, along with the other various Trait vs
Interface API differences.
* populateFunctionLikeTypeConversionPattern -> populateFunctionOpInterfaceTypeConversionPattern
Fixes#52917
Differential Revision: https://reviews.llvm.org/D117272
The chunk size in schedule clause is one integer expression, which can
be either constant integer or integer variable. Fix schedule clause in
MLIR Op Def to support integer expression with different bit width.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D116073
The only benefit of FunctionPass is that it filters out function
declarations. This isn't enough to justify carrying it around, as we can
simplify filter out declarations when necessary within the pass. We can
also explore with better scheduling primitives to filter out declarations
at the pipeline level in the future.
The definition of FunctionPass is left intact for now to allow time for downstream
users to migrate.
Differential Revision: https://reviews.llvm.org/D117182
The input type of a linalg.generic can be less dynamic than its output
type. If this is the case moving a reshape across the generic op would
create invalid IR, as expand_shape cannot expand arbitrary dynamic
dimensions.
Check that the reshape is actually valid before creating the
expand_shape. This exposes the existing verification logic in reshape
utils and removes the incomplete custom implementation in fusion.
Differential Revision: https://reviews.llvm.org/D116600
Before this patch, deferred location in operation like `test.pretty_printed_region` would
break the parser, and so the IR round-trip.
Depends On D117088
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117413
This will allow to return to the client of `parseLocationInstance` a location that is resolved later.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117088
The current state of the top level Analysis/ directory is that it contains two libraries;
a generic Analysis library (free from dialect dependencies), and a LoopAnalysis library
that contains various analysis utilities that originated from Affine loop transformations.
This commit moves the LoopAnalysis to the more appropriate home of `Dialect/Affine/Analysis/`,
given the use and intention of the majority of the code within it. After the move, if there
are generic utilities that would fit better in the top-level Analysis/ directory, we can move
them.
Differential Revision: https://reviews.llvm.org/D117351
The leading space that is always printed at the beginning of regions is not consistent with other parts of the printing API. Moreover, this leading space can lead to undesirable assembly formats:
```
attr-dict-with-keyword $region
```
Prints as:
```
// Two spaces between `}` and `{`
attributes {foo} { ... }
```
Moreover, the leading space results in the odd generic op format:
```
"test.op"() ( {...}) : () -> ()
```
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117411
The translation from the MLIR LLVM dialect to LLVM IR includes a mechanism that
ensures the successors of a block to be different blocks in case block
arguments are passed to them since the opposite cannot be expressed in LLVM IR.
This mechanism previously only worked for functions because it was written
prior to the introduction of other region-carrying operations such as the
OpenMP dialect, which also translates directly to LLVM IR. Modify this
mechanism to handle all regions in the module and not only functions.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D117548
Also make the ODS Operator class have const iterator, and use const
references for existing API taking Operator by reference.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117516
The majority of dialects reimplement the same boilerplate over and over,
switching the default makes it for better discoverability and make it simpler
to implement new dialects.
Differential Revision: https://reviews.llvm.org/D117524
Coroutine lowering always takes the natural alignment when spilling to
the frame (issue #53148) so using AVX2 or AVX512 in a coroutine doesn't
work. Always overalign to 64 bytes to avoid this issue until we have a
better solution.
Differential Revision: https://reviews.llvm.org/D117501
This revision fixes a bug where the iterative algorithm would walk back def-use chains to an incorrect operand.
This exposed opportunities for a larger refactoring and behavior improvement.
The new algorithm has improved folding behavior and proceeds by tracking both the
permutation of the extraction position and the internal vector permutation.
Multiple partial intersection cases with a candidate insertOp are supported.
The refactoring of the implementation should also help it generalize to strided insert/extract op.
This also subsumes the previous `foldExtractOpFromTranspose` which is now a simple special case and can be deleted.
Differential Revision: https://reviews.llvm.org/D117322
In some cases, the result of an initTensorOp may have an attribute.
However, the Attribute was not passed to `inferResultType`, failing the
verifier. Therefore, propagate the Attribute to `inferResultType`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D117192
The execution engine would not be functional anyway, we're already
disabling the tests, this also disable the rest of the code.
Anecdotally this reduces the number of static library built when the
builtin target is disabled goes from 236 to 218.
Here is the complete list of LLVM targets built when running
`ninja check-mlir`:
libLLVMAggressiveInstCombine.a
libLLVMAnalysis.a
libLLVMAsmParser.a
libLLVMBinaryFormat.a
libLLVMBitReader.a
libLLVMBitstreamReader.a
libLLVMBitWriter.a
libLLVMCore.a
libLLVMDebugInfoCodeView.a
libLLVMDebugInfoDWARF.a
libLLVMDemangle.a
libLLVMFileCheck.a
libLLVMFrontendOpenMP.a
libLLVMInstCombine.a
libLLVMIRReader.a
libLLVMMC.a
libLLVMMCParser.a
libLLVMObject.a
libLLVMProfileData.a
libLLVMRemarks.a
libLLVMScalarOpts.a
libLLVMSupport.a
libLLVMTableGen.a
libLLVMTableGenGlobalISel.a
libLLVMTextAPI.a
libLLVMTransformUtils.a
Differential Revision: https://reviews.llvm.org/D117287
`getNumRegionInvocations` was originally added for the async reference counting, but turned out to be not useful, and currently is not used anywhere (couldn't find any uses in public github repos). Removing dead code.
Reviewed By: Mogball, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117347
- Generic visitors invoke operation callbacks before/in-between/after visiting the regions
attached to an operation and use a `WalkStage` to indicate which regions have been
visited.
- This can be useful for cases where we need to visit the operation in between visiting
regions attached to the operation.
Differential Revision: https://reviews.llvm.org/D116230
Clean up return value on affineDataCopyGenerate utility. Return the
actual success/failure status instead of the "number of bytes" which
isn't being used in the codebase in any way. The success/failure status
wasn't being sent out earlier.
Differential Revision: https://reviews.llvm.org/D117209
By default, copies are inserted right before the tensor OpOperand use. With this change, `bufferize` implementation can change the insertion point. This is needed for some ops where it would be illegal to insert a copy right before the use.
Differential Revision: https://reviews.llvm.org/D117291
LLVM dialect supports terminators with repeated successor blocks that take
different operands. This cannot be directly expressed in LLVM IR though since
it uses the number of the predecessor block to differentiate values in its PHI
nodes. Therefore, the translation to LLVM IR inserts dummy blocks to forward
arguments in case of repeated succesors with arguments. The insertion works
correctly. However, when connecting PHI nodes to their source values, the
assertion of the insertion having worked correctly was incorrect: it would only
trigger if repeated blocks were adjacent in the successor list (not guaranteed
by anything) and would not check if the successors have operands (no need for
dummy blocks in absence of operands since no PHIs are being created). Change
the assertion to only trigger in case of duplicate successors with operands,
and don't expect them to be adjacent.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D117214
When constructing an OperationName, the overwhelming majority of
cases are from registered operations. This revision adds a non-locked
lookup into the currently registered operations, which prevents locking
in the common case. This revision also optimizes several uses of
RegisteredOperationName that expect the operation to be registered,
e.g. such as in OpBuilder.
These changes provides a reasonable speedup (5-10%) in some
compilations, especially on platforms where locking is expensive.
Differential Revision: https://reviews.llvm.org/D117187
This guarantees the preconditions of fromCOO; whereas prior to this, one could call the constructor directly with an unsorted tensor, which would cause fromCOO to misbehave.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D117167
Adding the optional decompositions have been verified to improve memory
usage on common models. Added the decomposition to the default tosa to linalg
passes.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D117175
Spotted this in a final grep of projects I don't usually build before
pushing https://reviews.llvm.org/D115380, which makes
`SmallVector::set_size()` private.
Update to `truncate()`, a new-ish variant of `resize()` that asserts the
new size is not bigger and that avoids pulling in the allocation and
initialization code for growing. Doesn't really look like the perf
impact of that would matter here, but since `dirLength` is known to be a
smaller size then we might as well.
Differential Revision: https://reviews.llvm.org/D117073
Enable ReassociatingReshapeOpConversion with "non-identity" layouts.
This removes an early-return in this function, which seems unnecessary and is
preventing some memref.collapse_shape from converting to LLVM (see included lit test).
It seems unnecessary because the return message says "only empty layout map is supported"
but there actually is code in this function to deal with non-empty layout maps. Maybe
it refers to an earlier state of implementation and is just out of date?
Though, there is another concern about this early return: the condition that it actually
checks, `{src,dst}MemrefType.getLayout().isIdentity()`, is not quite the same as what the
return message says, "only empty layout map is supported". Stepping through this
`getLayout().isIdentity()` code in GDB, I found that it evaluates to `.getAffineMap().isIdentity()`
which does (AffineMap.cpp:271):
```
if (getNumDims() != getNumResults())
return false;
```
This seems that it would always return false for memrefs of rank greater than 1 ?
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114808
LLVM Dialect Constant Op translations assume that if the attribute is a
vector, it's a fixed length one, generating an invalid translation for
constant scalable vector initializations.
Differential Revision: https://reviews.llvm.org/D117125
* This is useful when you need to build mixed array from external static/dynamic arrays (e.g. from adaptor during dialect conversion)
* Also, to reduce C++ code in td and generated files
Differential Revision: https://reviews.llvm.org/D117106
This overload parses a pipeline string that contains the anchor operation type, and returns an OpPassManager
corresponding to the provided pipeline. This is useful for various situations, such as dynamic pass pipelines
which are not anchored within a parent pass pipeline.
fixes#52813
Differential Revision: https://reviews.llvm.org/D116525
Dynamic batch for rescale, gather, max_pool, avg_pool, conv2D and depthwise_conv2D. Split helper functions into a separate header file.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D117031
ShapedType was created in a time before interfaces, and is one of the earliest
type base classes in the ecosystem. This commit refactors ShapedType into
an interface, which is what it would have been if interfaces had existed at that
time. The API of ShapedType and it's derived classes are essentially untouched
by this refactor, with the exception being the API surrounding kDynamicIndex
(which requires a sole home).
For now, the API of ShapedType and its name have been kept as consistent to
the current state of the world as possible (to help with potential migration churn,
among other reasons). Moving forward though, we should look into potentially
restructuring its API and possible its name as well (it should really have "Interface"
at the end like other interfaces at the very least).
One other potentially interesting note is that I've attached the ShapedType::Trait
to TensorType/BaseMemRefType to act as mixins for the ShapedType API. This
is kind of weird, but allows for sharing the same API (i.e. preventing API loss from
the transition from base class -> Interface). This inheritance doesn't affect any
of the derived classes, it is just for API mixin.
Differential Revision: https://reviews.llvm.org/D116962
This field allows for defining a code block that is placed in both the interface
and trait declarations. This is very useful when defining a set of utilities to
expose on both the Interface class and the derived attribute/operation/type.
In non-static methods, `$_attr`/`$_op`/`$_type` (depending on the type of
interface) may be used to refer to an instance of the IR entity. In the interface
declaration, this is an instance of the interface class. In the trait declaration,
this is an instance of the concrete entity class (e.g. `IntegerAttr`, `FuncOp`, etc.).
Differential Revision: https://reviews.llvm.org/D116961
Apply scale may encounter scalar, tensor, or vector operations. Expand the
lowering so that it can lower arbitrary of container types.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D117080
This method simply forwards to populateFunctionLikeTypeConversionPattern,
which is more general. This also helps to remove special treatment of FuncOp from
DialectConversion.
Differential Revision: https://reviews.llvm.org/D116624
This patch fixes:
mlir/lib/Dialect/Arithmetic/Transforms/ExpandOps.cpp:161:52: error:
'static_assert' with no message is a C++17 extension
[-Werror,-Wc++17-extensions]
There have been a few API pieces remaining to allow for a smooth transition for
downstream users, but these have been up for a few months now. After this only
the C API will have reference to "Identifier", but those will be reworked in a followup.
The main updates are:
* Identifier -> StringAttr
* StringAttr::get requires the context as the first parameter
- i.e. `Identifier::get("...", ctx)` -> `StringAttr::get(ctx, "...")`
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D116626
If any of the operands is NaN, return the operand instead of a new constant.
When the rhs operand is a constant, the second arith.cmpf+select ops will be folded away.
https://reviews.llvm.org/D117010 marks the two ops commutative, which will place the constant on the rhs.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D117011
Enable constant folding of ops within the math dialect, and introduce constant folders for ceil and log2
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117085
This change makes it possible to use a different buffer deallocation strategy. E.g., `-buffer-deallocation` can be used, which also works for allocations that are not in destination-passing style.
Differential Revision: https://reviews.llvm.org/D117096
This op is an example for how to deal with ops who's OpResult may aliasing with one of multiple OpOperands.
Differential Revision: https://reviews.llvm.org/D116868
Add inliner interface for GPU dialect. The interface marks all GPU
dialect ops legal to inline anywhere.
Differential Revision: https://reviews.llvm.org/D116889
Enable constant folding of ops within the math dialect, and introduce constant folders for ceil and log2
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117085
Given a while loop whose condition is given by a cmp, don't recomputed the comparison (or its inverse) in the after region, instead use a constant since the original condition must be true if we branched to the after region.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117047
Most convolution operations need explicit padding of the input to
ensure all accesses are inbounds. In such cases, having a pad
operation can be a significant overhead. One way to reduce that
overhead is to try to fuse the pad operation with the producer of its
source.
A sequence
```
linalg.generic -> linalg.pad_tensor
```
can be replaced with
```
linalg.fill -> tensor.extract_slice -> linalg.generic ->
tensor.insert_slice.
```
if the `linalg.generic` has all parallel iterator types.
Differential Revision: https://reviews.llvm.org/D116418
This patch fixes:
mlir/lib/Dialect/Tosa/Transforms/TosaOptionalDecompositions.cpp:28:8:
error: 'runOnFunction' overrides a member function but is not marked
'override' [-Werror,-Wsuggest-override]
Moved all TOSA decomposition patterns so that they can be optionally populated
and used by external rewrites. This avoids decomposing TOSa operations when
backends may benefit from the non-decomposed version.
Reviewed By: rsuderman, mehdi_amini
Differential Revision: https://reviews.llvm.org/D116526
Given an if of the form, simplify it by eliminating the not and swapping the regions
scf.if not(c) {
yield origTrue
} else {
yield origFalse
}
becomes
scf.if c {
yield origFalse
} else {
yield origTrue
}
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D116990
Previously, CallOps did not have any aliasing OpResult/OpOperand pairs. Therefore, CallOps were mostly ignored by the analysis and buffer copies were not inserted when necessary.
This commit introduces the following changes:
* Function bbArgs writable by default. A function can now be bufferized without inspecting its callers.
* Callers must introduce buffer copies of function arguments when necessary. If a function is external, the caller must conservatively assume that a function argument is modified by the callee after bufferization. If the function is not external, the caller inspects the callee to determine if a function argument is modified.
Differential Revision: https://reviews.llvm.org/D116457
Changed the remaining appearances of alloc to memref.alloc in several
documentation sections, since they lead to misunderstandings, if they
are used.
Differential Revision: https://reviews.llvm.org/D116999
D115722 added a DL spec to GPU modules. It happens that the DL
default interface implementation is sensitive to the name of the
DL spec attribute. This patch is fixing the name of the attribute
to be the expected one.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D116956
When the unroll factor is 1, we should only fail "unrolling" when the trip count also is determined to be 1 and it is unable to be promoted.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D115365
Given a select whose result is an i1, we can eliminate the conditional in the select completely by adding a few arithmetic operations.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D116839
This revision fixes SubviewOp, InsertSliceOp, ExtractSliceOp construction during bufferization
where not all offset/size/stride operands were properly specified.
A test that exhibited problematic behaviors related to incorrect memref casts is introduced.
Init tensor optimization is disabled in teh testing func bufferize pass.
Differential Revision: https://reviews.llvm.org/D116899
init_tensor elimination is arguably a pre-optimization that should be separated from comprehensive bufferization.
In any case it is still experimental and easily results in wrong IR with violated SSA def-use orderings.
Isolate the optimization behind a flag, separate the test cases and add a test case that would results in wrong IR.
Differential Revision: https://reviews.llvm.org/D116936
During iterative inlining of the functions in a multi-step call chain, the
inliner could add the same call operation several times to the worklist, which
led to use-after-free when this op was considered more than once.
Closes#52887.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D116820
We allow the omission of a map in memref.reinterpret_cast under the assumption,
that the cast might cast to an identity layout. This change adds verification
that the static knowledge that is present in the reinterpret_cast supports
this assumption.
Differential Revision: https://reviews.llvm.org/D116601
This patch changes the syntax of omp.atomic.read to take the address of
destination, instead of having the value in a result. This will allow
using omp.atomic.read operation within an omp.atomic.capture operation
thus making its implementation less complex.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D116396
Fix crash in the presence of yield values. Multiple fixes to affine loop
tiling pre-condition checks and return status. Do not signal pass
failure on a failure to tile since the IR is still valid. Detect index
set computation failure in checkIfHyperrectangular and return failure.
Replace assertions with proper status return. Move checks to an
appropriate place earlier in the utility before mutation happens.
Differential Revision: https://reviews.llvm.org/D116738
This patch moves PresburgerSet to Presburger/ directory. This patch is purely
mechincal, it only moves and renames functionality and tests.
This patch is part of a series of patches to move presburger functionality to
Presburger/ directory.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D116836
a32300a changed it to create a ThreadPool eagerly so that it gets reused
across buffers, however it also made it so that we create a ThreadPool
early even if we're not gonna use it later because of the command line
option `--mlir-disable-threading` is provided.
Fix#53056
Reland 45adf60802 after build fixes
Differential Revision: https://reviews.llvm.org/D116848
a32300a changed it to create a ThreadPool eagerly so that it gets reused
across buffers, however it also made it so that we create a ThreadPool
early even if we're not gonna use it later because of the command line
option `--mlir-disable-threading` is provided.
Fix#53056
Differential Revision: https://reviews.llvm.org/D116848