Its possible for the clamp to have invalid min/max values on its range. To fix
this we validate the range of the min/max and clamp to a valid range.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D108256
LLVM considers global variables marked as externals to be defined within the module if it is initialized (including to an undef). Other external globals are considered as being defined externally and imported into the current translation unit. Lowering of MLIR Global Ops does not properly propagate undefined initializers, resulting in a global which is expected to be defined within the current TU, not being defined.
Differential Revision: https://reviews.llvm.org/D108252
MSVC needs to know where to put the archive (.lib) as well as the runtime
(.dll). If left to the default location, multiple rules to generate the same
file will be produced, creating a Ninja error.
Differential Revision: https://reviews.llvm.org/D108181
* Rename ids to values in FlatAffineValueConstraints.
* Overall cleanup of comments in FlatAffineConstraints and FlatAffineValueConstraints.
Differential Revision: https://reviews.llvm.org/D107947
* Extract "value" functionality of `FlatAffineConstraints` into a new derived `FlatAffineValueConstraints` class. Current users of `FlatAffineConstraints` can use `FlatAffineValueConstraints` without additional code changes, thus NFC.
* `FlatAffineConstraints` no longer associates dimensions with SSA Values. All functionality that requires this, is moved to `FlatAffineValueConstraints`.
* `FlatAffineConstraints` no longer makes assumptions about where Values associated with dimensions are coming from.
Differential Revision: https://reviews.llvm.org/D107725
This method bitcasts a DenseElementsAttr elementwise to one of the same
shape with a different element type.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D107612
Reduction axis should come after all parallel axis to work with vectorization.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D108005
These operations are not lowered to from any source dialect and are only
used for redundant tests. Removing these named ops, along with their
associated tests, will make migration to YAML operations much more
convenient.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D107993
Expand ParallelLoopTilingPass with an inbound_check mode.
In default mode, the upper bound of the inner loop is from the min op; in
inbound_check mode, the upper bound of the inner loop is the step of the outer
loop and an additional inbound check will be emitted inside of the inner loop.
This was 'FIXME' in the original codes and a typical usage is for GPU backends,
thus the outer loop and inner loop can be mapped to blocks/threads in seperate.
Differential Revision: https://reviews.llvm.org/D105455
The primary pattern for this pass clones many operations from producers
to consumers. Doing this top down prevents duplicated work when a
producer has multiple consumers, if it also is consuming another
linalg.generic.
As an example, a chain of ~2600 generics that are fused into ~70
generics was resulting in 16255 pattern invocations. This took 14
seconds on one machine but takes only 0.3 seconds with top-down
traversal.
Differential Revision: https://reviews.llvm.org/D107818
While the changes are extensive, they basically fall into a few
categories:
1) Moving the TestDialect itself.
2) Updating C++ code in tablegen to explicitly use ::mlir, since it
will be put in a headers that shouldn't expect a 'using'.
3) Updating some generic MLIR Interface definitions to do the same thing.
4) Updating the Tablegen generator in a few places to be explicit about
namespaces
5) Doing the same thing for llvm references, since we no longer pick
up the definitions from mlir/Support/LLVM.h
Differential Revision: https://reviews.llvm.org/D88251
The approach for handling reductions in the outer most
dimension follows that for inner most dimensions, outlined
below
First, transpose to move reduction dims, if needed
Convert reduction from n-d to 2-d canonical form
Then, for outer reductions, we emit the appropriate op
(add/mul/min/max/or/and/xor) and combine the results.
Differential Revision: https://reviews.llvm.org/D107675
Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify what parts of a CanonicalLoopInfo is considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives.
CanonicalLoopInfo are now invalidated after it does not describe canonical loop anymore and asserts when trying to use it afterwards.
In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid that the impression that they insert something from scratch at that location where in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (containing side-effects in form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0
Also see discussion in D105706.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107540
Move StaticVerifierFunctionEmitter to CodeGenHelper.h so that it can be
used for both ODS and DRR.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D106636
Using the python API to easily set up sparse kernels, this test
exhaustively builds, compilers, and runs SpMM for all annotations
on a sparse tensor, making sure every version generates the correct
result. This test also illustrates using the python API to set up
a sparse kernel and sparse compilation.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D107943
This reverts the revert 28c04794df.
The failing MLIR test that caused the revert should be fixed in this
version.
Also includes a PPC test fix previously in 1f87c7c478.
This can be useful when one needs to know which unrolled iteration an Op belongs to, for example, conveying noalias information among memory-affecting ops in parallel-access loops.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D107789
Existing linalg.conv2d is not well optimized for performance. Changed to a
version that is more aligned for optimziation. Include the corresponding
transposes to use this optimized version.
This also splits the conv and depthwise conv into separate implementations
to avoid overly complex lowerings.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D107504
This is a bit cleaner and removes issues with 2d vectors. It also has a
big impact on constant folding, hence the test changes.
Differential Revision: https://reviews.llvm.org/D107896
The conversion is a straightforward one-to-one mapping with optional unrolling
for nD vectors, similarly to other cast operations.
Depends On D107889
Reviewed By: cota, akuegel
Differential Revision: https://reviews.llvm.org/D107891
The constraint was checking that the type is not an LLVM structure or array
type, but was not checking that it is an LLVM-compatible type, making it accept
incorrect types. As a result, some LLVM dialect ops could process values that
are not compatible with the LLVM dialect leading to further issues with
conversions and translations that assume all values are LLVM-compatible. Make
LLVM_AnyNonAggregate only accept LLVM-compatible types.
Reviewed By: cota, akuegel
Differential Revision: https://reviews.llvm.org/D107889
Reimplement this function in terms of `composeMatchingMap`.
Also fix a bug in `composeMatchingMap` where local dims of `this` could be missing in `localCst`.
Differential Revision: https://reviews.llvm.org/D107813
This function overload is similar to the existing `FlatAffineConstraints::addLowerOrUpperBound`. It constrains a dimension based on an affine map. However, in contrast to the other overloading, it does not attempt to align dimensions/symbols of the affine map with the dimensions/symbols of the constraint set. Instead, dimensions/symbols are expected to already be aligned.
Differential Revision: https://reviews.llvm.org/D107727
This function aligns an affine map (and operands) with given dims and syms SSA values.
This is useful in conjunction with `FlatAffineConstraints::addLowerOrUpperBound`, which requires the `boundMap` to be aligned with the constraint set's dims and syms.
Differential Revision: https://reviews.llvm.org/D107728
Some folding cases are trivial to fold away, specifically no-op cases where
an operation's input and output are the same. Canonicalizing these away
removes unneeded operations.
The current version includes tensor cast operations to resolve shape
discreprencies that occur when an operation's result type differs from the
input type. These are resolved during a tosa shape propagation pass.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D107321
Dilation only requires increasing the padding on the left/right side of the
input, and including dilation in the convolution. This implementation still
lacks support for strided convolutions.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D107680
When using an attribute where a value is expected previously this would fail
complaining about unbound symbol. Instead make error clear and mention common
failure reason.
This patch enables normalizing memrefs with MemRef_ReinterpretCastOp by
adding MemRefsNormalizable trait in the Op definition.
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D107425
This enables querying shapes/values as shapes without mutating the IR
directly (e.g., towards enabling doing inference in analysis &
application steps, inferring function shape with constant from callsite,
...). Add a new ShapeAdaptor that abstracts over whether shape is from
Type or ShapedTypeComponents or DenseIntElementsAttribute. This adds new
accessors to ValueShapeRange to get Shape and value as shape, but
doesn't restrict or remove the previous way of accessing Type via the
Value for now, that does mean a less refined shape could be accidentally
queried and will be restricted in follow up.
Currently restricted Value query to what can be represented as Shape. So
only supports cases where constant subgraph evaluation's output is a
shape. I had considered making it more general, but without TBD extern
attribute concept or some such a user cannot today uniformly avoid
overhead.
Update TOSA ops and also the shape inference pass.
Differential Revision: https://reviews.llvm.org/D107768
Replace some code snippets With scf::ForOp methods. Additionally,
share a listener at one more point (although this pattern is still
not safe to roll back currently)
Differential Revision: https://reviews.llvm.org/D107754
The following constructor call (and others) used to be ambiguous:
```
FlatAffineConstraints constraints(0, 0, 0);
```
Differential Revision: https://reviews.llvm.org/D107726
Looks "under the hood" of the sparse stogage schemes.
Users should typically not be interested in these details
(hey, that is why we have "sparse compilers"!) but this
test makes sure the compact contents are as expected.
Reviewed By: ThomasRaoux, bixia
Differential Revision: https://reviews.llvm.org/D107683
Implements lowering dense to sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107681
These ops were not ported to the nD vector conversion when it was introduced
and nobody needed them so far.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D107750
Instead, include `<cstdlib>` which is the canonical header containing
the declaration of `alloca()`.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D107699
Perform scalar constant propagation for FPTruncOp only if the resulting value can be represented without precision loss or rounding.
Example:
%cst = constant 1.000000e+00 : f32
%0 = fptrunc %cst : f32 to bf16
-->
%cst = constant 1.000000e+00 : bf16
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D107518
Use new return type for `OpAsmDialectInterface::getAlias`:
* `AliasResult::NoAlias` if an alias was not provided.
* `AliasResult::OverridableAlias` if an alias was provided, but it might be overriden by other hook.
* `AliasResult::FinalAlias` if an alias was provided and it should be used (no other hooks will be checked).
In that case `AsmPrinter` will use either the first alias with `FinalAlias` result or
the last alias with `OverridableAlias` result (it depends on dialect array order).
Used `OverridableAlias` result for `BuiltinOpAsmDialectInterface`.
Use case: provide more informative alias for built-in attributes like `AffineMapAttr`
instead of generic "map<N>".
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D107437
Avoiding absolute imports allows the code to be relocatable (which is used for out of tree integrations).
Differential Revision: https://reviews.llvm.org/D107617
CastOp::areCastCompatible does not check whether casts are definitely compatible.
When going from dynamic to static offset or stride, the canonicalization cannot
know whether it is really cast compatible. In that case, it can only canonicalize
to an alloc plus copy.
Differential Revision: https://reviews.llvm.org/D107545
Tested with gcc-10. Other compilers may generate additional warnings. This does not fix all warnings. There are a few extra ones in LLVMCore and MLIR.
* `OpEmitter::getAttrNameIndex`: -Wunused-function (function is private and not used anywhere)
* `PrintOpPass` copy constructor: -Wextra ("Base class should be explicitly initialized in the copy constructor")
* `LegalizeForLLVMExport.cpp`: -Woverflow (overflow is expected, silence warning by making the cast explicit)
Differential Revision: https://reviews.llvm.org/D107525
The file in which `Region::viewGraph` is defined has changed. This should have been updated with D106342.
Differential Revision: https://reviews.llvm.org/D107517
There is a case in EmitDiagnostics where the filter check is bypassed (when locationStack is empty). Filter might also be bypassed when loc instead of showableLoc is added to the locationStack.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D106522
A text file may be comprised of many different "chunks", when
the input file contains the `// -----` split markers. We don't
need to use a unique MLIRContext per chunk, as having
separate contexts is intended to allow for easy unloading of
unused data and all chunks have the same lifetime (tied to the
input file). This commit uses one context for the entire file,
greatly reducing memory consumption in certain situations (up
to 70%).
Differential Revision: https://reviews.llvm.org/D107488
'mlir::DiagnosticEngine::DiagnosticEngine(const mlir::DiagnosticEngine&)' is implicitly deleted because the default definition would be ill-formed.
Reviewed By: rdzhabarov
Differential Revision: https://reviews.llvm.org/D107287
Making sure the AMX dialect webpage reads better with a short introduction on the purpose of this dialect.
Reviewed By: grosul1, bondhugula
Differential Revision: https://reviews.llvm.org/D107419
There are many downstream users of llvm::dbgs, which is defined in Debug.h. Before D106342, many users included that dependency transitively via the now deleted ViewRegionGraph.h. Adding it back to Transforms/Passes.h for convenience.
Differential Revision: https://reviews.llvm.org/D107451
* Add new pass option `print-data-flow-edges`, default value `true`.
* Add new pass option `print-control-flow-edges`, default value `false`.
* Remove `PrintCFGPass`. Same functionality now provided by
`PrintOpPass`.
Differential Revision: https://reviews.llvm.org/D106342
The existing vector transforms reduce the dimension of transfer_read
ops. However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector. This handles this case.
Note that in the longer term, it may be preferaby to support
zero-dimension vectors. see
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.
Differential Revision: https://reviews.llvm.org/D103432
This test ensures that an error is generated from the Python side when running a module pass on a function. The test used to instantiate ViewOpGraph, however, this pass was changed into a general "any op" pass in D106253. Therefore, a different pass must be used in this test.
Differential Revision: https://reviews.llvm.org/D107424
* New pass option `max-label-len`: Truncate attributes/result types that have more #chars.
* New pass option `print-attrs`: Activate/deactivate rendering of attributes.
* New pass option `printResultTypes`: Activate/deactivate rendering of result types.
Differential Revision: https://reviews.llvm.org/D106337
* Visualize blocks and regions as subgraphs.
* Generate DOT file directly instead of using `GraphTraits`. `GraphTraits` does not support subgraphs.
Differential Revision: https://reviews.llvm.org/D106253
Also makes style consistent with the "surrounding"
text that appears on one webpage in MLIR doc
Reviewed By: grosul1
Differential Revision: https://reviews.llvm.org/D107418
We can propagate the shape from tosa.cond_if operands into the true/false
regions then through the connected blocks. Then, using the tosa.yield ops
we can determine what all possible return types are.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105940
Handles shape inference for identity, cast, and rescale. These were missed
during the initialy elementwise work. This includes resize shape propagation
which includes both attribute and input type based propagation.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105845
This prevents an explosion of threads, given that each file gets its own context and thus its own thread pool. We don't really need a thread pool for the LSP contexts anyways, so it's better to just disable threading.
mlir/test/transforms/loop-fusion.mlir is too big and is split into mlir/test/transforms/loop-fusion.mlir, mlir/test/transforms/loop-fusion-2.mlir, mlir/test/transforms/loop-fusion-3.mlir
and mlir/test/transforms/loop-fusion-4.mlir. Further tests can be added in mlir/test/transforms/loop-fusion-4.mlir
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D106473
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in 2.17.1 of the OpenMP 5 standard.
A name and hint can be specified. The name is a global entity or has
external linkage, it is modelled as a FlatSymbolRefAttr. Hint is
modelled as an integer enum attribute.
Also lowering to LLVM IR using the OpenMP IRBuilder.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107135