If I remember correctly this wasn't done previously because dim used to
be in the memref dialect.
Differential Revision: https://reviews.llvm.org/D111651
Some random changes that were hanging around in my workspace. Also,
a tiny step towards creating a header file for the sparse utils lib.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111589
Add a switch to the codegen strategy to enable/disable the vector transfer lowering, and disable it by default.
Differential Revision: https://reviews.llvm.org/D111647
Add the vector transfer patterns and introduce the max transfer rank option on the codegen strategy.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111635
This revision takes advantage of the recently added support for 0-d transfers and vector.multi_reduction that return a scalar.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D111626
This revision updates the op semantics, printer, parser and verifier to allow 0-d transfers.
Until 0-d vectors are available, such transfers have a special form that transits through vector<1xt>.
This is a stepping stone towards the longer term work of adding 0-d vectors and will help significantly reduce corner cases in vectorization.
Transformations and lowerings do not yet support this form, extensions will follow.
Differential Revision: https://reviews.llvm.org/D111559
vector.multi_reduction currently does not allow reducing down to a scalar.
This creates corner cases that are hard to handle during vectorization.
This revision extends the semantics and adds the proper transforms, lowerings and canonicalizations to allow lowering out of vector.multi_reduction to other abstractions all the way to LLVM.
In the future, when we also allow 0-d vectors, scalars will still be relevant: 0-d vectors and scalars are not equivalent on all hardware.
In the process, splice out the implementation patterns related to vector.multi_reduce into a new file.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D111442
`hint-expression` is an IntegerAttr, because it can be a combination of multiple values from the enum `omp_sync_hint_t` (Section 2.17.12 of OpenMP 5.0)
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D111360
Make `raw_ostream operator<<` follow const-correctness semantics,
since this is a requirement of the FormatVariadic implementation.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D111547
* Change the callback signature from `bool(Operation *)` to `Optional<bool>(Operation *)`.
* `addDynamicallyLegalOp` adds the callback to a chain.
* If a callback returns an empty `Optional`, the next callback in the chain is called (see the sketch below).
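For illustration, a hedged sketch of how the chained callbacks might be used (the op choice and the surrounding conversion setup are placeholders; `llvm::None` reflects the `Optional` type in use at the time of this change):
```
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

static void configureTarget(ConversionTarget &target) {
  // First callback: only decides legality for ops carrying a "legal" attribute.
  target.addDynamicallyLegalOp<ModuleOp>(
      [](Operation *op) -> Optional<bool> {
        if (op->hasAttr("legal"))
          return true;
        return llvm::None; // defer the decision to the next callback
      });
  // Second callback: fallback for everything the first callback deferred.
  target.addDynamicallyLegalOp<ModuleOp>(
      [](Operation *op) -> Optional<bool> { return false; });
}
```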
Differential Revision: https://reviews.llvm.org/D110487
Call `printType(subElemType)` instead of `os << subElemType` for them.
This allows handling type aliases inside complex types.
As a side effect, this fixed `test.int` parsing.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D111536
* Call `llvm_canonicalize_cmake_booleans` for all CMake options,
which are propagated to `lit.local.cfg` files.
* Use Python native boolean values instead of strings for such options.
This fixes cases where CMake variables have values other than `ON` (like `TRUE`).
This might happen due to IDE integration or due to CMake preset usage.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D110073
Operations that have the InferTypeOpInterface trait can now omit the return
types in their custom assembly formats.
Differential Revision: https://reviews.llvm.org/D111326
This relaxes vectorization of dense memrefs a bit so that affine expressions
are allowed in more outer dimensions. Vectorization of non-unit-stride
references is disabled though, since this seems ineffective anyway.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111469
Until now, we only had documentation oriented towards developers of the
bindings. Provide some documentation for users of the bindings that don't want
or need to understand the inner workings.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111540
1. Add support for vectorizing induction variables of loops that are
not mapped to any vector dimension in the SuperVectorize pass.
2. Fix a bug in getForInductionVarOwner.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D111370
This test is crashing 9 out of 10 runs in CI, but I can't reproduce
locally right now. Disabling to get the CI back to green and avoid
backsliding with more ASAN issues that would go unnoticed.
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
Instead of hard-coding results for both Intel and AMD, let's relax
the checks to simplify the test while supporting both implementations.
Note that:
- If a new hardware implementation comes up in the future, it is likely
to pass the relaxed tests, i.e. no future maintenance burden for us.
- If something terribly wrong happens (e.g. instead of rsqrt we
execute 1/sqrt), the tests will probably catch it, since the relaxed
tests expect low precision (e.g. rsqrt(1) != 1.0).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D111461
TensorLiteralParser::getHexAttr does an isIntOrIndexOrFloat check and properly handles index elements, but TensorLiteralParser::getAttr, which calls into it, has a mismatched check. This just makes the checks match so that index element attrs can be parsed when they are of tensor type.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D111374
The purpose of this revision is to make "write into non-writable memory" conflict detection easier to understand.
The main idea is that there is a conflict in the case of inplace bufferization if:
1. Someone writes to (an alias of) opOperand or opResult, or the to-be-bufferized op itself writes.
2. And, opOperand or opResult aliases a non-writable buffer.
Differential Revision: https://reviews.llvm.org/D111379
This commit adds a pattern to perform constant folding on linalg
generic ops which are essentially transposes. We see real cases
where model importers may generate such patterns.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D110597
Introduce support for accepting ops instead of values when constructing ops. A
single-result op can be used instead of a value, including in lists of values,
and any op can be used instead of a list of values. This is similar to, but
more powerful than, the C++ API that allows for implicitly casting an OpType to
Value if it is statically known to have a single result - the cast in Python is
based on the op dynamically having a single result, and also handles the
multi-result case. This allows building IR in a more concise way:
op = dialect.produce_multiple_results()
other = dialect.produce_single_result()
dialect.consume_multiple_results(other, op)
instead of having to access the results manually
op = dialect.produce_multiple_results()
other = dialect.produce_single_result()
dialect.consume_multiple_results(other.result, op.operation.results)
The dispatch is implemented directly in Python and is triggered automatically
for autogenerated OpView subclasses. Extension OpView classes should use the
functions provided in ods_common.py if they want to implement this behavior.
An alternative could be to implement the dispatch in the C++ bindings code, but
it would require to forward opaque types through all Python functions down to a
binding call, which makes it hard to inspect them in Python, e.g., to obtain
the types of values.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D111306
The convolution op is one of the remaining hard-coded Linalg operations that have no region attached. It became obsolete due to the OpDSL convolution operations. Removing it allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths.
Tests that were only needed due to the specialized implementation are removed. Tiling and fusion tests are replaced by variants using linalg.conv_2d.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111233
This reverts commit 7aebdfc4fc.
The build is broken with errors like:
GPUPasses.cpp:(.text.pybind11_object_init[pybind11_object_init]+0x118): undefined reference to `PyExc_TypeError'
After CMake 3.18, we are able to limit the scope of the
find_package(Python3 ...) search to just Development.Module. Searching
for Development will fail in manylinux builds, and isn't necessary
since we are not embedding the Python interpreter. For more information, see:
https://pybind11.readthedocs.io/en/stable/compiling.html#findpython-mode
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D111383
These kinds of functions can behave differently on these X86 chips; there
isn't really "one true answer", so we'll accept both.
Also remove spurious passes and use mattr="avx" to match the instruction
used here.
Differential Revision: https://reviews.llvm.org/D111373
Currently Affine LICM checks iterOperands and does not hoist out any
instruction containing iterOperands. We should check iterArgs instead.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D111090
Add an interface for outlineable OpenMP operations.
This patch was initially done in fir-dev and is now needed
for the upstreaming.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D111310
* Need to investigate the proper solution to https://github.com/pybind/pybind11/issues/3336 or engineer something different.
* The attempt to produce an empty buffer_info as a workaround triggers asan/ubsan.
* Usage of this API does not arise naturally in practice yet, and it is more important to be asan/crash clean than have a solution right now.
* Switching back to raising an exception, even though that triggers terminate().
* This already half existed in terms of reading the raw buffer backing a DenseElementsAttr.
* Documented the precise expectations of the buffer layout.
* Extended the Python API to support construction from bitcasted buffers, allowing construction of all primitive element types (even those that lack a compatible representation in Python).
* Specifically, the Python API can now load all integer types at all bit widths and all floating point types (f16, f32, f64, bf16).
Differential Revision: https://reviews.llvm.org/D111284
The signature of this function was confusing. Check for hasKnownBufferizationAliasingBehavior separately when needed.
Differential Revision: https://reviews.llvm.org/D110916
It was bundling quite a lot of patterns that convert high-D
vector ops into low-D elementary ops. Not all of these patterns
may be desirable for a particular downstream
user. For example, `ShapeCastOpRewritePattern` rewrites
`vector.shape_cast` into data movement extract/insert ops.
Instead, split the entry point into multiple ones so users
can pull in patterns on demand.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D111225
Move getInplaceableOpResult() call into bufferizableInPlaceAnalysis.
Note: The only goal of this change is to make the signature of bufferizableInPlaceAnalysis smaller. (Fewer arguments.)
Differential Revision: https://reviews.llvm.org/D110915
ConstShapeOp has a constant shape, so its type can always be static.
We still allow it to have ShapeType though.
Differential Revision: https://reviews.llvm.org/D111139
Update OpDSL to support unsigned integers by adding unsigned min/max/cast signatures. Add tests in OpDSL and on the C++ side to verify the proper signed and unsigned operations are emitted.
The patch addresses an issue brought up in https://reviews.llvm.org/D111170.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D111230
Add folding rule for std.or op when an operand has all bits set.
or(x, <all bits set>) -> <all bits set>
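A hedged sketch of what such a folder looks like (a fragment of the op's `fold` hook; accessor and `APInt` helper names may differ across versions, so this is not the exact upstream code):
```
// or(x, <all bits set>) -> <all bits set>: when the rhs constant has all bits
// set, the result is simply the rhs, regardless of x.
OpFoldResult OrOp::fold(ArrayRef<Attribute> operands) {
  APInt rhsValue;
  if (matchPattern(rhs(), m_ConstantInt(&rhsValue)) && rhsValue.isAllOnes())
    return rhs();
  return {};
}
```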
Differential Revision: https://reviews.llvm.org/D111206
This was causing a subsequent assert/crash when a type converter failed to convert a block argument.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D110985
Create the Arithmetic dialect that contains basic integer and floating
point arithmetic operations. Ops that did not meet this criterion were
moved to the Math dialect.
First of two atomic patches to remove integer and floating point
operations from the standard dialect. Ops will be removed from the
standard dialect in a subsequent patch.
Reviewed By: ftynse, silvas
Differential Revision: https://reviews.llvm.org/D110200
clang-tidy: fix a redundant return statement at the end of a void method.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D111251
It's nice for users to have more information when debugging failures and
these are only triggered in a failure path.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D107676
For the type lattice, we (now) use the "less specialized or equal" partial
order, leading to the bottom representing the empty set, and the top
representing any type.
This naming is more in line with the generally used conventions, where the top
of the lattice is the full set, and the bottom of the lattice is the empty set.
A typical example is the powerset of a finite set: generally, meet would be the
intersection, and join would be the union.
```
        top: {a,b,c}
       /      |      \
   {a,b}    {a,c}    {b,c}
      |    X      X    |
     {a}    { b }    {c}
       \      |      /
      bottom:  { }
```
This is in line with the examined lattice representations in LLVM:
* lattice for `BitTracker::BitValue` in `Hexagon/BitTracker.h`
* lattice for constant propagation in `HexagonConstPropagation.cpp`
* lattice in `VarLocBasedImpl.cpp`
* lattice for address space inference code in `InferAddressSpaces.cpp`
Reviewed By: silvas, jpienaar
Differential Revision: https://reviews.llvm.org/D110766
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
Instead just emit a warning that analysis failed and the result will be treated conservatively.
Differential Revision: https://reviews.llvm.org/D111217
Implement min and max using the newly introduced std operations instead of relying on compare and select.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D111170
This fixes round-trip / ambiguity when an operation in the standard dialect would
have the same name as an operation in the default dialect.
Differential Revision: https://reviews.llvm.org/D111204
This patch extends Linalg core vectorization with support for min/max reductions
in linalg.generic ops. It enables the reduction detection for min/max combiner ops.
It also renames MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for
floating point and signed integer types. MINU/MAXU should be introduced in the future
for unsigned integer types.
Reviewed By: pifon2a, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D110854
This avoids keeping references to passes that may be freed by
the time that the pass manager has finished executing (in the
non-crash case).
Fixes PR#52069
Differential Revision: https://reviews.llvm.org/D111106
This otherwise loses a lot of debugging info and results in a painful
debugging experience.
Reviewed By: mravishankar, stellaraccident
Differential Revision: https://reviews.llvm.org/D111107
We have several ways to materialize sparse tensors (new and convert) but no explicit operation to release the underlying sparse storage scheme at runtime (other than making an explicit delSparseTensor() library call). To simplify memory management, a sparse_tensor.release operation has been introduced that lowers to the runtime library call while keeping tensors, opaque pointers, and memrefs transparent in the initial IR.
*Note* There is obviously some tension between the concept of immutable tensors and memory management methods. This tension is addressed by simply stating that after the "release" call, no further memref related operations are allowed on the tensor value. We expect the design to evolve over time, however, and arrive at a more satisfactory view of tensors and buffers eventually.
Bug:
http://llvm.org/pr52046
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111099
This allows us to generate interfaces in a namespace,
following other TableGen'erated code.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D108311
Move the generalization pattern to the other Linalg transforms to make it available to the codegen strategy.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110728
These are considered noops.
Bufferization will still fail on scf.execute_region ops that yield values.
This is used to make comprehensive bufferization interoperate better with external clients.
Differential Revision: https://reviews.llvm.org/D111130
This option is needed for passes that are known to reach a fix point, but may
need many iterations depending on the size of the input IR.
Differential Revision: https://reviews.llvm.org/D111058
This change allows better interop with external clients of comprehensive bufferization functions
but is otherwise NFC for the MLIR pass itself.
Differential Revision: https://reviews.llvm.org/D111121
This fixes some typos in OpDefinitions.md and DeclarativeRewrites.md,
and wraps function/class names in backticks.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110582
The discussion in https://reviews.llvm.org/D110425 demonstrated that "packing"
may be a confusing term to define the behavior of this op in presence of the
attribute. Instead, indicate the intended effect of preventing the folder from
being applied.
Reviewed By: nicolasvasilache, silvas
Differential Revision: https://reviews.llvm.org/D111046
This patch mainly propagates the location attribute from spv.GlobalVariable to llvm.mlir.global.
It also contains three small changes.
1. Remove the restriction on UniformConstant in SPIRVToLLVM.cpp;
2. Remove the errorCheck on relaxedPrecision when deserializing SPIR-V in Deserializer.cpp;
3. In SPIRVOps.cpp, let ConstantOp take signed integers too.
Co-authored by: Alan Liu <alanliu.yf@gmail.com> and Xinyi Liu <xyliuhelen@gmail.com>
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D110207
The new constructor relies on type-based dynamic dispatch and allows one to
construct call operations given an object representing a FuncOp or its name as
a string, as opposed to requiring an explicitly constructed attribute.
Depends On D110947
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110948
Constructing a ConstantOp using the default-generated API is verbose and
requires specifying the constant type twice: for the result type of the
operation and for the type of the attribute. It also requires explicitly
constructing the attribute. Provide custom constructors that take the type once
and accept a raw value instead of the attribute. This requires dynamic dispatch
based on type in the constructor. Also provide the corresponding accessors to
raw values.
In addition, provide a "refinement" class ConstantIndexOp similar to what
exists in C++. Unlike other "op view" Python classes, operations cannot be
automatically downcasted to this class since it does not correspond to a
specific operation name. It only exists to simplify construction of the
operation.
Depends On D110946
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110947
Provide a couple of quality-of-life usability improvements for Python bindings,
in particular:
* give access to the list of types for the list of op results or block
arguments, similarly to ValueRange->TypeRange,
* allow for constructing empty dictionary arrays,
* support construction of array attributes by concatenating an existing
attribute with a Python list of attributes.
All these are required for the upcoming customization of builtin and standard
ops.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110946
Add missing mlir-capi-*-test tool substitutions in order to fix CAPI
test failures when mlir is not installed yet.
Differential Revision: https://reviews.llvm.org/D110991
Include mlir_tools_dir in the PATH used in test environment,
as otherwise mlir-reduce is unable to find mlir-opt when building
standalone (and hence mlir_tools_dir != llvm_tools_dir).
Differential Revision: https://reviews.llvm.org/D110992
First the leak sanitizer has to be disabled, as even an empty script
leads to leak detection with Python.
Then we need to preload the ASAN runtime, as the main binary (python)
won't be linked against it. This will only work on Linux right now.
Differential Revision: https://reviews.llvm.org/D111004
NFC. Drop unnecessary use of OpBuilder in buildTripCountMapAndOperands.
Rename this to getTripCountMapAndOperands and remove stale comments.
Differential Revision: https://reviews.llvm.org/D110993
I guess this is why we should use unique_ptr as much as possible.
Also fix the InterfaceAttachmentTest.cpp test.
Differential Revision: https://reviews.llvm.org/D110984
Exposes mlir::TypeID to the C API as MlirTypeID along with various accessors
and helper functions.
Differential Revision: https://reviews.llvm.org/D110897
Tiling can create dim ops and those dim ops can take `InitTensorOp`
as input. Including it in the tiling canonicalization patterns
allows us to fold those dim ops away.
Also sorted the existing ops along the way.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D110876
The pooling ops are among the last remaining hard-coded Linalg operations that have no region attached. They became obsolete due to the OpDSL pooling operations. Removing them allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110909
Add support for dynamic shared memory for GPU launch ops: add an
optional operand to gpu.launch and gpu.launch_func ops to specify the
amount of "dynamic" shared memory to use. Update lowerings to connect
this operand to the GPU runtime.
Differential Revision: https://reviews.llvm.org/D110800
This revision exposes some minimal functionality to allow comprehensive
bufferization to interop with external projects.
Differential Revision: https://reviews.llvm.org/D110875
For convolution, the input window dimension's access affine map
is of the form `(d0 * s0 + d1)`, where `d0`/`d1` is the output/
filter window dimension, and `s0` is the stride.
When tiling, https://reviews.llvm.org/D109267 changed the way
dimensions are acquired. Instead of directly querying them using
`*.dim` ops on the original convolution op, we now get them by
applying the access affine map to the loop upper bounds. This
is fine for dimensions having single-dimension affine maps,
like matmul, but not for the convolution input. It causes
incorrect computation and out-of-bounds accesses. As a concrete example, say
we have 1x225x225x3 (NHWC) input, 3x3x3x32 (HWCF) filter, and
1x112x112x3 (NHWC) output with stride 2, (112 * 2 + 3) would be
227, which is different from the correct input window dimension
size 225.
Instead, we should first calculate the max indices for each loop,
apply the affine map to them, and then add one to get the
dimension size (here, the max output index 111 and max filter index 2 give
111 * 2 + 2 = 224, and adding one yields the correct size 225). Note this makes
no difference for matmul-like ops, given they effectively compute `d0 - 1 + 1`.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110849
* This could have been removed some time ago as it only had one op left in it, which is redundant with the new approach.
* `matmul_i8_i8_i32` (the remaining op) can be trivially replaced by `matmul`, which natively supports mixed precision.
Differential Revision: https://reviews.llvm.org/D110792
Previously, the dialect was exposed for linking and pass management purposes,
but we did not generate op classes for it. Generate them.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110819
This is an important core dialect that has not been exposed previously. Set up
the default bindings generation and provide a nicer wrapper for the `for` loop
with access to the loop configuration and body.
Depends On D110758
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110759
Without this change, these attributes can only be accessed through the generic
operation attribute dictionary provided the caller knows the special operation
attribute names used for this purpose. Add some Python wrapping to support this
use case.
Also provide access to function arguments usable inside the function along with
a couple of quality-of-life improvements in using block arguments (function
arguments being the arguments of its entry block).
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110758
The former is redundant because the latter carries it as part of
its builder. Add a getContext() helper method to DialectAsmParser
to make this more convenient, and stop passing the context around
explicitly. This simplifies ODS generated parser hooks for attrs
and types.
This resolves PR51985
Recommit 4b32f8bac4 after fixing a dependency.
Differential Revision: https://reviews.llvm.org/D110796
This is (perhaps unintuitively) where the other AsmParser method
implementations are, which means that dialects don't generally need
to depend on MLIRParser directly. This should fix a build failure
building .so files on the mlir-nvidia builder.
The former is redundant because the latter carries it as part of
its builder. Add a getContext() helper method to DialectAsmParser
to make this more convenient, and stop passing the context around
explicitly. This simplifies ODS generated parser hooks for attrs
and types.
This resolves PR51985
Differential Revision: https://reviews.llvm.org/D110796
We should have verified that the perm length and input rank were the same before
inferring the shape. This caused a crash with invalid IR.
Differential Revision: https://reviews.llvm.org/D110674
The lack of negi details leaked from the merger class into the codegen part.
Also, the special case for vector code was not needed; the type can be used directly!
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110677
Added interface implementations for AllocOp and CloneOp defined in the MemRef dialect.
Adapted the BufferDeallocation pass to be compatible with the interface introduced in this CL.
Differential Revision: https://reviews.llvm.org/D109350
This revision retires a good portion of the complexity of the codegen strategy and puts the logic behind the pass logic.
Differential Revision: https://reviews.llvm.org/D110678
We weren't retaining the ctypes closures that the ExecutionEngine was
calling back into, leading to mysterious errors.
Open to feedback about how to test this. And an extra pair of eyes to
make sure I caught all the places that need to be aware of this.
Differential Revision: https://reviews.llvm.org/D110661
The quantized int type should include I32 types, as it is the output of a quantized
convolution or matmul operation.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D110651
Unroll-and-jam currently doesn't work when the loop being unroll-and-jammed
or any of its inner loops has iter_args. This patch modifies the
unroll-and-jam utility to support loops with iter_args.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110085
Adapt the signature of the PaddingValueComputationFunction callback to either return the padding value or failure to signal padding is not desired.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110572
This integration test runs a fused and non-fused version of
sampled matrix multiplication. Both should eventually have the
same performance!
NOTE: relies on pending tensor.init fix!
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110444
This revision makes sure that when the output buffer materializes locally
(in contrast with the passing in of output tensors either in-place or not
in-place), the zero initialization assumption is preserved. This also adds
a bit more documentation on our sparse kernel assumption (viz. TACO
assumptions).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110442
Add a new fusion mode option that allows either running the default fusion kind that happens today, or performing only producer-consumer or only sibling fusion. This will also be helpful to minimize the compile-time of the fusion tests.
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D110102
The sparse constant provides a constant tensor in coordinate format. We first split the sparse constant into a constant tensor for indices and a constant tensor for values. We then generate a loop to fill a sparse tensor in coordinate format using the tensors for the indices and the values. Finally, we convert the sparse tensor in coordinate format to the destination sparse tensor format.
Add tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110373
Let the calling pass or pattern replace the uses of the original root operation. Internally, the tileAndFuse still replaces uses and updates operands but only of newly created operations.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110169
This revision adds a
```
FlatAffineValueConstraints(ValueRange ivs, ValueRange lbs, ValueRange ubs)
```
method and uses it in hoist padding.
Differential Revision: https://reviews.llvm.org/D110427
This revision extracts padding hoisting in a new file and cleans it up in prevision of future improvements and extensions.
Differential Revision: https://reviews.llvm.org/D110414
When splitting with linalg.copy, we cannot write into the destination alloc directly. Instead, write into a subview of the alloc.
Differential Revision: https://reviews.llvm.org/D110512
This patch adds functionality to FlatAffineConstraints to remove local
variables using equalities. This helps in keeping output representation of
FlatAffineConstraints smaller.
This patch is part of a series of patches aimed at generalizing affine
dependence analysis.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110056
We currently, incorrectly, assume that a range always has at least
one element when building a contiguous range. This commit adds
a proper empty check to avoid crashing.
Differential Revision: https://reviews.llvm.org/D110457
For such cases, the type of the constant DenseElementsAttr is
different from the transpose op return type.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D110446
This patch introduces a generic reduction detection utility that works
across different dialects. It is mostly a generalization of the reduction
detection algorithm in Affine. The reduction detection logic in Affine,
Linalg and SCFToOpenMP has been replaced with this new generic utility.
The utility takes some basic components of the potential reduction and
returns: 1) the reduced value, and 2) a list with the combiner operations.
The logic to match reductions involving multiple combiner operations is disabled
until we can properly test it.
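A hedged sketch of how a caller might query the utility; the exact name, header, and signature (assumed here to be `matchReduction` from SliceAnalysis.h) are assumptions based on this description:
```
#include "mlir/Analysis/SliceAnalysis.h"
#include "llvm/ADT/SmallVector.h"

using namespace mlir;

// Given the iter-carried block arguments of a loop, try to match a reduction
// rooted at position `redPos`.
static bool detectReduction(ArrayRef<BlockArgument> iterCarriedArgs,
                            unsigned redPos) {
  SmallVector<Operation *> combinerOps;
  // Returns 1) the reduced value and fills 2) the list of combiner operations.
  if (Value reduced = matchReduction(iterCarriedArgs, redPos, combinerOps)) {
    (void)reduced;
    return true;
  }
  return false;
}
```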
Reviewed By: ftynse, bondhugula, nicolasvasilache, pifon2a
Differential Revision: https://reviews.llvm.org/D110303
This has a few benefits:
* It allows for defining parsers/printer code blocks that
can be shared between operations and attribute/types.
* It removes the weird duplication of generic parser/printer hooks,
which means that newly added hooks only require touching one class.
Differential Revision: https://reviews.llvm.org/D110375
These are among the last operations still defined explicitly in C++. I've
tried to keep this commit as NFC as possible, but these ops
definitely need a non-NFC cleanup at some point.
Differential Revision: https://reviews.llvm.org/D110440
* If the input is a constant splat value, we just
need to reshape it.
* If the input is a general constant with one user,
we can also constant fold it, without bloating
the IR.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D110439
This commit updates the remaining usages of the ArrayRef<Value> based
matchAndRewrite/rewrite methods in favor of the new OpAdaptor
overload.
Differential Revision: https://reviews.llvm.org/D110360
This has been a TODO for a long time, and it brings about many advantages (namely nice accessors, and less fragile code). The existing overloads that accept ArrayRef are now treated as deprecated and will be removed in a followup (after a small grace period). Most of the upstream MLIR usages have been fixed by this commit, the rest will be handled in a followup.
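For illustration, a hedged sketch of a conversion pattern written against the new OpAdaptor overload (`MyOp` is a placeholder for a concrete op and the rewrite itself is arbitrary):
```
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

struct MyOpLowering : public OpConversionPattern<MyOp> {
  using OpConversionPattern<MyOp>::OpConversionPattern;

  LogicalResult
  matchAndRewrite(MyOp op, OpAdaptor adaptor,
                  ConversionPatternRewriter &rewriter) const override {
    // The adaptor exposes the converted operands through named accessors
    // instead of a raw ArrayRef<Value>, which is less fragile.
    rewriter.replaceOp(op, adaptor.getOperands());
    return success();
  }
};
```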
Differential Revision: https://reviews.llvm.org/D110293
Initially, the padding transformation and the related operation were only used
to guarantee static shapes of subtensors in tiled operations. The
transformation would not insert the padding operation if the shapes were
already static, and the overall code generation would actively remove such
"noop" pads. However, this transformation can be also used to pack data into
smaller tensors and marshall them into faster memory, regardless of the size
mismatches. In context of expert-driven transformation, we should assume that,
if padding is requested, a potentially padded tensor must be always created.
Update the transformation accordingly. To do this, introduce an optional
`packing` attribute to the `pad_tensor` op that serves as an indication that
the padding is an intentional choice (as opposed to side effect of type
normalization) and should be left alone by cleanups.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110425
Add support for intersecting, subtracting, complementing and checking equality of sets having divisions.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110138
This canonicalization pattern complements the tensor.cast(pad_tensor) one in
propagating constant type information when possible. It contributes to the
feasibility of pad hoisting.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110343
* Do not discard static result type information that cannot be inferred from lower/upper padding.
* Add optional argument to `PadTensorOp::inferResultType` for specifying known result dimensions.
Differential Revision: https://reviews.llvm.org/D110380
This is only noticeable when using an attribute across dialects I think.
Previously the namespace would be omitted, but it wouldn't matter as
long as the generated code stays within a single namespace.
Differential Revision: https://reviews.llvm.org/D110367
When generating code to add an element to SparseTensorCOO (e.g., when doing dense=>sparse conversion), we used to check for nonzero values on the runtime side, whereas now we generate MLIR code to do that check.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110121
The current warning message in the method `addAffineForOpDomain` of mlir/lib/Analysis/AffineStructures.cpp is printed to stdout/stderr.
This patch redirects the warning through LLVM_DEBUG, following standard LLVM practice.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D108340
clang-cl errors out while handling the templated version of tgfmt. This
patch works around the issue by explicitly choosing the non-templated
version of tgfmt, which takes an ArrayRef<std::string>.
More details in this thread:
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068936.html
Thanks @Mehdi Amini for suggesting the fix :)
Differential Revision: https://reviews.llvm.org/D110223
Enables putting types and attributes in sets and in dicts as keys.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110301
This test makes sure kernels map to efficient sparse code, i.e. all
compressed for-loops, no co-iterating while loops. In addition, this
revision removes the special constant folding inside the sparse
compiler in favor of Mahesh's new generic linalg folding. Thanks!
NOTE: relies on Mahesh's fix, which needs to be rebased first
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110001
When both a DefaultValuedAttr and a successor or variadic region were specified, this would generate an invalid C++ declaration: the parameter with a default value would be followed by the successor/region parameters, which don't have a default.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110205
The current folder of constant -> generic op only handles splat
constants. The same logic holds for scalar constants. Teach the
pattern to handle such cases.
Differential Revision: https://reviews.llvm.org/D109982
This fixes a bug where we discover new information about the arguments of an
already executable edge, but don't visit the arguments. We only visit the arguments, and not the block itself, so this commit shouldn't really affect performance at all.
Fixes PR#51871
Differential Revision: https://reviews.llvm.org/D110197
The operation should be reset to its original state when canceling the updates.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D110176
Now not just SUM, but also PRODUCT, AND, OR, XOR. The reductions
MIN and MAX are still to be done (also depends on recognizing
these operations in cmp-select constructs).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110203
If no interchange vector is given, initialize it with the identity permutation from 0 to the number of loops.
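A minimal sketch of the defaulting behavior (names are illustrative, not the actual Linalg code):
```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/Sequence.h"
#include "llvm/ADT/SmallVector.h"

// If no interchange vector is given, fall back to the identity permutation
// [0, 1, ..., numLoops - 1].
static llvm::SmallVector<int64_t, 4>
getInterchangeOrIdentity(llvm::ArrayRef<int64_t> interchangeVector,
                         unsigned numLoops) {
  if (!interchangeVector.empty())
    return llvm::to_vector<4>(interchangeVector);
  return llvm::to_vector<4>(llvm::seq<int64_t>(0, numLoops));
}
```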
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D110249
This revision removes the ad-hoc MemRefs that were needed using the old
ABI (when we still passed by value) and replaces them with the shared
StridedMemRef definitions of CRunnerUtils (possible now that we pass by
pointer). This avoids code duplication and makes sure we have a consistent
view of strided memory references in all our support libraries.
Reviewed By: jsetoain
Differential Revision: https://reviews.llvm.org/D110221
This change adds automatic wrapper functions with emit_c_interface
to all methods in the sparse support library that deal with MEMREFs.
The wrappers will take care of passing MEMREFs by value internally
and by pointer externally, thereby avoiding ABI issues across platforms.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110219
DialectAsmParser has a `parseAttribute` member that takes a
contextual type, but DialectAsmPrinter doesn't have the corresponding
member to take advantage of it. As such, custom attribute
implementations can't really use it. This adds the obvious missing
method which fills this hole.
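For illustration, a hedged sketch of a dialect attribute printer using the new member (the attribute class, its `getValue()` accessor, and the mnemonic are placeholders):
```
#include "mlir/IR/DialectImplementation.h"

using namespace mlir;

void MyDialect::printAttribute(Attribute attr,
                               DialectAsmPrinter &printer) const {
  if (auto dim = attr.dyn_cast<MyDimAttr>()) {
    printer << "dim<";
    // The wrapped value's type is implied by the enclosing attribute, so it
    // can be elided here, mirroring parseAttribute(value, type) on the parser
    // side.
    printer.printAttributeWithoutType(dim.getValue());
    printer << ">";
  }
}
```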
Differential Revision: https://reviews.llvm.org/D110211
Previously, the translation to LLVM IR would emit IR that directly uses
a scope metadata node in case only one scope was in use in alias.scopes
or noalias metadata. It should always be a list of scopes. The verifier
change in 8700f2bd36 enforced this and
broke the test. Fix the translation to always create a list of scopes
using a new metadata node, update and reenable the respective test.
Fixes PR51919.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110140
Add a PointerProxy similar to the existing iterator_facade_base::ReferenceProxy and return it from the arrow operator. This prevents iterator facades with a reference type that is not a true reference to take the address of a temporary.
Forward the reference type of the mapped_iterator to the iterator adaptor which in turn forwards it to the iterator facade. This fixes mlir::op_iterator::operator->() to take the address of a temporary.
Make some polishing changes to op_iterator and op_filter_iterator.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D109490
Compute the tiled producer slice dimensions directly starting from the consumer not using the producer at all.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110147
Add a helper method to check if an index vector contains a permutation of its indices. Additionally, refactor applyPermutationToVector to take int64_t.
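For reference, a self-contained sketch of the check such a helper performs (an illustration, not the actual Linalg utility):
```
#include <cstdint>
#include <vector>

// An index vector of size n is a permutation iff it contains every value in
// [0, n) exactly once.
static bool isPermutationVector(const std::vector<int64_t> &indices) {
  std::vector<bool> seen(indices.size(), false);
  for (int64_t index : indices) {
    if (index < 0 || index >= static_cast<int64_t>(indices.size()) ||
        seen[index])
      return false;
    seen[index] = true;
  }
  return true;
}
```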
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110135
It was previously assumed that tensor.insert_slice should be bufferized first in a greedy fashion to avoid out-of-place bufferization of the large tensor. This heuristic does not hold upon further inspection.
This CL removes the special handling of such ops and adds a test that exhibits better behavior and appears in real use cases.
The only test adversely affected is an artificial test which results in a returned memref: this pattern is not allowed by comprehensive bufferization in real scenarios anyway and the offending test is deleted.
Differential Revision: https://reviews.llvm.org/D110072
Previously, comprehensive bufferize would consider all aliasing reads and writes to
the result buffer and matching operand. This resulted in spurious dependences
being considered and resulted in too many unnecessary copies.
Instead, this revision revisits the gathering of read and write alias sets.
This results in fewer alloc and copies.
An exhaustive set of test cases is added that considers all possible permutations of
`matmul(extract_slice(fill), extract_slice(fill), ...)`.
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and the induction variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.
This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).
Differential Revision: https://reviews.llvm.org/D108454
This patch adds mergeLocalIds and mergeSymbolIds as public functions
for FlatAffineConstraints and FlatAffineValueConstraints respectively.
mergeLocalIds is also required to support divisions in intersection,
subtraction, equality checks, and complement for PresburgerSet.
This patch is part of a series of patches aimed at generalizing affine
dependence analysis.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110045
Lots of custom ops have hand-rolled comma-delimited parsing loops, as does
the MLIR parser itself. This provides a standard interface for doing this that
is less error-prone and involves less boilerplate.
While here, extend Delimiter to support <> and {} delimited sequences as
well (I have a use for <> in CIRCT specifically).
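As an illustration, a hedged sketch of how a custom op parser might use the new hook to parse a `<1, 2, 3>` style clause (the helper and clause are hypothetical, not part of this change):
```
#include "mlir/IR/OpImplementation.h"

using namespace mlir;

// Parses a `<` integer-list `>` clause such as `<1, 2, 3>` into `sizes`.
static ParseResult parseSizeList(OpAsmParser &parser,
                                 SmallVectorImpl<int64_t> &sizes) {
  return parser.parseCommaSeparatedList(
      OpAsmParser::Delimiter::LessGreater, [&]() -> ParseResult {
        int64_t size;
        if (parser.parseInteger(size))
          return failure();
        sizes.push_back(size);
        return success();
      });
}
```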
Differential Revision: https://reviews.llvm.org/D110122