Using callbacks for allocation/deallocation allows users to override
the default.
Also add an option to comprehensive bufferization pass to use `alloca`
instead of `alloc`s. Note that this option is just for testing. The
option to use `alloca` does not work well with the option to allow for
returning memrefs.
Using callbacks for allocation/deallocation allows users to override
the default.
Also add an option to comprehensive bufferization pass to use `alloca`
instead of `alloc`s. Note that this option is just for testing. The
option to use `alloca` does not work well with the option to allow for
returning memrefs.
Differential Revision: https://reviews.llvm.org/D112166
Introduce the initial support for operation interfaces in C API and Python
bindings. Interfaces are a key component of MLIR's extensibility and should be
available in bindings to make use of full potential of MLIR.
This initial implementation exposes InferTypeOpInterface all the way to the
Python bindings since it can be later used to simplify the operation
construction methods by inferring their return types instead of requiring the
user to do so. The general infrastructure for binding interfaces is defined and
InferTypeOpInterface can be used as an example for binding other interfaces.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D111656
This patch adds a polynomial approximation that matches the
approximation in Eigen.
Note that the approximation only applies to vectorized inputs;
the scalar rsqrt is left unmodified.
The approximation is protected with a flag since it emits an AVX2
intrinsic (generated via the X86Vector). This is the only reasonably
clean way that I could find to generate the exact approximation that
I wanted (i.e. an identical one to Eigen's).
I considered two alternatives:
1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet.
I believe this is because there is no definition of Rsqrt that
all backends could agree on, since hardware instructions that
implement it have widely varying degrees of precision.
This is something that the standard could mandate, but Rsqrt is
not part of IEEE754, so I don't think this option is feasible.
2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal
transformations. Although portable, this doesn't allow us
to generate exactly the code we want; it is the LLVM backend,
and not MLIR, who controls what code is generated based on the
target CPU.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D112192
This is a temporary fix, better would be to avoid including
llvm/Option/ArgList.h from a Support source file.
Differential Revision: https://reviews.llvm.org/D111974
After the TargetRegistry.h move, nothing in Support includes headers
from MC. However, files in tablegen use MC headers, so we must add an
entry for them in tblgen srcs.
Differential Revision: https://reviews.llvm.org/D111835
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
Create the Arithmetic dialect that contains basic integer and floating
point arithmetic operations. Ops that did not meet this criterion were
moved to the Math dialect.
First of two atomic patches to remove integer and floating point
operations from the standard dialect. Ops will be removed from the
standard dialect in a subsequent patch.
Reviewed By: ftynse, silvas
Differential Revision: https://reviews.llvm.org/D110200
This revision exposes some minimal funcitonality to allow comprehensive
bufferization to interop with external projects.
Differential Revision: https://reviews.llvm.org/D110875
* This could have been removed some time ago as it only had one op left in it, which is redundant with the new approach.
* `matmul_i8_i8_i32` (the remaining op) can be trivially replaced by `matmul`, which natively supports mixed precision.
Differential Revision: https://reviews.llvm.org/D110792
This revision extracts padding hoisting in a new file and cleans it up in prevision of future improvements and extensions.
Differential Revision: https://reviews.llvm.org/D110414
This patch introduces a generic reduction detection utility that works
across different dialecs. It is mostly a generalization of the reduction
detection algorithm in Affine. The reduction detection logic in Affine,
Linalg and SCFToOpenMP have been replaced with this new generic utility.
The utility takes some basic components of the potential reduction and
returns: 1) the reduced value, and 2) a list with the combiner operations.
The logic to match reductions involving multiple combiner operations disabled
until we can properly test it.
Reviewed By: ftynse, bondhugula, nicolasvasilache, pifon2a
Differential Revision: https://reviews.llvm.org/D110303
In attempting to build JAX on Apple Silicon, we discovered an issue with
the bazel configuration in llvm-project-overlay. This patch fixes the
logic, at least when building JAX. More context is included on the
following GitHub issue: https://github.com/google/jax/issues/5501
Differential Revision: https://reviews.llvm.org/D109839
This revision refactors ElementsAttr into an Attribute Interface.
This enables a common interface with which to interact with
element attributes, without needing to modify the builtin
dialect. It also removes a majority (if not all?) of the need for
the current OpaqueElementsAttr, which was originally intended as
a way to opaquely represent data that was not representable by
the other builtin constructs.
The new ElementsAttr interface not only allows for users to
natively represent their data in the way that best suits them,
it also allows for efficient opaque access and iteration of the
underlying data. Attributes using the ElementsAttr interface
can directly expose support for interacting with the held
elements using any C++ data type they claim to support. For
example, DenseIntOrFpElementsAttr supports iteration using
various native C++ integer/float data types, as well as
APInt/APFloat, and more. ElementsAttr instances that refer to
DenseIntOrFpElementsAttr can use all of these data types for
iteration:
```c++
DenseIntOrFpElementsAttr intElementsAttr = ...;
ElementsAttr attr = intElementsAttr;
for (uint64_t value : attr.getValues<uint64_t>())
...;
for (APInt value : attr.getValues<APInt>())
...;
for (IntegerAttr value : attr.getValues<IntegerAttr>())
...;
```
ElementsAttr also supports failable range/iterator access,
allowing for selective code paths depending on data type
support:
```c++
ElementsAttr attr = ...;
if (auto range = attr.tryGetValues<uint64_t>()) {
for (uint64_t value : *range)
...;
}
```
Differential Revision: https://reviews.llvm.org/D109190
Add an interface that allows grouping together all covolution and
pooling ops within Linalg named ops. The interface currently
- the indexing map used for input/image access is valid
- the filter and output are accessed using projected permutations
- that all loops are charecterizable as one iterating over
- batch dimension,
- output image dimensions,
- filter convolved dimensions,
- output channel dimensions,
- input channel dimensions,
- depth multiplier (for depthwise convolutions)
Differential Revision: https://reviews.llvm.org/D109793
Presently, definitions default to those for Linux which are not defined for FreeBSD (HAVE_LSEEK64, HAVE_MALLINFO, etc.). Patch sets os_defines to posix definitions under FreeBSD.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D109913
This reduces the maintenance burden by using globs, which is the
tradeoff we make in the other LLVM Bazel build files as well.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D109720
Conversion to the LLVM dialect is being refactored to be more progressive and
is now performed as a series of independent passes converting different
dialects. These passes may produce `unrealized_conversion_cast` operations that
represent pending conversions between built-in and LLVM dialect types.
Historically, a more monolithic Standard-to-LLVM conversion pass did not need
these casts as all operations were converted in one shot. Previous refactorings
have led to the requirement of running the Standard-to-LLVM conversion pass to
clean up `unrealized_conversion_cast`s even though the IR had no standard
operations in it. The pass must have been also run the last among all to-LLVM
passes, in contradiction with the partial conversion logic. Additionally, the
way it was set up could produce invalid operations by removing casts between
LLVM and built-in types even when the consumer did not accept the uncasted
type, or could lead to cryptic conversion errors (recursive application of the
rewrite pattern on `unrealized_conversion_cast` as a means to indicate failure
to eliminate casts).
In fact, the need to eliminate A->B->A `unrealized_conversion_cast`s is not
specific to to-LLVM conversions and can be factored out into a separate type
reconciliation pass, which is achieved in this commit. While the cast operation
itself has a folder pattern, it is insufficient in most conversion passes as
the folder only applies to the second cast. Without complex legality setup in
the conversion target, the conversion infra will either consider the cast
operations valid and not fold them (a separate canonicalization would be
necessary to trigger the folding), or consider the first cast invalid upon
generation and stop with error. The pattern provided by the reconciliation pass
applies to the first cast operation instead. Furthermore, having a separate
pass makes it clear when `unrealized_conversion_cast`s could not have been
eliminated since it is the only reason why this pass can fail.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D109507
OpenMP reductions need a neutral element, so we match some known reduction
kinds (integer add/mul/or/and/xor, float add/mul, integer and float min/max) to
define the neutral element and the atomic version when possible to express
using atomicrmw (everything except float mul). The SCF-to-OpenMP pass becomes a
module pass because it now needs to introduce new symbols for reduction
declarations in the module.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D107549
An interface to allow for tiling of operations is introduced. The
tiling of the linalg.pad_tensor operation is modified to use this
interface.
Differential Revision: https://reviews.llvm.org/D108611
* Add `DimOfIterArgFolder`.
* Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`.
* Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`.
* Expand documentaton of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.)
Differential Revision: https://reviews.llvm.org/D108806
This canonicalization simplifies affine.min operations inside "for loop"-like operations (e.g., scf.for and scf.parallel) based on two invariants:
* iv >= lb
* iv < lb + step * ((ub - lb - 1) floorDiv step) + 1
This commit adds a new pass `canonicalize-scf-affine-min` (instead of being a canonicalization pattern) to avoid dependencies between the Affine dialect and the SCF dialect.
Differential Revision: https://reviews.llvm.org/D107731
There's a lot of unnecessary backslashes here that we can avoid to
reduce confusion.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D108495
Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores.
Differential Revision: https://reviews.llvm.org/D107733
Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.
affine.min ops such as:
```
map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #affine.min #map(%iv)[%step, %ub]
```
are rewritten them into (in the case the peeled loop):
```
%r = %step
```
To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.
Differential Revision: https://reviews.llvm.org/D107222
Only require one intermediate repository instead of two.
Fewer parameters in llvm_config.
Second attempt of https://reviews.llvm.org/D107714, this time also updating `third_party_build` and `deps_impl` paths.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D108274