The bufferization of arith.constant ops is also switched over to BufferizableOpInterface-based bufferization. The old implementation is deleted. Both implementations utilize GlobalCreator, now renamed to just `getGlobalFor`.
GlobalCreator no longer maintains a set of all created allocations to avoid duplicate allocations of the same constant. Instead, `getGlobalFor` scans the module to see if there is already a global allocation with the same constant value.
For compatibility reasons, it is still possible to create a pass that bufferizes only `arith.constant`. This pass (createConstantBufferizePass) can be deleted once all users have switched over to One-Shot bufferization.
Differential Revision: https://reviews.llvm.org/D118483
llvm/Support/Path.h was likely previously implicitly included, and a
refactoring removed that inclusion, breaking the pass.
Differential Revision: https://reviews.llvm.org/D118508
This patch adds the vector.scan op which computes the
scan for a given n-d vector. It requires specifying the operator,
the identity element and whether the scan is inclusive or
exclusive.
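As a quick illustration of the inclusive/exclusive distinction, here is a plain C++ sketch (not the MLIR implementation; the helper name and the use of addition as the operator are just for exposition):
```
#include <cstddef>
#include <vector>

// Sketch of a 1-D scan with an "add" operator and identity 0. An inclusive
// scan includes the current element in each partial result; an exclusive scan
// emits the identity first and combines only the preceding elements.
std::vector<int> scan1D(const std::vector<int> &input, bool inclusive) {
  std::vector<int> result(input.size());
  int acc = 0; // identity element for addition
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (inclusive) {
      acc += input[i];
      result[i] = acc;
    } else {
      result[i] = acc;
      acc += input[i];
    }
  }
  return result;
}
// For input {1, 2, 3}: inclusive -> {1, 3, 6}, exclusive -> {0, 1, 3}.
```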
TEST: Added test in ops.mlir
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D117171
This patch introduces a class LexSimplex that can currently be used to find the
lexicographically minimal rational point in an IntegerPolyhedron. This is part of a
series of patches leading to computing the lexicographically minimal integer
lattice point as well as parametric lexicographic minimization.
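As a small illustration (not taken from the patch) of what a lexicographic minimum is, with points ordered by x first and then y:
```
P = \{\, (x, y) \mid x + y \ge 2,\; 0 \le x \le 3,\; 0 \le y \le 3 \,\}, \qquad
\operatorname{lexmin}(P) = (0, 2)
```
Here x is minimized first (x = 0), then y is minimized subject to x = 0 (y = 2).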
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117437
With opaque pointers, we can no longer derive this from the pointer
type, so we need to explicitly provide the element type the atomic
operation should work with.
Differential Revision: https://reviews.llvm.org/D118359
This is in preparation of switching `-tensor-constant-bufferize` and `-arith-bufferize` to BufferizableOpInterface-based implementations.
Differential Revision: https://reviews.llvm.org/D118324
This commit switches the `tensor-bufferize` pass over to BufferizableOpInterface-based bufferization.
Differential Revision: https://reviews.llvm.org/D118246
The pass currently cannot handle to_memref(to_tensor(x)) foldings where a cast is necessary, which is required with the new unified bufferization. There is already a canonicalization pattern that handles such foldings, and it should be used during this pass.
Differential Revision: https://reviews.llvm.org/D117988
Stop the Cpp target from emitting unused labels. The previously generated
code produced warnings if `-Wunused-label` was passed to the compiler.
Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D118154
OwningRewritePatternList has been deprecated for ~10 months now; we can remove
the leftover using directives at this point.
Differential Revision: https://reviews.llvm.org/D118287
These transformations already operate on memref operations (as part of
splitting up the standard dialect). Now that the operations have moved,
it's time for these transformations to move as well.
Differential Revision: https://reviews.llvm.org/D118285
We already convert to BitVector internally, and other APIs (namely Operation::eraseOperands)
already use BitVector as well. Switching over provides a common format between
APIs and also reduces the number of format conversions necessary.
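A minimal sketch of the kind of usage this enables (hypothetical helper; the exact `eraseOperands` overload is assumed to accept a BitVector, per the description above):
```
#include "llvm/ADT/BitVector.h"
#include "mlir/IR/Operation.h"

// Hypothetical helper: collect the operand indices to drop in a BitVector and
// erase them in one call, instead of passing a list of indices.
static void eraseEvenOperands(mlir::Operation *op) {
  llvm::BitVector toErase(op->getNumOperands());
  for (unsigned i = 0, e = op->getNumOperands(); i < e; i += 2)
    toErase.set(i);
  op->eraseOperands(toErase); // bulk erase keyed by the BitVector
}
```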
Fixes #53325
Differential Revision: https://reviews.llvm.org/D118083
The option parser currently does not properly handle nested options, meaning that
in some cases we can print pass pipelines that we can't actually parse back in. For example,
from #52885 we currently can't parse in inliner pipelines that have nested pipeline strings.
This commit adds handling for string values (e.g. "...") and nested options
(e.g. `foo{baz{bar=10 pizza=11}}`).
Fixes #52885
Differential Revision: https://reviews.llvm.org/D118078
This is part of splitting up the standard dialect. The move makes sense anyway,
given that the memref dialect already holds memref.atomic_rmw, which is the non-region
sibling operation of std.generic_atomic_rmw (the relationship is even clearer given
they have nearly the same description, modulo how they represent the inner computation).
Differential Revision: https://reviews.llvm.org/D118209
The GPU dialect currently contains an explicit reference to LLVMFuncOp
during verification to handle the situation where the kernel has already been
converted. This commit changes that reference to instead use FunctionOpInterface,
which has two main benefits:
* It allows for removing an otherwise unnecessary dependency on the LLVM dialect
* It removes hardcoded assumptions about the lowering path and use of the GPU dialect
Differential Revision: https://reviews.llvm.org/D118172
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
This is for compatibility with existing bufferization passes. Also clean up memref type generation a bit.
Differential Revision: https://reviews.llvm.org/D118243
This revision adds enough support to allow InlineAsmOp to work properly with indirect memory constraints "*m".
These require an explicit "elementtype" TypeAttr on the operands to pass LLVM verification.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D118006
If we are extracting, it is more useful to push the index_cast past the
extraction. This increases the chance that the tensor.extract can be evaluated at
compile time.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118204
There is not much of a benefit to reshaping a from_elements vs. reloading it.
Updated to propagate shape manipulations into the output type of
tensor.from_elements.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D118201
Fusion of reshape ops by linearization incorrectly inverted the
indexing map before linearizing dimensions. This leads to incorrect
indexing maps used in the fused operation.
Differential Revision: https://reviews.llvm.org/D117908
This pattern is not written to handle operations with `linalg.index`
operations in their bodies, i.e. operations that have index semantics.
Differential Revision: https://reviews.llvm.org/D117856
This is mostly a copy of the existing tensor.from_elements bufferization. Once TensorInterfaceImpl.cpp is moved to the tensor dialect, the existing rewrite pattern can be deleted.
Differential Revision: https://reviews.llvm.org/D117775
This is mostly a copy of the existing tensor.generate bufferization. Once TensorInterfaceImpl.cpp is moved to the tensor dialect, the existing rewrite pattern can be deleted.
Differential Revision: https://reviews.llvm.org/D117770
This patch supports the atomic construct (capture) following section 2.17.7 of the OpenMP 5.0 standard. Also added tests for the same.
Reviewed By: peixin, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D115851
This is too important a debugging option to stay hidden. Also,
improve its command-line description; the current description gives no hint
that this is the option to use to have locations printed inline.
Out-of-line locations are also unproductive to work with in many cases
where the locations are actually compact, which is also why this option
should be more visible. This revision doesn't change its default,
though.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D117186
A lot of dialects have dependencies that are unnecessary, either because of copy/paste
of files when creating things or some other means. This commit cleans up a bunch of
the simple ones:
* Copy/Paste or missed during refactoring
Most of the dependencies cleaned up here look like copy/paste errors when creating
new dialects/transformations, or because the dependency wasn't removed during a
refactoring (e.g. when splitting the standard dialect).
* Unnecessary hardcoding of constant operations in matchers
There are a few instances where a dialect had a dependency because it
was hardcoding checks for constant operations instead of using the better m_Constant
approach (see the sketch below).
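A minimal sketch of the matcher-based check (hypothetical helper name; `m_Constant` binds the matched constant's value attribute):
```
#include "mlir/IR/Matchers.h"
#include "mlir/IR/Value.h"

// Hypothetical helper: dialect-agnostic constant check via m_Constant, instead
// of hardcoding isa<> checks against a specific dialect's constant op.
static bool isDefinedByConstant(mlir::Value value, mlir::Attribute &constValue) {
  return mlir::matchPattern(value, mlir::m_Constant(&constValue));
}
```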
Differential Revision: https://reviews.llvm.org/D118062
This has been a major TODO for a very long time, and is necessary for establishing a proper
dialect-free dependency layering for the Transforms library. Code was moved to effectively
two main locations:
* Affine/
There was quite a bit of affine-dialect-related code in Transforms/ due to historical reasons
(dating back to the early days of MLIR). The following headers were moved:
Transforms/LoopFusionUtils.h -> Dialect/Affine/LoopFusionUtils.h
Transforms/LoopUtils.h -> Dialect/Affine/LoopUtils.h
Transforms/Utils.h -> Dialect/Affine/Utils.h
The following transforms were also moved:
AffineLoopFusion, AffinePipelineDataTransfer, LoopCoalescing
* SCF/
Only one SCF pass was in Transforms/ (likely accidentally placed here): ParallelLoopCollapsing
The SCF specific utilities in LoopUtils have been moved to SCF/Utils.h
* Misc:
mlir::moveLoopInvariantCode was also moved to LoopLikeInterface.h given
that it is a simple utility defined in terms of LoopLikeOpInterface.
Differential Revision: https://reviews.llvm.org/D117848
Transforms/ should only contain transformations that are dialect-independent and
this pass interacts with MemRef operations (making it a better fit for living in that
dialect).
Differential Revision: https://reviews.llvm.org/D117841
Transforms/ should only contain dialect-independent transformations,
and these files are a much better fit for the bufferization dialect anyways.
Differential Revision: https://reviews.llvm.org/D117839
The current lowering from GPU to NVVM does
not correctly handle the following cases when
lowering the gpu shuffle op.
1. When the active width is set to 32 (all lanes), the current approach
computes (1 << 32) - 1, which results in poison values in the LLVM IR. We fix
this by defining the active mask as (-1) >> (32 - width).
2. In the case of shuffle up, the computation of the third
operand c has to be different from the other 3 modes due to
the op definition in the ISA reference.
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html)
Specifically, the predicate value is computed as j >= maxLane
for up and j <= maxLane for all other modes. We fix this by
computing maskAndClamp as 32 - width for this mode (see the sketch below).
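Illustrative sketch of the corrected computations described above (plain C++ for exposition; the pass emits the equivalent NVVM/LLVM IR rather than calling such functions):
```
#include <cstdint>

// Active mask for a shuffle over `width` lanes (1 <= width <= 32).
// (1u << width) - 1 would be undefined/poison for width == 32;
// ~0u >> (32 - width) also handles the full-warp case correctly.
static uint32_t activeMask(uint32_t width) {
  return ~0u >> (32 - width);
}

// Third operand for the "up" shuffle mode, per the fix described above.
static uint32_t maskAndClampForUp(uint32_t width) {
  return 32 - width;
}
```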
TEST: We modify the existing test and add more checks for the up mode.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D118086
Adding a similar decomposition for exponential minus one to the SPIRV
backends along with the necessary tests.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D118081
Control-Flow Sink moves operations whose only uses are in conditionally-executed regions into those regions so that paths in which their results are not needed do not perform unnecessary computation.
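A conceptual before/after of what sinking achieves, in plain C++ rather than MLIR (the helper is a hypothetical stand-in):
```
int heavyComputation(int); // stand-in for some side-effect-free computation

// Before sinking: the value is computed on every path, even when the
// condition is false and the result is unused.
int before(bool cond, int a) {
  int expensive = heavyComputation(a);
  return cond ? expensive : 0;
}

// After sinking: the computation only runs on the path that uses it.
int after(bool cond, int a) {
  if (cond)
    return heavyComputation(a);
  return 0;
}
```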
Depends on D115087
Reviewed By: jpienaar, rriddle, bondhugula
Differential Revision: https://reviews.llvm.org/D115088
Add a transpose option to hoist padding to transpose the padded tensor before storing it into the packed tensor. The early transpose improves the memory access patterns of the actual compute kernel. The patch introduces a transpose right after the hoisted pad tensor and a second transpose inside the compute loop. The second transpose can either be fused into the compute operation or will canonicalize away when lowering to vector instructions.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D117893
No longer go through an external model. Also put BufferizableOpInterface into the same build target as the BufferizationDialect. This allows for some code reuse between BufferizationOps canonicalizers and BufferizableOpInterface implementations.
Differential Revision: https://reviews.llvm.org/D117987
Both insertion points are valid. This is to make BufferizableOpInterface-based bufferization compatible with existing partial bufferization test cases. (So less changes are necessary to unit tests.)
Differential Revision: https://reviews.llvm.org/D117986
This is the only op that is not supported via BufferizableOpInterfaceImpl bufferization. Once this op is supported we can switch `tensor-bufferize` over to the new unified bufferization.
Differential Revision: https://reviews.llvm.org/D117985
This is in preparation of unifying the existing bufferization with One-Shot bufferization.
A subsequent commit will replace `tensor-bufferize`'s implementation with the BufferizableOpInterface-based implementation and move over missing test cases.
Differential Revision: https://reviews.llvm.org/D117984
Only using that change in StringRef already decreases the number of
preprocessed lines from 7837621 to 7776151 for LLVMSupport.
Perhaps more interestingly, it shows that many files were relying on the
inclusion of StringRef.h to have the declaration from STLExtras.h. This
patch tries hard to patch relevant part of llvm-project impacted by this
hidden dependency removal.
Potential impact:
- "llvm/ADT/StringRef.h" no longer includes <memory>,
"llvm/ADT/Optional.h" nor "llvm/ADT/STLExtras.h"
Related Discourse thread:
https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
This patch moves merging of duplicate divisions to presburger utility
functions. This is required to support division merging in structures other
than IntegerPolyhedron.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D118001
Pass a ValueRange instead of an ArrayRef<Value> for better compatibility. Also provide an additional function overload that automatically deallocates the buffer if specified.
Differential Revision: https://reviews.llvm.org/D118025
This patch changes names of identifiers and their corresponding getters in
PresburgerSet to match those of IntegerPolyhedron.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D117998
When the coefficients of the dividend are negative, the gcd may be negative,
which will change the sign of the dividend and overflow the denominator.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117911
When two clamp ops are in a row, they can be canonicalized into a single clamp
that uses the most constrained range.
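The folding rule in plain C++ terms (a sketch, assuming the two ranges overlap):
```
#include <algorithm>

// clamp(clamp(x, lo1, hi1), lo2, hi2) == clamp(x, max(lo1, lo2), min(hi1, hi2))
// whenever the two ranges intersect, so two back-to-back clamps collapse into
// one clamp over the most constrained range.
int foldedClamp(int x, int lo1, int hi1, int lo2, int hi2) {
  int lo = std::max(lo1, lo2);
  int hi = std::min(hi1, hi2);
  return std::clamp(x, lo, hi);
}
```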
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D117934
Rationale:
Although file I/O is a bit alien to MLIR itself, we provide two convenient ways
for sparse tensor I/O. The input part was already there (behind the swiss army
knife sparse_tensor.new). Now we have a sparse_tensor.out to write out data. As
before, the ops are kept vague and may change in the future. For now this
allows us to compare TACO vs MLIR very easily.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D117850
Implement a Taylor series approximation for atan and add an atan2 lowering
that uses atan's approximation. This includes tests for edge cases and tests
for each quadrant.
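For reference, the quadrant handling such a lowering needs (a C++ sketch using the exact std::atan in place of the patch's polynomial approximation):
```
#include <cmath>

// atan2(y, x) reconstructed from atan(y/x) plus per-quadrant corrections.
double atan2FromAtan(double y, double x) {
  const double pi = 3.14159265358979323846;
  if (x > 0.0)
    return std::atan(y / x);                 // quadrants I and IV
  if (x < 0.0)
    return y >= 0.0 ? std::atan(y / x) + pi  // quadrant II
                    : std::atan(y / x) - pi; // quadrant III
  return y > 0.0 ? pi / 2 : (y < 0.0 ? -pi / 2 : 0.0); // x == 0
}
```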
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D115682
The cleanup was manual, but assisted by "include-what-you-use". It consists of:
1. Removing unused forward declarations. No impact expected.
2. Removing unused headers in .cpp files. No impact expected.
3. Removing unused headers in .h files. This removes implicit dependencies and
is generally considered a good thing, but this may break downstream builds.
I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the
modification in the second part of the commit.
4. Replacing header inclusion by forward declaration. This has the same impact
as 3.
Notable changes:
- llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h
- llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h
- llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h
- llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h
You may need to add some of these headers to your compilation units, if need be.
As a hint to the impact of the cleanup, running
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 8000919 lines
after: 7917500 lines
Reduced dependencies also help incremental rebuilds and are more ccache
friendly, something not shown by the above metric :-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
In some cases, fusion can produce illegal operations if after fusion
the range of some of the loops cannot be computed from the shapes of its
operands. Check for this case and abort the fusion if this happens.
Differential Revision: https://reviews.llvm.org/D117602
This is superseded by the same method on OpAsmOpInterface, which is
available on the Dialect through the Fallback mechanism.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117750
This extends dense attribute element access to support 8b and 16b ints.
Also extends the corresponding parts of the C api.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D117731
This allows to pipe sequences of `mlir-opt -split-input-file | mlir-opt -split-input-file`.
Depends On D117750
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117756
A right shift by 32 bits can occur on a 32-bit value. This is undefined behavior.
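For context, the C++ rule being tripped over (a minimal sketch; the guard shown is illustrative, not necessarily the exact fix):
```
#include <cstdint>

// Shifting a 32-bit value by 32 or more bits is undefined behavior in C++,
// so the shift amount must be guarded.
uint32_t guardedShiftRight(uint32_t value, uint32_t amount) {
  return amount >= 32 ? 0 : value >> amount;
}
```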
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117732
- Set the DEBUG_TYPE of SerializeToBlob to serialize-to-blob
- Add debug output to print the assembly or PTX for GPU modules before
they are assembled and linked
Note that, as SerializeToBlob is a superclass of SerializeToCubin and
SerializeToHsaco, --debug-only=serialize-to-blob will dump the
intermediate compiler result for both of these passes.
In addition, if LLVM options such as --stop-after are used to control
the GPU kernel compilation process, the debug output will contain the
appropriate intermediate IR.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D117519
PDLDialect, being a somewhat user-facing dialect whose ops contain exclusively other PDL ops in their regions, can take advantage of `OpAsmOpInterface` to provide nicer IR.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117828
This commit explicitly states that negative values and values exceeding
vector dimensions are allowed in vector.create_mask (but not in
vector.constant_mask). These values are now truncated when
canonicalizing vector.create_mask to vector.constant_mask.
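A sketch of the truncation per dimension (assuming, as the description suggests, that each constant mask size is clamped into [0, dim size]):
```
#include <algorithm>
#include <cstdint>

// Negative sizes become 0; sizes larger than the vector dimension become the
// dimension size, so the result is always a valid vector.constant_mask size.
int64_t truncatedMaskSize(int64_t requested, int64_t dimSize) {
  return std::clamp<int64_t>(requested, 0, dimSize);
}
```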
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D116069
When a Builder method accepts multiple InsertPoints and both point to
the same position, inserting instructions at one position will "move" the
other after the inserted position, since the InsertPoint is pegged to the
instruction following the intended InsertPoint. For instance, when
creating a parallel region at Loc and passing the same position as AllocaIP,
creating instructions at Loc will "move" the AllocaIP behind the Loc
position.
To avoid this ambiguity, add an assertion checking this condition and
fix the unittests.
In the case of AllocaIP, an alternative solution could be to implicitly
split the BasicBlock at the InsertPoint, using the first part as AllocaIP and the second
for inserting the instructions themselves. However, this solution is
specific to AllocaIP since the AllocaIP will always have to come first. Hence,
this is an argument for generally handling ambiguous InsertPoints as an API
usage error.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117226
When computing the new type of a collapse_shape operation, we need to at least
take into account whether the type has an identity layout, in which case we can
easily support dynamic strides. Otherwise, the canonicalizer creates invalid
IR.
Longer term, both the verifier and the canonicalizer need to be extended to
support the general case.
Differential Revision: https://reviews.llvm.org/D117772
Earlier, `computeSingleVarRepr` returned a pair of upper-bound and
lower-bound indices of the inequality constraints that can be expressed
as a floordiv of an affine function. An equality can also be
expressed as a floordiv but involves only one constraint index, hence the `LocalRepr`
class is introduced to facilitate this.
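As an illustration (not from the patch), q = floor(x/3) can be encoded either by a pair of inequalities or by a single equality:
```
0 \le x - 3q \le 2 \;\Longleftrightarrow\; q = \left\lfloor \tfrac{x}{3} \right\rfloor,
\qquad\text{whereas}\qquad
x - 3q = 0 \;\Rightarrow\; q = \tfrac{x}{3} = \left\lfloor \tfrac{x}{3} \right\rfloor .
```
The inequality encoding yields the pair of bound indices mentioned above, while the equality encoding involves only a single constraint, which is what `LocalRepr` is introduced to capture.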
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117430
This commit is the first step towards unifying core bufferization and One-Shot Bufferize.
This commit does not move over the implementations of BufferizableOpInterface yet. This will be done in separate commits. This change does also not move the unit tests yet. The tests will be moved together with op interface implementations and split into separate files.
Differential Revision: https://reviews.llvm.org/D117641
BlockArguments gained the ability to have locations attached a while ago, but they
have always been optional. This goes against the core tenet of MLIR that location
information is a requirement, so this commit updates the API to require locations.
Fixes #53279
Differential Revision: https://reviews.llvm.org/D117633
Previously the optional locations of function arguments were dropped in
`parseFunctionArgumentList`. This CL adds another output argument to the
function through which they are now returned. The values are then plumbed
through as an array of optional locations in the various places.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117604
When the printer is requested to elide large constants, we emit an opaque
attribute instead. This patch fills the dialect name with
"elided_large_const" instead of "_" to remove some user confusion when
they later try to consume it.
Differential Revision: https://reviews.llvm.org/D117711
Benchmarks show that memref::CopyOp is currently up to 200x slower than
tiled and vectorized versions of linalg::Copy.
Add a temporary flag to allow comprehensive bufferize to generate a
linalg::GenericOp that implements a copy until this performance bug is
resolved.
Differential Revision: https://reviews.llvm.org/D117696
The code in `BufferizableOpInterface`'s header/source no longer contains any analysis code. This makes it easier to run the bufferization with a different analysis or without any analysis.
Differential Revision: https://reviews.llvm.org/D117478
This separates the analysis (and its helpers/data structures) more clearly from the rest of the bufferization.
Differential Revision: https://reviews.llvm.org/D117477
This change adds full python bindings for PDL, including types and operations
with additional mixins to make operation construction more similar to the PDL
syntax.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D117458
Also move `createAlloc` and related helper functions out of BufferizationState. The goal is to make BufferizationState as small as possible. (Code cleanup)
Differential Revision: https://reviews.llvm.org/D117476
If not allow-return-memref, raise an error if a new memory allocation is returned/yielded from a block. We do not check for new allocations directly, but for ops that yield/return values that are not equivalent to values defined outside of the current block.
Note: We still need to check that scf.for yield values and bbArgs are aliasing to ensure that getAliasingOpOperand/getAliasingOpResult is correct.
Differential Revision: https://reviews.llvm.org/D116687
This op is needed for unit testing in a subsequent revision. (This is the first op that has a block that yields equivalent values via the op's results.)
Note: Bufferization of scf.execute_region ops with multiple blocks is not yet supported.
Differential Revision: https://reviews.llvm.org/D117424
When the min/max are the total range of the value, it is a no-op as the values
are already restricted to that range.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D117625
Try to fix the error:
mlir/lib/Parser/Parser.cpp:553:14: error: cannot call member function 'mlir::InFlightDiagnostic mlir::detail::Parser::emitError(llvm::SMLoc, const llvm::Twine&)' without object
<< "operation location alias was never defined";
This commit refactors the FunctionLike trait into an interface (FunctionOpInterface).
FunctionLike as it is today is already a pseudo-interface, with many users checking the
presence of the trait and then manually calling into functionality implemented in the
function_like_impl namespace. By transitioning to an interface, these accesses are much
cleaner (ideally with no direct calls to the impl namespace outside of the implementation
of the derived function operations, e.g. for parsing/printing utilities).
I've tried to maintain as much compatibility with the current state as possible, while
also trying to clean up as much of the cruft as possible. The general migration plan for
current users of FunctionLike is as follows:
* function_like_impl -> function_interface_impl
Realistically most user calls should remove references to functions within this namespace
outside of a very narrow set (e.g. parsing/printing utilities). Calls to the attribute name
accessors should be migrated to the `FunctionOpInterface::` equivalent, most everything
else should be updated to be driven through an instance of the interface.
* OpTrait::FunctionLike -> FunctionOpInterface
`hasTrait` checks will need to be moved to isa, along with the other various Trait vs
Interface API differences.
* populateFunctionLikeTypeConversionPattern -> populateFunctionOpInterfaceTypeConversionPattern
Fixes #52917
Differential Revision: https://reviews.llvm.org/D117272
The chunk size in the schedule clause is an integer expression, which can
be either a constant integer or an integer variable. Fix the schedule clause in the
MLIR op definition to support integer expressions with different bit widths.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D116073
The only benefit of FunctionPass is that it filters out function
declarations. This isn't enough to justify carrying it around, as we can
simply filter out declarations when necessary within the pass. We can
also explore better scheduling primitives to filter out declarations
at the pipeline level in the future.
The definition of FunctionPass is left intact for now to allow time for downstream
users to migrate.
Differential Revision: https://reviews.llvm.org/D117182
The input type of a linalg.generic can be less dynamic than its output
type. If this is the case, moving a reshape across the generic op would
create invalid IR, as expand_shape cannot expand arbitrary dynamic
dimensions.
Check that the reshape is actually valid before creating the
expand_shape. This exposes the existing verification logic in reshape
utils and removes the incomplete custom implementation in fusion.
Differential Revision: https://reviews.llvm.org/D116600
Before this patch, a deferred location in an operation like `test.pretty_printed_region` would
break the parser, and thus the IR round-trip.
Depends On D117088
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117413
This will allow returning to the client of `parseLocationInstance` a location that is resolved later.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117088
The current state of the top-level Analysis/ directory is that it contains two libraries:
a generic Analysis library (free from dialect dependencies), and a LoopAnalysis library
that contains various analysis utilities that originated from Affine loop transformations.
This commit moves the LoopAnalysis to the more appropriate home of `Dialect/Affine/Analysis/`,
given the use and intention of the majority of the code within it. After the move, if there
are generic utilities that would fit better in the top-level Analysis/ directory, we can move
them.
Differential Revision: https://reviews.llvm.org/D117351
The leading space that is always printed at the beginning of regions is not consistent with other parts of the printing API. Moreover, this leading space can lead to undesirable assembly formats:
```
attr-dict-with-keyword $region
```
Prints as:
```
// Two spaces between `}` and `{`
attributes {foo} { ... }
```
Moreover, the leading space results in the odd generic op format:
```
"test.op"() ( {...}) : () -> ()
```
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117411
The translation from the MLIR LLVM dialect to LLVM IR includes a mechanism that
ensures the successors of a block to be different blocks in case block
arguments are passed to them since the opposite cannot be expressed in LLVM IR.
This mechanism previously only worked for functions because it was written
prior to the introduction of other region-carrying operations such as the
OpenMP dialect, which also translates directly to LLVM IR. Modify this
mechanism to handle all regions in the module and not only functions.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D117548
Also make the ODS Operator class have const iterators, and use const
references for existing APIs taking Operator by reference.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117516
The majority of dialects reimplement the same boilerplate over and over;
switching the default makes for better discoverability and makes it simpler
to implement new dialects.
Differential Revision: https://reviews.llvm.org/D117524
Coroutine lowering always takes the natural alignment when spilling to
the frame (issue #53148) so using AVX2 or AVX512 in a coroutine doesn't
work. Always overalign to 64 bytes to avoid this issue until we have a
better solution.
Differential Revision: https://reviews.llvm.org/D117501
This revision fixes a bug where the iterative algorithm would walk back def-use chains to an incorrect operand.
This exposed opportunities for a larger refactoring and behavior improvement.
The new algorithm has improved folding behavior and proceeds by tracking both the
permutation of the extraction position and the internal vector permutation.
Multiple partial intersection cases with a candidate insertOp are supported.
The refactoring of the implementation should also help it generalize to strided insert/extract ops.
This also subsumes the previous `foldExtractOpFromTranspose` which is now a simple special case and can be deleted.
Differential Revision: https://reviews.llvm.org/D117322
In some cases, the result of an initTensorOp may have an attribute.
However, the Attribute was not passed to `inferResultType`, failing the
verifier. Therefore, propagate the Attribute to `inferResultType`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D117192
The execution engine would not be functional anyway and we're already
disabling the tests; this also disables the rest of the code.
Anecdotally, this reduces the number of static libraries built when the
builtin target is disabled from 236 to 218.
Here is the complete list of LLVM targets built when running
`ninja check-mlir`:
libLLVMAggressiveInstCombine.a
libLLVMAnalysis.a
libLLVMAsmParser.a
libLLVMBinaryFormat.a
libLLVMBitReader.a
libLLVMBitstreamReader.a
libLLVMBitWriter.a
libLLVMCore.a
libLLVMDebugInfoCodeView.a
libLLVMDebugInfoDWARF.a
libLLVMDemangle.a
libLLVMFileCheck.a
libLLVMFrontendOpenMP.a
libLLVMInstCombine.a
libLLVMIRReader.a
libLLVMMC.a
libLLVMMCParser.a
libLLVMObject.a
libLLVMProfileData.a
libLLVMRemarks.a
libLLVMScalarOpts.a
libLLVMSupport.a
libLLVMTableGen.a
libLLVMTableGenGlobalISel.a
libLLVMTextAPI.a
libLLVMTransformUtils.a
Differential Revision: https://reviews.llvm.org/D117287
`getNumRegionInvocations` was originally added for the async reference counting, but turned out to be not useful, and currently is not used anywhere (couldn't find any uses in public github repos). Removing dead code.
Reviewed By: Mogball, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117347