Modify structure type in SPIR-V dialect to support:
1) Multiple decorations per structure member
2) Key-value based decorations (e.g., MatrixStride)
This commit kept the Offset decoration separate from members'
decorations container for easier implementation and logical clarity.
As such, all references to Structure layoutinfo are now offsetinfo,
and any member layout-defining decoration (e.g., RowMajor for Matrix)
will be added to the members' decorations container along with its
value, if any.
Differential Revision: https://reviews.llvm.org/D81426
Implemented `FComparePattern` and `IComparePattern` classes
that provide conversion of SPIR-V comparison ops (such as
`spv.FOrdGreaterThanEqual` and others) to LLVM dialect.
Also added tests in `comparison-ops-to-llvm.mlir`.
Differential Revision: https://reviews.llvm.org/D81487
This patch changes the fusion algorithm so that after fusing two loop nests
we revisit previously visited nodes so that they are considered again for
fusion in the context of the new fused loop nest.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D81609
Summary:
* extra ';' in the following files:
mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
mlir/lib/Dialect/Shape/IR/Shape.cpp
* base class ‘mlir::ConvertVectorToSCFBase<ConvertVectorToSCFPass>’
should be explicitly initialized in the copy constructor [-Wextra] in
mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp
* warning: ‘bool Expression::operator==(const Expression&) const’
defined but not used [-Wunused-function] in
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp
Differential Revision: https://reviews.llvm.org/D81673
The code example in the MLIR Linalg doc is fixed because it referenced non-existing variables and some parameters had the wrong types.
Differential Revision: https://reviews.llvm.org/D81633
Summary:
- Print function name when ReturnOp verification fails
- This makes it easier to find the invalid ReturnOp in an IR dump.
Differential Revision: https://reviews.llvm.org/D81513
Summary:
We now support index casting for tensor<index> to tensor<int>. This
better supports compatibility with the Shape dialect.
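As a hedged sketch of the kind of IR this enables (hypothetical function name and shapes; standard-dialect syntax of the time):
```
func @cast_extents(%arg0 : tensor<3xindex>) -> tensor<3xi64> {
  // Element-wise cast from the index type to a fixed-width integer type.
  %0 = index_cast %arg0 : tensor<3xindex> to tensor<3xi64>
  return %0 : tensor<3xi64>
}
```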
Differential Revision: https://reviews.llvm.org/D81611
Summary: At this point Parser has grown to be over 5000 lines and can be very difficult to navigate/update/etc. This commit splits Parser.cpp into several sub files focused on parsing specific types of entities; e.g., Attributes, Types, etc.
Differential Revision: https://reviews.llvm.org/D81299
Following the previous revision `D81100`, this commit implements a templated class
that provides conversion patterns for “straightforward” SPIR-V ops into the
LLVM dialect. Templating allows abstracting away from the concrete implementation
for each specific op. Those are mainly binary operations. Currently supported
and tested ops are:
- Arithmetic ops: `IAdd`, `ISub`, `IMul`, `FAdd`, `FSub`, `FMul`, `FDiv`, `FNegate`,
`SDiv`, `SRem` and `UDiv`
- Bitwise ops: `BitwiseAnd`, `BitwiseOr`, `BitwiseXor`
The implementation relies on `SPIRVToLLVMConversion` class that makes use of
`OpConversionPattern`.
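For illustration, one such 1:1 rewrite looks roughly as follows (assumed operands; LLVM dialect types use the `!llvm.*` spelling of the time):
```
// SPIR-V dialect input:
%0 = spv.IAdd %a, %b : i32
// After conversion to the LLVM dialect:
%1 = llvm.add %a, %b : !llvm.i32
```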
Differential Revision: https://reviews.llvm.org/D81305
The tutorial refers to invoking toyc with '-mlir-print-debuginfo', but
the option was no longer registered.
Differential Revision: https://reviews.llvm.org/D81604
Allow for dynamic indices in the `dim` operation.
Rather than an attribute, the index is now an operand of type `index`.
This allows applying the operation to dynamically ranked tensors.
The correct lowering of dynamic indices remains to be implemented.
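A before/after sketch of the change (hypothetical operands; syntax of the time):
```
// Before: the dimension was an integer attribute and had to be constant.
%0 = dim %arg0, 1 : tensor<?x?xf32>
// After: the dimension is an `index` operand and may be computed dynamically.
%c1 = constant 1 : index
%1 = dim %arg0, %c1 : tensor<?x?xf32>
```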
Differential Revision: https://reviews.llvm.org/D81551
The operation `get_extent` now accepts the dimension as an operand and is no
longer limited to constant dimensions.
A helper function facilitates the common constant use case.
Differential Revision: https://reviews.llvm.org/D81248
This is useful for manipulating the standard dialect from transformations
outside of the standard dialect.
Differential Revision: https://reviews.llvm.org/D80609
Summary:
Even though this operation is intended for 1d/2d conversions currently,
leaving a semantic hole in the lowering prohibits proper testing of this
operation. This CL adds a straightforward reference implementation for the
missing cases.
Reviewers: nicolasvasilache, mehdi_amini, ftynse, reidtatge
Reviewed By: reidtatge
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D81503
Having the input dumped on failure seems like a better
default: I debugged FileCheck tests for a while without knowing
about this option, which really helps to understand failures.
Remove `-dump-input-on-failure` and the environment variable
FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete.
Differential Revision: https://reviews.llvm.org/D81422
Summary:
The NVVM target only provides implementations for tanh etc. on f32 and
f64 operands. To also support f16, we now insert operations to extend to f32
and truncate back to f16 around the intrinsic call.
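Conceptually, the rewrite looks like the following hedged sketch (shown with standard-dialect casts and an illustrative libdevice callee; the actual lowering emits LLVM/NVVM dialect ops):
```
// Extend the f16 operand, call the f32 implementation, truncate the result.
%0 = fpext %arg : f16 to f32
%1 = call @__nv_tanhf(%0) : (f32) -> f32
%2 = fptrunc %1 : f32 to f16
```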
Differential Revision: https://reviews.llvm.org/D81473
Summary:
UnrankedMemRefType doesn't have a rank, but previously this was only
checking for unranked tensors. This avoids a failure later if one queries the
shape after checking whether the type is ranked.
Differential Revision: https://reviews.llvm.org/D81441
These commits set up the skeleton for SPIR-V to LLVM dialect conversion.
The SPIR-V to LLVM pass is created and registered in Passes.td and InitAllPasses.h.
A pattern for `spv.BitwiseAndOp` is added along with tests for it. Integer, float
and vector types are converted through LLVMTypeConverter.
Differential Revision: https://reviews.llvm.org/D81100
The SSA values created with `shape.const_size` are now named depending on the
value.
A constant size of 3, e.g., is now automatically named `%c3`.
Differential Revision: https://reviews.llvm.org/D81249
The operations `to_extent_tensor` and `from_extent_tensor` become no-ops when
lowered to the standard dialect.
This is possible with a lowering from `shape.shape` to `tensor<?xindex>`.
Differential Revision: https://reviews.llvm.org/D81162
This parameter gives developers the freedom to choose their desired function
signature conversion for preparing their functions for buffer placement. It is
introduced for BufferAssignmentFuncOpConverter, and also for
BufferAssignmentReturnOpConverter and BufferAssignmentCallOpConverter, to adapt
the return and call operations to the selected function signature conversion.
In addition, if the parameter is set, buffer placement will not deallocate the
returned buffers.
Differential Revision: https://reviews.llvm.org/D81137
This patch is a follow-up on https://reviews.llvm.org/D81127
BF16 constants were represented as 64-bit floating point values due to the lack
of support for BF16 in APFloat. APFloat was recently extended to support
BF16 so this patch is fixing the BF16 constant representation to be 16-bit.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D81218
Summary:
This revision adds a common folding pattern that starts appearing on
vector_transfer ops.
Differential Revision: https://reviews.llvm.org/D81281
This allows verifying op-independent attributes (e.g., attributes that do not require the op to have been created) before constructing an operation. These include checking whether required attributes are defined and checking constraints on attributes (such as an I32 attribute). This is not perfect (e.g., if one had a disjunctive constraint where one part relies on the op and the other doesn't, then this would not try to extract and check the op-independent part separately from the op-dependent part).
The next step is to move these out to a trait that could be verified earlier than in the generated method. The first use case is for inferring the return type while constructing the op. At that point you don't have an Operation yet and that ends up in one having to duplicate the same checks, e.g., verify that attribute A is defined before querying A in shape function which requires that duplication. Instead this allows one to invoke a method to verify all the traits and, if this is checked first during verification, then all other traits could use attributes knowing they have been verified.
It is a little bit funny to have these on the adaptor, but I see the adaptor as a place to collect information about the op before the op is constructed (e.g., avoiding stringly typed accessors, verifying what is possible to verify before the op is constructed) while being cheap to use even with a constructed op (so it is a layer of indirection between the op constructed/being constructed). And from that point of view it made sense to me.
Differential Revision: https://reviews.llvm.org/D80842
Improve consistency when printing test results:
Previously we were using different labels for group names (the header
for the list of, e.g., failing tests) and summary count lines. For
example, "Failing Tests"/"Unexpected Failures". This commit changes lit
to label things consistently.
Improve wording of labels:
When talking about individual test results, the first word in
"Unexpected Failures", "Expected Passes", and "Individual Timeouts" is
superfluous. Some labels contain the word "Tests" and some don't.
Let's simplify the names.
Before:
```
Failing Tests (1):
...
Expected Passes : 3
Unexpected Failures: 1
```
After:
```
Failed Tests (1):
...
Passed: 3
Failed: 1
```
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D77708
Summary:
`mlir-rocm-runner` is introduced in this commit to execute GPU modules on ROCm
platform. A small wrapper to encapsulate ROCm's HIP runtime API is also inside
the commit.
Due to behavior of ROCm, raw pointers inside memrefs passed to `gpu.launch`
must be modified on the host side to properly capture the pointer values
addressable on the GPU.
LLVM MC is used to assemble AMD GCN ISA coming out from
`ConvertGPUKernelToBlobPass` to binary form, and LLD is used to produce a shared
ELF object which could be loaded by ROCm HIP runtime.
gfx900 is the default target used right now, although it could be altered via
an option in `mlir-rocm-runner`. Future revisions may consider using ROCm Agent
Enumerator to detect the right target on the system.
Notice AMDGPU Code Object V2 is used in this revision. Future enhancements may
upgrade to AMDGPU Code Object V3.
Bitcode libraries in ROCm-Device-Libs, which implement math routines exposed in
the `rocdl` dialect, are not yet linked; this is left as a TODO in the logic.
Reviewers: herhut
Subscribers: mgorny, tpr, dexonsmith, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #mlir, #llvm
Differential Revision: https://reviews.llvm.org/D80676
Add support for flat, location, and noperspective decorations in the
serializer and deserializer to be able to process basic shader files
for graphics applications.
Differential Revision: https://reviews.llvm.org/D80837
The recently introduced allocation hoisting is quite conservative about the cases in which it triggers.
This revision makes it such that the allocations for vector transfer lowerings are hoisted
to the top of the function.
This should be revisited in the context of parallelism and is a temporary workaround.
Differential Revision: https://reviews.llvm.org/D81253
This revision adds a helper function to hoist vector.transfer_read /
vector.transfer_write pairs out of immediately enclosing scf::ForOp
iteratively, if the following conditions are true:
1. The 2 ops access the same memref with the same indices.
2. All operands are invariant under the enclosing scf::ForOp.
3. No uses of the memref either dominate the transfer_read or are
dominated by the transfer_write (i.e. no aliasing between the write and
the read across the loop)
To improve hoisting opportunities, call the `moveLoopInvariantCode` helper
function on the candidate loop above which to hoist. Hoisting the transfers
results in scf::ForOp yielding the value that originally transited through
memory.
This revision additionally exposes `moveLoopInvariantCode` as a helper in
LoopUtils.h and updates SliceAnalysis to support scf::ForOp return values and to
allow hoisting across multiple scf::ForOps.
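A schematic before/after, assuming a single read/write pair on `%A` with loop-invariant indices (hypothetical ops and shapes):
```
// Before: the value round-trips through memory on every iteration.
scf.for %i = %lb to %ub step %step {
  %v = vector.transfer_read %A[%c0, %c0], %cst : memref<?x?xf32>, vector<4xf32>
  %u = "some_use"(%v) : (vector<4xf32>) -> vector<4xf32>
  vector.transfer_write %u, %A[%c0, %c0] : vector<4xf32>, memref<?x?xf32>
}
// After hoisting: the value transits through the loop's iteration arguments.
%init = vector.transfer_read %A[%c0, %c0], %cst : memref<?x?xf32>, vector<4xf32>
%res = scf.for %i = %lb to %ub step %step iter_args(%acc = %init) -> (vector<4xf32>) {
  %u = "some_use"(%acc) : (vector<4xf32>) -> vector<4xf32>
  scf.yield %u : vector<4xf32>
}
vector.transfer_write %res, %A[%c0, %c0] : vector<4xf32>, memref<?x?xf32>
```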
Differential Revision: https://reviews.llvm.org/D81199
Forward declaring llvm::errs is not enough, as it is used as a default
parameter with a type that references the base class. So the class
hierarchy must be visible.
Summary:
This will inline the region of a shape.assuming op in the case that the
input witness is found to be statically true.
Differential Revision: https://reviews.llvm.org/D80302
In the case of all inputs being constant and equal, cstr_eq will be
replaced with a true_witness.
Differential Revision: https://reviews.llvm.org/D80303
This allows replacing of this op with a true witness in the case of both
inputs being const_shapes and being found to be broadcastable.
Differential Revision: https://reviews.llvm.org/D80304
This allows assuming_all to be replaced when all inputs are known to be
statically passing witnesses.
Differential Revision: https://reviews.llvm.org/D80306
This will later be used during canonicalization and folding steps to replace
statically known passing constraints.
Differential Revision: https://reviews.llvm.org/D80307
Update the Linalg-to-affine lowering for ConvOp to use affine.load for the input
whenever there is no padding. It had always been using std.load because
the max in index functions (needed for non-zero padding if not materializing
zeros) couldn't be represented in the non-zero padding cases.
In the future, the non-zero padding case could also be made to use
affine - either by materializing or using affine.execute_region. The
latter approach will not impact the scf/std output obtained after
lowering out affine.
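For illustration, the difference in the generated input access (hypothetical indices and shapes):
```
// No padding: the input access can be expressed with affine.load.
%0 = affine.load %in[%i + %k] : memref<16xf32>
// Non-zero padding needs max/min in the index computation, so std.load remains.
%1 = load %in[%idx] : memref<16xf32>
```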
Differential Revision: https://reviews.llvm.org/D81191
This simplifies a lot of handling of BoolAttr/IntegerAttr. For example, a lot of places currently have to handle both IntegerAttr and BoolAttr. In other places, a decision is made to pick one which can lead to surprising results for users. For example, DenseElementsAttr currently uses BoolAttr for i1 even if the user initialized it with an Array of i1 IntegerAttrs.
Differential Revision: https://reviews.llvm.org/D81047
This revision adds a helper function to hoist alloc/dealloc pairs and
alloca op out of immediately enclosing scf::ForOp if both conditions are true:
1. all operands are defined outside the loop.
2. all uses are ViewLikeOp or DeallocOp.
This is now considered Linalg-specific and will be generalized on a per-need basis.
Differential Revision: https://reviews.llvm.org/D81152
Add SubgroupId, SubgroupSize and NumSubgroups to GPU dialect ops and add the
lowering of those ops to SPIRV.
Differential Revision: https://reviews.llvm.org/D81042
Summary:
If the output filename was specified as "-", the ToolOutputFile class
would create a brand new raw_ostream object referring to the stdout.
This patch changes it to reuse the llvm::outs() singleton.
At the moment, this change should be "NFC", but it does enable other
enhancements, like the automatic stdout/stderr synchronization as
discussed on D80803.
I've checked the history, and I did not find any indication that this
class *has* to use a brand new stream object instead of outs() --
indeed, it is special-casing "-" in a number of places already, so this
change fits the pattern pretty well. I suspect the main reason for the
current state of affairs is that the class was originally introduced
(r111595, in 2010) as a raw_fd_ostream subclass, which made any other
solution impossible.
Another potential benefit of this patch is that it makes it possible to
move the raw_ostream class out of the business of special-casing "-" for
stdout handling. That state of affairs does not seem appropriate because
"-" is a valid filename (albeit hard to access with a lot of command
line tools) on most systems. Handling "-" in ToolOutputFile seems more
appropriate.
To make this possible, this patch changes the return type of
llvm::outs() and errs() to raw_fd_ostream&. Previously the functions
were constructing objects of that type, but returning a generic
raw_ostream reference. This makes it possible for new ToolOutputFile and
other code to use raw_fd_ostream methods like error() on the outs()
object. This does not seem like a bad thing (since stdout is a file
descriptor which can be redirected to anywhere, it makes sense to ask it
whether the writing was successful or if it supports seeking), and
indeed a lot of code was already depending on this fact via the
ToolOutputFile "back door".
Reviewers: dblaikie, JDevlieghere, MaskRay, jhenderson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81078
Summary:
The fusion for tensor_reshape embeds the information into indexing maps,
thus the existing pattern also works for indexed_generic ops.
Depends On D80347
Differential Revision: https://reviews.llvm.org/D80348
Summary:
Different from the fusion between generic ops, indices are involved. In this
context, we need to re-map the indices for the producer since the fused op is built
from the consumer's perspective. This patch supports all combinations of the fusion
between indexed_generic ops and generic ops, which includes the test cases:
1) generic op as producer and indexed_generic op as consumer.
2) indexed_generic op as producer and generic op as consumer.
3) indexed_generic op as producer and indexed_generic op as consumer.
Differential Revision: https://reviews.llvm.org/D80347
Summary:
Progressive lowering of vector.transpose into an operation that
is closer to an intrinsic, and thus the hardware ISA. Currently
under the common vector transform testing flag, as we prepare to deploy
this transformation in the LLVM lowering pipeline.
Reviewers: nicolasvasilache, reidtatge, andydavis1, ftynse
Reviewed By: nicolasvasilache, ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #llvm, #mlir
Differential Revision: https://reviews.llvm.org/D80772
Add a new pass to lower operations from the `shape` to the `std` dialect.
The conversion applies only to the `size_to_index` and `index_to_size`
operations and affected types.
Other patterns will be added as needed.
Differential Revision: https://reviews.llvm.org/D81091
The original design of TypeConverter expected specific converters to derive the
class and override virtual functions for conversions and materializations. This
did not scale well to multi-dialect conversions, so the design was changed to
register a list of converter and materializer functions, removing the need for
virtual functions. The only remaining virtual function, `convertSignatureArg`
is never overridden in-tree. Make it non-virtual, drop the virtual destructor
and thus remove vtable from TypeConverter.
If there exist TypeConverter users that need custom `convertSignatureArg`
behavior, it should be implemented using the callback registration mechanism
similar to that of conversions and materializations.
Differential Revision: https://reviews.llvm.org/D80993
This patch enables affine loop fusion for loops with affine vector loads
and stores. For that, we only had to use affine memory op interfaces in
LoopFusionUtils.cpp and Utils.cpp so that vector loads and stores are
also taken into account.
Reviewed By: andydavis1, ftynse
Differential Revision: https://reviews.llvm.org/D80971
This commit adds basic matrix type support to the SPIR-V dialect
including type definition, IR assembly, parsing, printing, and
(de)serialization.
Differential Revision: https://reviews.llvm.org/D80594
Dialect conversion infrastructure supports 1->N type conversions by requiring
individual conversions to provide facilities to generate operations
retrofitting N values into 1 of the original type when N > 1. This
functionality can also be used to materialize explicit "cast"-like operations,
but it did not support 1->1 type conversions until now. Modify TypeConverter to
support materialization of cast operations for 1-1 conversions.
This also makes materialization specification more extensible following the
same pattern as type conversions. Instead of overloading a virtual function,
users or subclasses of TypeConverter can now register type-specific
materialization callbacks that will be called in order for the given type.
Differential Revision: https://reviews.llvm.org/D79729
One header guard was overlooked when renaming LoopOps to SCF, rename it.
Also drop two unused macros, one of which referred to LoopOp (not "Ops",
hence the overlook).
The lower-to-affine-loops pass in chapters 5-7 of the Toy tutorial has
been creating affine loops, erasing their terminator and creating it
anew using a PatternRewriter instance to work around the fact that
implicit terminators were created without notifying the rewriter. Now
that has been fixed in 3ccf4a5bd1, remove the code erasing and
re-creating the terminators and rely on the default ones.
Add BufferAssignmentCallOpConverter as a pattern rewriter for Buffer
Placement. It matches the signature of the caller operation with the callee
after rewriting the callee with FunctionAndBlockSignatureConverter.
Differential Revision: https://reviews.llvm.org/D80785
Keep the affine.for to gpu.launch conversions, which should
probably be the affine.parallel to gpu.launch conversion as well.
Differential Revision: https://reviews.llvm.org/D80747
Summary:
MSVC does not seem to like certain forward declarations.
https://reviews.llvm.org/D80728 introduces an error in
seemingly unrelated .cpp files that include the .h
(but do not otherwise use the class that depends on the forward declaration).
Instead of forward declaration, include the full vector ops definition.
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80841
This revision replaces the load + vector.type_cast by appropriate vector transfer
operations. These play more nicely with other vector abstractions and canonicalization
patterns and lower to load/store with or without masks when appropriate.
Differential Revision: https://reviews.llvm.org/D80809
Summary:
Implemented the basic changes for defining the master operation in OpenMP.
It uses the generic parser and printer.
Reviewed By: kiranchandramohan, ftynse
Differential Revision: https://reviews.llvm.org/D80689
This revision adds custom rewrites for patterns that arise during linalg structured
ops vectorization. These patterns allow the composition of linalg promotion,
vectorization and removal of redundant copies.
The patterns are voluntarily limited and restrictive atm.
More robust behavior will be implemented once more powerful side effect modeling and analyses are available on view/subview.
On the transfer_read side, the following pattern is rewritten:
```
%alloc = ...
[optional] %view = std.view %alloc ...
%subView = subview %allocOrView ...
[optional] linalg.fill(%allocOrView, %cst) ...
...
linalg.copy(%in, %subView) ...
vector.transfer_read %allocOrView[...], %cst ...
```
into
```
[unchanged] %alloc = ...
[unchanged] [optional] %view = std.view %alloc ...
[unchanged] [unchanged] %subView = subview %allocOrView ...
...
vector.transfer_read %in[...], %cst ...
```
On the transfer_write side, the following pattern is rewritten:
```
%alloc = ...
[optional] %view = std.view %alloc ...
%subView = subview %allocOrView...
...
vector.transfer_write %..., %allocOrView[...]
linalg.copy(%subView, %out)
```
Differential Revision: https://reviews.llvm.org/D80728
This utility factors out the machinery required to add iterArgs and yield values to an scf.ForOp.
Differential Revision: https://reviews.llvm.org/D80656
Buffer placement can now operate on functions that return buffers. These
buffers escape from the deallocation phase of buffer placement.
Differential Revision: https://reviews.llvm.org/D80696
Summary: Add a test to check if the standalone dialect is registered within standalone-opt. Similar to the mlir-opt commandline.mlir test.
Reviewers: Kayjukh, stephenneuendorffer
Reviewed By: Kayjukh
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, frgossen, jurahul, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80764
https://reviews.llvm.org/D79246 introduces alignment propagation for vector transfer operations. Unfortunately, the alignment calculation is incorrect and can result in crashes.
This revision fixes the calculation by using the natural alignment of the memref elemental type, instead of the resulting vector type.
If more alignment is desired, it can be done in 2 ways:
1. use a proper vector.type_cast to transform a memref<axbxcxdxf32> into a memref<axbxvector<cxdxf32>> giving a natural alignment of vector<cxdxf32>
2. add an alignment attribute to vector transfer operations and propagate it.
With this change the alignment in the relevant tests goes down from 128 to 4.
Lastly, a few minor cleanups are performed and the custom `isMinorIdentityMap` is deprecated.
Differential Revision: https://reviews.llvm.org/D80734
operands of Generic ops.
Unit-extent dimensions are typically used for achieving broadcasting
behavior. The pattern added (along with canonicalization patterns
added previously) removes the use of unit-extent dimensions, and
instead uses a more canonical representation of the computation. This
new pattern is not added as a canonicalization for now since it
entails adding additional reshape operations. A pass is added to
exercise these patterns, along with an API entry to populate a
patterns list with these patterns.
Differential Revision: https://reviews.llvm.org/D79766
D80142 restructured MLIR-to-GPU-binary conversion to support multiple
targets. It also modified cmake files to link relevant LLVM components
in test/lib, which broke shared-library builds, and likely made the
conversions unusable outside mlir-opt (or other tools that link in test
library targets). Link these components to GPUCommon instead.
Differential Revision: https://reviews.llvm.org/D80739
This allows constructing an operand adaptor from an existing op (useful for commonalizing verification as I want to do in a follow-up).
I also add the ability to use member initializers for the generated adaptor constructors for convenience.
Differential Revision: https://reviews.llvm.org/D80667
The operand of `from_extent_tensor` is now of the same index type as the result
type of the inverse operation `to_extent_tensor`.
Differential Revision: https://reviews.llvm.org/D80283
Make ConvertKernelFuncToCubin pass to be generic:
- Rename to ConvertKernelFuncToBlob.
- Allow specifying triple, target chip, target features.
- Initializing LLVM backend is supplied by a callback function.
- Lowering process from MLIR module to LLVM module is via another callback.
- Change mlir-cuda-runner to adopt the revised pass.
- Add new tests for lowering to ROCm HSA code object (HSACO).
- Tests for CUDA and ROCm are kept in separate directories.
Differential Revision: https://reviews.llvm.org/D80142
The operation `num_elements` determines the number of elements for a given
shape.
That is the product of its dimensions.
Differential Revision: https://reviews.llvm.org/D80281
Add the two conversion operations `index_to_size` and `size_to_index` to the
shape dialect.
This facilitates the conversion of index types between the shape and the
standard dialect.
Differential Revision: https://reviews.llvm.org/D80280
Purely cosmetic change.
The operation implementations in `Shape.cpp` are now in lexicographic order.
Differential Revision: https://reviews.llvm.org/D80277
This change will catch errors where C++ ops are used without being
registered, either through creation with the OpBuilder or when trying to
cast to the C++ op.
Differential Revision: https://reviews.llvm.org/D80651
Summary:
Index is the proper type for storing shapes when constant folding, so
this fixes the previous code (which was using i64).
Differential Revision: https://reviews.llvm.org/D80600
I just spent a bunch of time debugging a mysterious bug that ended up being due to my SmallVector getting passed to the Args&... overload instead of the MutableArrayRef overload, with disastrous results.
I appreciate the intent of this API, but for a function that does a bunch of unsafe casts, adding in potential overload confusion is just too much C++ footgun. If we end up needing this functionality, having something like a separate `packArgs(Args&...) -> SmallVector` overload would be preferable.
Turns out this API is unused and untested (even out of tree as far as I can tell, modulo the optional passing of no args to the other invoke as I fixed in this patch), so it's an easy fix -- just delete it and touch up the other overload.
Differential Revision: https://reviews.llvm.org/D80607
The allocation of workgroup memory is lowered to a
spv.globalVariable. Only static-size allocations with an int or float
element type are handled. The lowering does account for element
types that are not supported in the lowered spv.module based on
the extensions/capabilities and adjusts the number of elements to get
the same byte length.
Differential Revision: https://reviews.llvm.org/D80411
Summary:
This includes a basic implementation for the OpenMP parallel
operation without a custom pretty-printer and parser.
The if, num_threads, private, shared, first_private, last_private,
proc_bind and default clauses are included in this implementation.
Currently the reduction clause is omitted as it is more complex and
requires analysis to see if we can share implementation with the loop
dialect. The allocate clause is also omitted.
A discussion about the design of this operation can be found here:
https://llvm.discourse.group/t/openmp-parallel-operation-design-issues/686
The current OpenMP Specification can be found here:
https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Reviewers: jdoerfert
Subscribers: mgorny, yaxunl, kristof.beyls, guansong, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79410
Take advantage of equality constraints to generate the type inference interface.
This is used for equality and trivially built types. The type inference method
is only generated when no type inference trait is specified already.
This reorders verification, which changes some test error messages.
Differential Revision: https://reviews.llvm.org/D80484
Now that OpBuilder is available in `build` functions, it becomes possible to
populate the "then" and "else" regions directly when building the "if"
operation. This is desirable in more structured forms of builders, especially
when conditionals are mixed with loops. Provide new `build` APIs taking
callbacks for body constructors, similarly to scf::ForOp, and replace more
clunky edsc::BlockBuilder uses with these. The original APIs remain available
and go through the new implementation.
Differential Revision: https://reviews.llvm.org/D80527
alloc/dealloc/copies.
Add options to LinalgPromotion to use callbacks for implementing the
allocation and deallocation of buffers used for the promoted subviews,
and to copy data into and from the original subviews to the allocated
buffers.
Also some misc. cleanup of the code.
Differential Revision: https://reviews.llvm.org/D80365
Modifying the loop nest builder for generating scf.parallel loops to
not generate scf.parallel loops for non-parallel iterator types in
Linalg operations. The existing implementation incorrectly generated
scf.parallel for all tiled loops. This is rectified by refactoring the logic
used while lowering to loops, which already accounted for this.
Differential Revision: https://reviews.llvm.org/D80188
Summary:
This op extracts an extent from a shape.
This also is the first op which constant folds to shape.const_size,
which revealed that shape.const_size needs a folder (ConstantLike ops
seem to always need folders for the constant folding infra to work).
Differential Revision: https://reviews.llvm.org/D80394
This revision expands the types of vector contractions that can be lowered to vector.outerproduct.
All 8 permutation cases are supported.
The idiomatic manipulation of AffineMap written declaratively makes this straightforward.
In the process a bug with the vector.contract verifier was uncovered.
The vector shape verification part of the contract op is rewritten to use AffineMap composition.
One bug in the vector `ops.mlir` test is fixed and a new case not yet captured is added
to the vector `invalid.mlir` test.
Differential Revision: https://reviews.llvm.org/D80393
Summary:
Add DynamicMemRefType, which can reference either a statically ranked StridedMemRefType or an UnrankedMemRefType, so that runner utils only need to be implemented once.
There is definitely room for more clean up and unification, but I will keep that for follow-ups.
Reviewers: nicolasvasilache
Reviewed By: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80513
This revision adds the additional lowering and exposes the patterns at a finer granularity for better programmatic reuse. The unit test makes use of the finer grained pattern for simpler checks.
As the ContractionOpLowering is exposed programmatically, cleanup opportunities appear and static class methods are turned into free functions with static visibility.
Differential Revision: https://reviews.llvm.org/D80375
This still allows `if (value)` while requiring an explicit cast when not
in a boolean context. This means things like `std::set<Value>` will no
longer compile.
Differential Revision: https://reviews.llvm.org/D80497
* Enables use with more variadic-sized operands;
* Generate convenience accessors for attributes;
- The accessors are named the same as their name in ODS and return the attribute
type (not the convenience type) and no derived attributes.
This is the first step toward changing the adaptor to support verifying argument
constraints before the op is even created. This does not change the name of the
adaptor, nor does it require it, except for ops with variadic operands, to keep this change smaller.
I considered creating a separate adaptor but decided against that, given that operands also require attributes in general (and definitely for verification of operands and attributes).
Differential Revision: https://reviews.llvm.org/D80420
Enable insert/extract/construct composite ops as well as access chain for
cooperative matrix. ConstantComposite requires more changes and will be done in
a separate patch. Also fix the getNumElements function for coopMatrix per
feedback from Jeff Bolz. The number of elements is implementation-dependent so
it cannot be known at compile time.
Differential Revision: https://reviews.llvm.org/D80321
Adds cooperative matrix support for arithmetic and cast
instructions. It also adds the cooperative matrix store, muladd and matrixlength
instructions, which are part of the extension.
Differential Revision: https://reviews.llvm.org/D80181
Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.
In this commit:
- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs to be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.
Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.
Differential Revision: https://reviews.llvm.org/D80167
This reverts commit cdb6f05e2d.
The build is broken with:
You have called ADD_LIBRARY for library obj.MLIRGPUtoCUDATransforms without any source files. This typically indicates a problem with your CMakeLists.txt file
Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.
In this commit:
- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs to be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.
Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.
Differential Revision: https://reviews.llvm.org/D80167
The subview semantics changed recently to allow for a more natural
representation of constant offsets and strides. The legalization of the
subview op for lowering to SPIR-V needs to account for this.
Also change the linearization to use the strides from the affine map
of a memref.
Differential Revision: https://reviews.llvm.org/D80270
Summary:
Previously, the only supported partial lowering from vector transfers to SCF was
going through loops. This requires a dedicated allocation and extra memory
roundtrips because LLVM aggregates cannot be indexed dynamically (for more
details see the [deep-dive](https://mlir.llvm.org/docs/Dialects/Vector/#deeperdive)).
This revision allows specifying full unrolling which removes this additional roundtrip.
This should be used carefully though because full unrolling will spill, negating the
benefits of removing the interim alloc in the first place.
Proper heuristics are left for a later time.
Differential Revision: https://reviews.llvm.org/D80100
The SingleBlockImplicitTerminator op trait provides a function
`ensureRegionTerminator` that injects an appropriate terminator into the block
if necessary, which is used during operation constructing and parsing.
Currently, this function directly modifies the IR using low-level APIs on
Operation and Block. If this function is called from a conversion pattern,
these manipulations are not reflected in the ConversionPatternRewriter and thus
cannot be undone or, worse, lead to tricky memory errors and malformed IR.
Change `ensureRegionTerminator` to take an instance of `OpBuilder` instead of
`Builder`, and use it to construct the block and the terminator when required.
Maintain overloads taking an instance of `Builder` and creating a simple
`OpBuilder` to use in parsers, which don't have an `OpBuilder` and cannot
interact with the dialect conversion mechanism. This change was one of the
reasons to make `<OpTy>::build` accept an `OpBuilder`.
Differential Revision: https://reviews.llvm.org/D80138
Originally, the SCFToStandard conversion only declared Ops from the Standard
dialect as legal after conversion. This is undesirable as it would fail the
conversion if the SCF ops contained ops from any other dialect. Furthermore,
this would be problematic for progressive lowering of `scf.parallel` to
`scf.for` after `ensureRegionTerminator` is made aware of the pattern rewriting
infrastructure because it creates temporary `scf.yield` operations declared
illegal. Change the legalization target to declare any op other than `scf.for`,
`scf.if` and `scf.parallel` legal.
Differential Revision: https://reviews.llvm.org/D80137
Multiple places in the code base were erasing Blocks or operations in them
using in-place modifications (`Block::erase` or `Block::clear`) unknown to
ConversionPatternRewriter. These operations could not be undone if the pattern
failed and could lead to inconsistent in-memory state of the IR with dangling
pointers. Use `ConversionPatternRewriter::eraseOp` and `::eraseBlock` instead.
Differential Revision: https://reviews.llvm.org/D80136
PatternRewriter has support for erasing a Block from its parent region, but
this feature has not been implemented for ConversionPatternRewriter that needs
to keep track of and be able to undo block actions. Introduce support for
undoing block erasure in the ConversionPatternRewriter by marking all the ops
it contains for erasure and by detaching the block from its parent region. The
detached block is stored in the action description and is not actually deleted
until the rewrites are applied.
Differential Revision: https://reviews.llvm.org/D80135
Dialect conversion infrastructure may roll back op creation by erasing the
operations in the reverse order of their creation. While this guarantees uses
of values will be deleted before their definitions, this does not guarantee
that a parent operation will not be deleted before its child. (This may happen
in case of block inlining or if child operations, such as terminators, are
created in the parent's `build` function before the parent itself.) Handle the
parent/child relationship between ops by removing all child ops from the blocks
before erasing the parent. The child ops remain live, detached from a block,
and will be safely destroyed in their turn, which may come later than that of
the parent.
Differential Revision: https://reviews.llvm.org/D80134
When creating temporary `scf.for` loops in `toy.print` lowering, the block
insertion point was erroneously set up to the beginning of the block rather than
to its end, contradicting the comment just above the insertion point change.
The code was nevertheless operational because `scf.for` was setting up its
`scf.yield` terminator in a way opaque to the pattern rewriting infrastructure.
Now that this is about to change, the problem would have been exposed and
led to conversion failures.
Differential Revision: https://reviews.llvm.org/D80133
Summary:
This revision refactors the Linalg tiling pass to be written as pattern applications and retires the use of the folder in Linalg tiling.
In the early days, tiling was written as a pass that would create (partially) folded and canonicalized operations on the fly for better composability.
As this evolves towards composition of patterns, the pass-specific folder is counter-productive and is retired.
The tiling options struct evolves to take a tile size creation function which allows materializing tile sizes on the fly (in particular constant tile sizes). This plays better with folding and DCE.
With the folder going away in Tiling, the check on whether subviews are the same in linalg fusion needs to be more robust. This revision also implements such a check.
In the current form, there are still some canonicalizations missing due to AffineMin/Max ops fed by scf::ForOp. These will be improved at a later time.
Differential Revision: https://reviews.llvm.org/D80267
Summary:
Additionally, this adds traits and builder methods to AssumingYieldOp
and names the input witness to the AssumingOp.
Differential Revision: https://reviews.llvm.org/D80187
MLIR tests in "mlir/test/mlir-cpu-runner" fail on SystemZ (z14) because
of an incompatible data layout error. This patch fixes it by setting the host
CPU name in createTargetMachine().
Differential Revision: https://reviews.llvm.org/D80130
This is consistent to the other methods of the class, as well as
AffineExpr::replaceDimsAndSymbols.
Differential Revision: https://reviews.llvm.org/D80266
This should fix the error:
```
VectorToSCF.cpp:238:62: error: specialization of 'template<class
ConcreteOp> mlir::LogicalResult
{anonymous}::NDTransferOpHelper<ConcreteOp>::doReplace()' in different
namespace
```
Add a new type to the SPIR-V dialect for cooperative matrix and a new op for
cooperative matrix load. This is missing most instructions needed to support the
cooperative matrix extension, but it is a stop-gap patch to avoid creating a big
review.
Differential Revision: https://reviews.llvm.org/D80043
This patch introduces interfaces for read and write ops with affine
restrictions. I used `read`/`write` instead of `load`/`store` for the
interfaces so that they can also be implemented by dma ops.
For now, they are only implemented by affine.load, affine.store,
affine.vector_load and affine.vector_store.
For testing purposes, this patch also migrates affine loop fusion and
required analysis to use the new interfaces. No other changes are made
beyond that.
Co-authored-by: Alex Zinenko <zinenko@google.com>
Reviewed By: bondhugula, ftynse
Differential Revision: https://reviews.llvm.org/D79829
Enclose verifier code for AttrSizedOperandSegments and AttrSizedResultSegments
in a nested code block to avoid symbol collision.
Differential Revision: https://reviews.llvm.org/D80250
Using LLVM components in LINK_LIBS means that the mechanisms for
replacing component dependencies with libLLVM.so break. Try to catch
this incorrect usage up front, instead of waiting until later when we
get difficult to understand runtime errors from incorrectly linked
libraries.
Differential Revision: https://reviews.llvm.org/D80103
Summary:
This is a basic op needed for creating shapes from SSA values
representing the extents.
Differential Revision: https://reviews.llvm.org/D79833
Summary:
Previously, after applying the mask, a negative number would convert to a
positive number because the sign flag was forgotten. This patch adds two more
shift operations to do the sign extension. This assumes that we're using two's
complement.
This patch applies sign extension unconditionally when loading an unsupported integer width, and it relies on the pattern to do the casting because the signedness semantic is carried by the operator itself.
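A hedged sketch of the two-shift idea for an i8 value sitting in the low bits of a loaded i32 word (hypothetical value names; `%c24` is assumed to hold the constant 24):
```
// Shift the narrow value into the high bits, then shift right arithmetically
// so the sign bit is replicated (two's complement).
%shl = spv.ShiftLeftLogical %word, %c24 : i32, i32
%val = spv.ShiftRightArithmetic %shl, %c24 : i32, i32
```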
Differential Revision: https://reviews.llvm.org/D79753
Making these two converters more generic. FunctionAndBlockSignatureConverter now
moves only memref results (after type conversion) to the function argument and
keeps other legal function results unchanged. NonVoidToVoidReturnOpConverter is
renamed to NoBufferOperandsReturnOpConverter. It removes only the buffer
operands from the operands of the converted ReturnOp and inserts CopyOps to copy
each buffer to the target function argument.
Differential Revision: https://reviews.llvm.org/D79329
Thanks to a recent change that made `::build` functions take an instance of
`OpBuilder`, it is now possible to build operations within a region attached to
the operation about to be created. Exercise this on `scf::ForOp` by taking a
callback that populates the loop body while the loop is being created.
Additionally, provide helper functions to build perfect nests of `ForOp`s,
with support for iteration arguments. These functions provide the same
functionality as EDSC LoopNestBuilder with simpler implementation, without
relying on edsc::ScopedContext, and using `OpBuilder` in an unambiguous way.
Compatibility functions for EDSC are provided, but may be removed in the
future.
Differential Revision: https://reviews.llvm.org/D79688
Implemented tangent op from SPIR-V's GLSL extended instruction set.
Added a round-trip and serialization/deserialization tests for the op.
Differential Revision: https://reviews.llvm.org/D80152
Summary:
This patch adds support for flush operation in OpenMP dialect and translation of this construct to LLVM IR.
The OpenMP IRBuilder is used for this translation.
The patch includes code changes and testcase modifications.
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D79937
Summary: This revision adds support for assembly formats with optional attributes. It elides optional attributes that are part of the syntax from the attribute dictionary.
Reviewers: ftynse, Kayjukh
Reviewed By: ftynse, Kayjukh
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, jurahul, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80113
For now the promoted buffer is indexed using the `full view`. The full view might be
slightly bigger than the partial view (which is accounting for boundaries).
Unfortunately this does not compose easily with other transformations when multiple buffers
with shapes related to each other are involved.
Take `linalg.matmul A B C` (with A of size MxK, B of size KxN and C of size MxN) and suppose we are:
- Tiling over M by 100
- Promoting A only
This is producing a `linalg.matmul promoted_A B subview_C` where `promoted_A` is a promoted buffer
of `A` of size (100xK) and `subview_C` is a subview of size mxN where m could be smaller than 100 due
to boundaries thus leading to a possible incorrect behavior.
We propose to:
- Add a new parameter to the tiling promotion allowing to enable the use of the full tile buffer.
- By default all promoted buffers will be indexed by the partial view.
Note that this could be considered as a breaking change in comparison to the way the tiling promotion
was working.
Differential Revision: https://reviews.llvm.org/D79927
Summary:
Vector transfer ops semantic is extended to allow specifying a per-dimension `masked`
attribute. When the attribute is false on a particular dimension, lowering to LLVM emits
unmasked load and store operations.
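A hedged sketch of what this can look like (hypothetical operands; the attribute spelling is approximate):
```
// The dimension is declared unmasked, so lowering may emit a plain load.
%v = vector.transfer_read %A[%i, %j], %f0 {masked = [false]}
    : memref<?x?xf32>, vector<8xf32>
```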
Differential Revision: https://reviews.llvm.org/D80098
Summary:
This revision makes the use of vector transfer operations more idiomatic by
allowing the permutation_map to be omitted and inferred.
Differential Revision: https://reviews.llvm.org/D80092
Summary:
When NamedAttrList::get is called against a StringRef and the
entry is not present, the Identifier::operator== crashes.
Interestingly there seems to be no use of the NamedAttrList::get(StringRef) in
the codebase so far. A subsequent commit will introduce such a use.
Differential Revision: https://reviews.llvm.org/D80089
unittest/Tablegen generates an executable that depends on MLIRIR and
LLVMMLIRTableGen. Avoid specifying linkage dependence on LLVM
libraries here because then everyone has to depend on those libraries.
Differential Revision: https://reviews.llvm.org/D80093
Generally:
1) don't use target_link_libraries() and add_mlir_library() on the same target, use LINK_LIBS PUBLIC instead.
2) don't use LINK_LIBS to specify LLVM libraries. Use LINK_COMPONENTS instead
3) no need to link against LLVMSupport. We pull it in by default.
Differential Revision: https://reviews.llvm.org/D80076
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
The JitRunner library is logically very close to the execution engine,
and shares similar dependencies.
find -name "*.cpp" -exec sed -i "s/Support\/JitRunner/ExecutionEngine\/JitRunner/" "{}" \;
Differential Revision: https://reviews.llvm.org/D79899
Summary:
First, compact implementation of lowering to LLVM IR. A bit more
challenging than the constant mask due to the dynamic indices, of course.
I like to hear if there are more efficient ways of doing this in LLVM,
but this for now at least gives us a functional reference implementation.
Reviewers: nicolasvasilache, ftynse, bkramer, reidtatge, andydavis1, mehdi_amini
Reviewed By: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79954
DimOp folding is using bare accesses to underlying SubViewOp operands.
This is generally incorrect and is fixed in this revision.
Differential Revision: https://reviews.llvm.org/D80017
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout
properties.
Differential Revision: https://reviews.llvm.org/D78862
It is possible for optimizations to create SSA code which violates
the dominance property in unreachable blocks. Equivalently, dominance
computed using normal mechanisms is undefined in unreachable blocks.
See discussion here: https://llvm.discourse.group/t/rfc-allowing-dialects-to-relax-the-ssa-dominance-condition/833/51
This patch only checks the dominance condition inside blocks which are
reachable from the entry block of their region. Note that the
dominance conditions of regions contained in an unreachable block are
still checked.
Differential Revision: https://reviews.llvm.org/D79922
The following Conversions are affected: LoopToStandard -> SCFToStandard,
LoopsToGPU -> SCFToGPU, VectorToLoops -> VectorToSCF. Full file paths are
affected. Additionally, drop the 'Convert' prefix from filenames living under
lib/Conversion where applicable.
API names and CLI options for pass testing are also renamed when applicable. In
particular, LoopsToGPU contains several passes that apply to different kinds of
loops (`for` or `parallel`), for which the original names are preserved.
Differential Revision: https://reviews.llvm.org/D79940
have abi attributes.
To ensure there is no conflict, use the default ABI only when none of
the arguments have the spv.interface_var_abi attribute. This also
implies that if one of the arguments has a spv.interface_var_abi
attribute, all of them should have it as well.
Differential Revision: https://reviews.llvm.org/D77232
This revision starts decoupling the include the kitchen sink behavior of Linalg to LLVM lowering by inserting a -convert-linalg-to-std pass.
The lowering of linalg ops to function calls was previously lowering to memref descriptors by having both linalg -> std and std -> LLVM patterns in the same rewrite.
When separating this step, a new issue occurred: the layout is automatically type-erased by this process. This revision therefore introduces memref casts to perform these type erasures explicitly. To connect everything end-to-end, the LLVM lowering of MemRefCastOp is relaxed because it is artificially more restricted than the op semantics. The op semantics already guarantee that source and target MemRefTypes are cast-compatible. An invalid lowering test now becomes valid and is removed.
Differential Revision: https://reviews.llvm.org/D79468
This patch adds `affine.vector_load` and `affine.vector_store` ops to
the Affine dialect and lowers them to `vector.transfer_read` and
`vector.transfer_write`, respectively, in the Vector dialect.
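A minimal example of the new ops (hypothetical function and shapes):
```
func @affine_vector_ops(%m : memref<100x100xf32>, %i : index, %j : index) {
  // Read and write a vector of 8 contiguous f32 elements at an affine access.
  %v = affine.vector_load %m[%i, %j] : memref<100x100xf32>, vector<8xf32>
  affine.vector_store %v, %m[%i, %j] : memref<100x100xf32>, vector<8xf32>
  return
}
```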
Reviewed By: bondhugula, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D79658
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.
The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.
This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
Summary:
The assembly format for std.rank expects the operand type and not the
result type after the colon.
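For example (operand name and type are illustrative), the custom form now reads:
```
// The type after the colon is the operand's type; the result is always index.
%r = rank %t : tensor<*xf32>
```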
Differential Revision: https://reviews.llvm.org/D79857
Generally speaking, this is bad practice. It also causes the build to
break if there are editor temporary files.
Differential Revision: https://reviews.llvm.org/D79906
The Vulkan runtime wrapper will be compiled to a shared library
that is loaded by the JIT runner. Depending on LLVM libraries
means that LLVM symbols will be compiled into the shared library.
That can cause problems if we are using it with other shared
libraries depending on LLVM, notably Mesa, the open-source graphics
driver framework. The Vulkan API wrappers invoked by the JIT runner
links to the system libvulkan.so. If it's Mesa providing the
implementation, Mesa will normally try to load the system libLLVM.so
for its shader compilation. That causes issues because the JIT runner
already loaded the Vulkan runtime wrapper which has LLVM sybmols
compiled in. So system linker will instruct Mesa to use those symbols
instead.
Differential Revision: https://reviews.llvm.org/D79860
The CMake structure of the toy example is non-standard. Encourage people to
copy the standalone example instead.
Differential Revision: https://reviews.llvm.org/D79889
All ops of the SCF dialect now use the `scf.` prefix instead of `loop.`. This
is a part of dialect renaming.
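A sketch of the visible change (the `"foo.op"` body op is a hypothetical placeholder):
```
// Before the renaming:
loop.for %i = %lb to %ub step %c1 {
  "foo.op"(%i) : (index) -> ()
}
// After the renaming, only the prefix changes:
scf.for %i = %lb to %ub step %c1 {
  "foo.op"(%i) : (index) -> ()
}
```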
Differential Revision: https://reviews.llvm.org/D79844
The existing implementation of SubViewOp::getRanges relies on all
offsets/sizes/strides to be dynamic values and does not work in
combination with canonicalization. This revision adds a
SubViewOp::getOrCreateRanges to create the missing constants in the
canonicalized case.
This allows reactivating the fused pass with staged pattern
applications.
However, another issue surfaces: the SubViewOp verifier is now too
strict to allow folding. The existing folding pattern is turned into a
canonicalization pattern which rewrites memref_cast + subview into
subview + memref_cast.
The transform-patterns-matmul-to-vector can then be reactivated.
Differential Revision: https://reviews.llvm.org/D79759
Due to the extension of Liveness, Buffer Assignment can now work on nested regions. This PR provides a test case to show that the existing functionality of Buffer Assignment works properly.
Differential Revision: https://reviews.llvm.org/D79332
The current standard Alloca node is not annotated with the
MemEffect<Alloc> trait. This CL updates the Alloc and Alloca
memory-effect annotations using the latest Resource objects. Alloca
nodes will use a newly defined AutomaticAllocationScopeResource to
distinguish between Alloc and Alloca memory effects.
Differential Revision: https://reviews.llvm.org/D79620
This is only valid if the source tensor (result tensor) is statically
shaped with all unit-extents when the reshape is collapsing
(expanding) dimensions.
Differential Revision: https://reviews.llvm.org/D79764
Summary:
Makes this operation runnable on CPU by generating MLIR instructions
that are eventually folded into an LLVM IR constant for the mask.
Reviewers: nicolasvasilache, ftynse, reidtatge, bkramer, andydavis1
Reviewed By: nicolasvasilache, ftynse, andydavis1
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79815
The main objective of this revision is to change the way static information is represented, propagated and canonicalized in the SubViewOp.
In the current implementation the issue is that canonicalization may strictly lose information because static offsets are combined in irrecoverable ways into the result type, in order to fit the strided memref representation.
The core semantics of the op do not change but the parser and printer do: the op always requires `rank` offsets, sizes and strides. These quantities can now be either SSA values or static integer attributes.
The result type is automatically deduced from the static information and more powerful canonicalizations (as powerful as the representation with sentinel `?` values allows). Previously static information was inferred on a best-effort basis from looking at the source and destination type.
Relevant tests are rewritten to use the idiomatic `offset: x, strides : [...]`-form. Bugs are corrected along the way that were not trivially visible in flattened strided memref form.
Lowering to LLVM is updated, simplified and now supports all cases.
A mixed static-dynamic mode test that wouldn't previously lower is added.
It is an open question, and a longer discussion, whether a better result type representation would be a nicer alternative. For now, the subview op carries the required semantic.
Differential Revision: https://reviews.llvm.org/D79662
Conversion/ folders were originally intended to store patterns for
DialectA->DialectB conversions that depend on both dialects and do not
conceptually belong to either of the dialects. As such, DialectA->DialectA
conversion does not make sense under Conversion/ and should rather live with
the dialect it operates on.
Differential Revision: https://reviews.llvm.org/D79569
CMake does not truly support dependencies on automatically generated files
which are not in the same directory as the targets which depend on them.
It works with ninja, but doesn't work with make.
This patch adds an explicit dependence so that all dialects are built
before the analysis libraries.
Differential Revision: https://reviews.llvm.org/D79805
This normalizes the name of the tablegen file to match the name of the generated
files (SideEffectInterfaces.h.inc) and the other Interface tablegen files,
which all end in Interface(s).td.
Differential Revision: https://reviews.llvm.org/D79517
This reverts commit 80d133b24f.
Per Stephan Herhut: The canonicalizer pattern that was added creates
forms of the subview op that cannot be lowered.
This is shown by failing Tensorflow XLA tests such as:
tensorflow/compiler/xla/service/mlir_gpu/tests:abs.hlo.test
Will provide more details offline; they rely on logs from private CI.
Summary:
The scalar zero + splat yields more intermediate code than the direct
dense zero constant, and ultimately is lowered to exactly the same
LLVM IR operations, so there is no point in generating the intermediate code.
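Schematically (types are illustrative, not the actual mask lowering), the change replaces the splat-of-zero pattern with a single dense constant:
```
// Previously generated: a scalar zero followed by a splat.
%zero = constant 0.000000e+00 : f32
%v0 = splat %zero : vector<8xf32>
// Now generated: a single dense zero constant.
%v1 = constant dense<0.000000e+00> : vector<8xf32>
```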
Reviewers: nicolasvasilache, andydavis1, reidtatge
Reviewed By: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79758
Summary:
- Fix comments in several places
- Eliminate extra ' in AST dump and adjust tests accordingly
Differential Revision: https://reviews.llvm.org/D78399
Summary:
The main objective of this revision is to change the way static information is represented, propagated and canonicalized in the SubViewOp.
In the current implementation the issue is that canonicalization may strictly lose information because static offsets are combined in irrecoverable ways into the result type, in order to fit the strided memref representation.
The core semantics of the op do not change but the parser and printer do: the op always requires `rank` offsets, sizes and strides. These quantities can now be either SSA values or static integer attributes.
The result type is automatically deduced from the static information and more powerful canonicalizations (as powerful as the representation with sentinel `?` values allows). Previously static information was inferred on a best-effort basis from looking at the source and destination type.
Relevant tests are rewritten to use the idiomatic `offset: x, strides : [...]`-form. Bugs are corrected along the way that were not trivially visible in flattened strided memref form.
It is an open question, and a longer discussion, whether a better result type representation would be a nicer alternative. For now, the subview op carries the required semantic.
Reviewers: ftynse, mravishankar, antiagainst, rriddle!, andydavis1, timshen, asaadaldien, stellaraccident
Reviewed By: mravishankar
Subscribers: aartbik, bondhugula, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, bader, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79662
Summary:
This revision introduces a helper function to allow applying rewrite patterns, interleaved with more global transformations, in a staged fashion:
1. the first stage consists of an OwningRewritePatternList. The RewritePattern in this list are applied once and in order.
2. the second stage consists of a single OwningRewritePattern that is applied greedily until convergence.
3. the third stage consists of applying a lambda, generally used for non-local transformation effects.
This allows creating custom fused transformations where patterns can be ordered and applied at a finer granularity than a sequence of traditional compiler passes.
A test that exercises these behaviors is added.
Differential Revision: https://reviews.llvm.org/D79518
Summary:
This makes a common pattern of
`dyn_cast_or_null<OpTy>(v.getDefiningOp())` more concise.
Differential Revision: https://reviews.llvm.org/D79681
This [discussion](https://llvm.discourse.group/t/viewop-isnt-expressive-enough/991/2) raised some concerns with ViewOp.
In particular, the handling of offsets is incorrect and does not match the op description.
Note that with an elemental type change, offsets cannot be part of the type in general because sizeof(srcType) != sizeof(dstType).
However, offset is a poorly chosen term for this purpose and is renamed to byte_shift.
Additionally, for all intended purposes, trying to support non-identity layouts for this op does not bring expressive power but rather increases code complexity.
This revision simplifies the existing semantics and implementation.
This simplification effort is voluntarily restrictive and acts as a stepping stone towards supporting richer semantics: treat the non-common cases as YAGNI for now and reevaluate based on concrete use cases once a round of simplification has occurred.
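A rough sketch of the simplified form (element types, sizes, and operand names are hypothetical): the first bracketed operand is the byte shift into the source memref, and the second holds any dynamic sizes of the result:
```
// Reinterpret a byte buffer as a dynamically sized f32 memref, starting
// %shift bytes into the source.
%v = view %buffer[%shift][%size] : memref<2048xi8> to memref<?x4xf32>
```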
Differential revision: https://reviews.llvm.org/D79541
Summary: This revision introduces LinalgPromotionOptions to more easily control the application of promotion patterns. It also simplifies the different entry points into Promotion in preparation for some behavior change in subsequent revisions.
Differential Revision: https://reviews.llvm.org/D79489
This dialect contains various structured control flow operations, not only
loops; reflect this in the name. Drop the Ops suffix for consistency with other
dialects.
Note that this only moves the files and changes the C++ namespace from 'loop'
to 'scf'. The visible IR prefix remains the same and will be updated
separately. The conversions will also be updated separately.
Differential Revision: https://reviews.llvm.org/D79578
Summary:
Cast from a value interpreted as floating-point to the corresponding signed
integer value. Similar to an element-wise `static_cast` in C++, it performs an
element-wise conversion operation.
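For example (types are illustrative):
```
// Element-wise cast from floating point to signed integer.
%i = fptosi %f : f32 to i32
```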
Differential Revision: https://reviews.llvm.org/D79373
Functions checking whether an SSA value is a valid dimension or symbol for
affine operations can be called on values defined in a detached region (a
region that is not yet attached to an operation), for example, during parsing
or operation construction. These functions will unconditionally attempt to
dereference a pointer to the parent operation of a region, which may be null
(as fixed by the previous commit, uninitialized before that). Since one cannot
know to which operation a region will be attached, conservatively treat such a
region as not being a valid affine scope and act accordingly, instead of
crashing.
Region has a default constructor that is called when a region is constructed
while an operation is being created, and therefore before the region can be
attached to this operation. The `container` field is uninitialized, which makes
it impossible to check programmatically if a Region is attached to an operation
or not, leading to sly memory errors when this field is read. Initialize it to
nullptr by default and thus make sure one can check if a region is attached to
an operation or not.
Complex addition and subtraction are the first two binary operations on complex
numbers.
Remaining operations will follow the same pattern.
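A rough sketch of the intended usage (the exact assembly syntax is approximate):
```
// Add and subtract two complex<f32> values.
%sum  = addcf %lhs, %rhs : complex<f32>
%diff = subcf %lhs, %rhs : complex<f32>
```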
Differential Revision: https://reviews.llvm.org/D79479
The SideEffect interface definitions currently use string expressions to
reference custom resource objects. This CL introduces Resource objects in
tablegen definitions to simplify linking of resource reference to resource
objects.
Differential Revision: https://reviews.llvm.org/D78917
This is a wrapper around a vector of NamedAttributes that keeps track of whether it is sorted and makes a minimal effort to remain sorted (doing more, e.g., appending attributes in sorted order, could be done in a follow-up). If a DictionaryAttr is queried, it caches the returned DictionaryAttr along with the sortedness state.
Change MutableDictionaryAttr to always return a non-null Attribute even when empty (reserve null cases for errors). To this end, change the getter to take a context as input so that the empty DictionaryAttr can be queried. Also create one instance of the empty dictionary attribute that can be reused without needing to lock the context, etc.
Update infer type op interface to use DictionaryAttr and use NamedAttrList to avoid incurring multiple conversion costs.
Fix bug in sorting helper function.
Differential Revision: https://reviews.llvm.org/D79463
vulkan-runtime-wrappers does not need MLIRSPIRVSerialization,
which is used by the ConvertGpuLaunchFuncToVulkanLaunchFunc pass
under the hood.
Differential Revision: https://reviews.llvm.org/D79577
SPIR-V ops can mix operands and attributes in the definition. These
operands and attributes are serialized in the exact order of the definition
to match SPIR-V binary format requirements. It can cause excessive
generated code bloat because we are emitting code to handle each
operand/attribute separately. So here we probe first to check whether all
the operands are ahead of attributes. Then we can serialize all operands
together.
This removes ~1000 lines of code from the generated inc file.
Differential Revision: https://reviews.llvm.org/D79446
These template functions are used in the serializer, where we can
actually directly query the opcode from the op's definition and
use that in the auto-generated serialization logic.
This removes a set of templates accounting for 319 lines from
the auto-generated inc file.
Differential Revision: https://reviews.llvm.org/D79444
Originally, these operations were folded only if all expressions in their
affine maps could be folded to a constant expression that can be then subject
to numeric min/max computation. This introduces a more advanced version that
partially folds the affine map by lifting individual constant expressions in it
even if some of the expressions remain variable. The folding can update the
operation in place to use a simpler map. Note that this is not as powerful as
canonicalization, in particular this does not remove dimensions or symbols that
became useless. This allows for better composition of Linalg tiling and
promotion transformation, where the latter can handle some canonical forms of
affine.min that the folding can now produce.
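A sketch of the partial folding (the map and operands are hypothetical):
```
// Before: the second result expression is a purely constant expression.
%m = affine.min affine_map<(d0)[s0] -> (d0 + s0, 512 + 128)>(%i)[%n]
// After partial folding, the op is updated in place to use a simpler map;
// the variable expression is left untouched.
%m = affine.min affine_map<(d0)[s0] -> (d0 + s0, 640)>(%i)[%n]
```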
Differential Revision: https://reviews.llvm.org/D79502
In the Vector to LLVM conversion, the `replaceTransferOp` function calls
into a type converter that may fail and suppresses the status. Change
the function to return the failure status instead. Since it is called
from a pattern, the failure can be readily propagated to the rest of the
infrastructure.
The list of destination load ops while evaluating producer-consumer
fusion wasn't being maintained as a set, and as such, duplicate load ops
were being added to it. Although this is harmless correctness-wise, it's
a killer efficiency-wise and it prevents interesting/useful fusions
(including, e.g., reshapes into a matmul). The reason the latter
fusions would be missed is that a slice union would be unnecessarily
needed due to the duplicate load ops on a memref added to the 'dst
loads' list. Since slice union is unimplemented for the local var case,
a single destination load op that leads to local vars (like a floordiv /
mod producing fusion), a common case, would not get fused due to an
unnecessary union being tried with itself. (The union would actually be
the same thing but we would bail out.)
Besides the above, this would also significantly speed up fusion as all
the unnecessary slice computations / unions, checks, etc. due to the
duplicates go away.
Differential Revision: https://reviews.llvm.org/D79547
When the folding is performed in place, the `::fold` function does not populate
its `results` argument to indicate that. (In the folding hook for single-result
operations, the result of the original operation is expected to be returned,
but it is then ignored by the wrapper.) `OperationFolder::create` would
erroneously rely on the _operation_ having zero results instead of on the
_folding_ producing zero new results to populate the list of results with those
of the original operation. This would lead to a crash for single-result ops
with in-place folds where the first result is accessed unconditionally because
the list of results was not properly populated. Use the list of values produced
by the folding instead.
Differential Revision: https://reviews.llvm.org/D79497
The types of forward references are checked to match their other uses, but
they are not checked to match the definition. For example, the function
func @forward_reference_type_check() -> (i8) {
  br ^bb2
^bb1:
  return %1 : i8
^bb2:
  %1 = "bar"() : () -> (f32)
  br ^bb1
}
would be parsed, and the use site of '%1' would be silently changed to
'f32'.
This commit adds a test for this case, and a check during parsing for
the types to match.
Patch by Matthew Parkinson <mattpark@microsoft.com>
Closes D79317.
Summary:
This revision adds a conservative canonicalization pattern for MemRefCastOp that are typically inserted during ViewOp and SubViewOp canonicalization.
Ideally such canonicalizations would propagate the type to consumers, but this is not a local behavior. As a consequence, MemRefCastOps are introduced to keep type compatibility but need to be cleaned up later, in cases where more dynamic behavior than necessary is introduced.
Differential Revision: https://reviews.llvm.org/D79438
Essentially takes the lld/Common/Threads.h wrappers and moves them to
the llvm/Support/Parallel.h algorithm header.
The changes are:
- Remove policy parameter, since all clients use `par`.
- Rename the methods to `parallelSort` etc to match LLVM style, since
they are no longer C++17 pstl compatible.
- Move algorithms from llvm::parallel:: to llvm::, since they have
"parallel" in the name and are no longer overloads of the regular
algorithms.
- Add range overloads
- Use the sequential algorithm directly when 1 thread is requested
(skips task grouping)
- Fix the index type of parallelForEachN to size_t. Nobody in LLVM was
using any other parameter, and it made overload resolution hard for
for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.
Remove Threads.h and update LLD for that.
This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.
Reviewed By: MaskRay, aganea
Differential Revision: https://reviews.llvm.org/D79390
This revision allows for creating DenseElementsAttrs and accessing elements using std::complex<APInt>/std::complex<APFloat>. This allows for opaquely accessing and transforming complex values. This is used by the printer/parser to provide pretty printing for complex values. The form for complex values matches that of std::complex, i.e.:
```
// `(` element `,` element `)`
dense<(10,10)> : tensor<complex<i64>>
```
Differential Revision: https://reviews.llvm.org/D79296
This revision adds support for storing ComplexType elements inside of a DenseElementsAttr. We store complex objects as an array of two elements, matching the definition of std::complex. There is no current attribute storage for ComplexType, but DenseElementsAttr provides API for access/creation using std::complex<>. Given that the internal implementation of DenseElementsAttr is already fairly opaque, the only real complexity here is in the printing/parsing. This revision keeps it simple for now and always uses hex when printing complex elements. A followup will add prettier syntax for this.
Differential Revision: https://reviews.llvm.org/D79281
We see intermittent build errors on the windows buildbot because
mlir-opt is including Linalg headers which haven't been built yet.
This dependence should be resolved by declaring a PUBLIC dependence
on the Linalg library when building MLIROptMain.
This addresses a compilation failure on GCC 5:
error: #error This file requires compiler and library support for the
ISO C++ 2011 standard. This support must be enabled with the -std=c++11
or -std=gnu++11 compiler options.
#error This file requires compiler and library support
Differential Revision: https://reviews.llvm.org/D79439
DMA operation classes in the Standard dialect (`DmaStartOp` and `DmaWaitOp`)
provide helper functions that make numerous assumptions about the number and
order of operands, and about their types. However, these assumptions were not
checked in the verifier, leading to assertion failures or crashes when helper
functions were used on ill-formed ops. Some of the assumptions were checked in
the custom parser (and thus could not check assumption violations in ops
constructed programmatically, e.g., during rewrites) and others were not
checked at all. Introduce the verifiers for all these assumptions and drop
unnecessary checks in the parser that are now covered by the verifier.
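For reference, a roughly well-formed pair of DMA ops (shapes, memory spaces, and operand names are illustrative); the verifier now checks that operand counts, order, and types follow this structure:
```
// Start a DMA of %num elements from %src (space 0) to %dst (space 1),
// using %tag[%c0] for completion tracking.
dma_start %src[%i], %dst[%j], %num, %tag[%c0]
    : memref<256xf32>, memref<256xf32, 1>, memref<1xi32>
// Block until the DMA identified by the tag has completed.
dma_wait %tag[%c0], %num : memref<1xi32>
```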
Addresses PR45560.
Differential Revision: https://reviews.llvm.org/D79408
Summary:
Adds the loop unroll transformation for loop::ForOp.
Adds support for promoting the body of single-iteration loop::ForOps into its containing block.
Adds check tests for loop::ForOps with dynamic and static lower/upper bounds and step.
Care was taken to share code (where possible) with the AffineForOp unroll transformation to ease maintenance and potential future transition to a LoopLike construct on which loop transformations for different loop types can implemented.
Reviewers: ftynse, nicolasvasilache
Reviewed By: ftynse
Subscribers: bondhugula, mgorny, zzheng, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79184
Adding this pattern reduces code duplication. There is no need to have a
custom implementation for lowering to llvm.cmpxchg.
Differential Revision: https://reviews.llvm.org/D78753
Portions of MLIR which depend on LLVMIR generally need to depend on
intrinsics_gen, to ensure that tablegen'd header files from LLVM are built
first. Without this, we get errors, typically about llvm/IR/Attributes.inc
not being found.
Note that previously the Linalg Dialect depended on intrinsics_gen, but it
doesn't need to, since it doesn't use LLVMIR.
Differential Revision: https://reviews.llvm.org/D79389
This revision adds support for merging identical blocks, or those with the same operations that branch to the same successors. Operands that mismatch between the different blocks are replaced with new block arguments added to the merged block.
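A sketch of the transformation (`"foo.use"` is a hypothetical op): two blocks that differ only in one operand are merged, and the mismatched operand becomes a block argument:
```
// Before: ^bb1 and ^bb2 are identical except for the operand of "foo.use".
  cond_br %cond, ^bb1, ^bb2
^bb1:
  "foo.use"(%a) : (i32) -> ()
  br ^bb3
^bb2:
  "foo.use"(%b) : (i32) -> ()
  br ^bb3

// After: the blocks are merged and each predecessor passes its operand
// as a block argument.
  cond_br %cond, ^bb1(%a : i32), ^bb1(%b : i32)
^bb1(%arg : i32):
  "foo.use"(%arg) : (i32) -> ()
  br ^bb3
```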
Differential Revision: https://reviews.llvm.org/D79134
This change removes tabs from the comments printed by the asmprinter after basic
block declarations in favor of two spaces. This is currently the only place in
the printed IR that uses tabs.
Differential Revision: https://reviews.llvm.org/D79377
Summary:
In the particular case of an insertion in a block without a terminator, the BlockBuilder insertion point should be block->end().
Adding a unit test to exercise this.
Differential Revision: https://reviews.llvm.org/D79363
This allows for walking the operations nested directly within a region, without traversing nested regions.
Differential Revision: https://reviews.llvm.org/D79056
Summary:
As with D78974, this patch implements the emulation for the store op. The emulation is
done with atomic operations. E.g., if the stored value is i8, rewrite the
StoreOp to:
1) load a 32-bit integer
2) clear 8 bits in the loaded value
3) store the 32-bit value back
4) load a 32-bit integer
5) modify 8 bits in the loaded value
6) store the 32-bit value back
Steps 1 to 3 are done by AtomicAnd as one atomic step, and steps 4
to 6 are done by AtomicOr as another atomic step.
Differential Revision: https://reviews.llvm.org/D79272
This std::copy_n copies 8-byte data (the APInt raw data) byte by byte from the
beginning of the char array. This is no problem on little-endian targets, but
the data is not copied correctly on big-endian targets because the data should
be copied from the end of the char array.
- Example of 4 byte data (such as float32)
Little endian (First 4 bytes):
Address | 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08
Data | 0xcd 0xcc 0x8c 0x3f 0x00 0x00 0x00 0x00
Big endian (Last 4 bytes):
Address | 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08
Data | 0x00 0x00 0x00 0x00 0x3f 0x8c 0xcc 0xcd
In general, when copying N-byte data (N < 8) on big endian, the start
address should be incremented by (8 - N) bytes.
The original code has no problem when it copies 8-byte data (such as
double) even on big endian.
Differential Revision: https://reviews.llvm.org/D78076
- Exports MLIR targets to be used out-of-tree.
- mimics `add_clang_library` and `add_flang_library`.
- Fixes libMLIR.so
After https://reviews.llvm.org/D77515, libMLIR.so no longer contained
any object files. We originally had a kludge there that made it work with
the static initializers, and when switching away from that to the way the
clang shlib does it, I noticed that MLIR doesn't create an `obj.{name}` target
and doesn't export its targets to `lib/cmake/mlir`.
This is due to MLIR using `add_llvm_library` under the hood, which adds
the target to `llvmexports`.
Differential Revision: https://reviews.llvm.org/D78773
[MLIR] Fix libMLIR.so and LLVM_LINK_LLVM_DYLIB
Primarily, this patch moves all mlir references to LLVM libraries into
either LLVM_LINK_COMPONENTS or LINK_COMPONENTS. This enables magic in
the llvm cmake files to automatically replace reference to LLVM components
with references to libLLVM.so when necessary. Among other things, this
completes fixing libMLIR.so, which has been broken for some configurations
since D77515.
Unlike previously, the pattern is now that mlir libraries should almost
always use add_mlir_library. Previously, some libraries still used
add_llvm_library. However, this confuses the export of targets for use
out of tree because libraries specified with add_llvm_library are exported
by LLVM. Instead, users which don't need/can't be linked into libMLIR.so
can specify EXCLUDE_FROM_LIBMLIR.
A common error mode is linking with LLVM libraries outside of LINK_COMPONENTS.
This almost always results in symbol confusion or multiply defined options
in LLVM when the same object file is included as a static library and
as part of libLLVM.so. To catch these errors more directly, there's now
mlir_check_all_link_libraries.
To simplify usage of add_mlir_library, we assume that all mlir
libraries depend on LLVMSupport, so it's not necessary to separately specify
it.
tested with:
BUILD_SHARED_LIBS=on,
BUILD_SHARED_LIBS=off + LLVM_BUILD_LLVM_DYLIB,
BUILD_SHARED_LIBS=off + LLVM_BUILD_LLVM_DYLIB + LLVM_LINK_LLVM_DYLIB.
By: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
Differential Revision: https://reviews.llvm.org/D79067
[MLIR] Move from using target_link_libraries to LINK_LIBS
This allows us to correctly generate dependencies for derived targets,
such as targets which are created for object libraries.
By: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
Differential Revision: https://reviews.llvm.org/D79243
Three commits have been squashed to avoid intermediate build breakage.
Linalg transformations are currently exposed as DRRs.
Unfortunately RewriterGen does not play well with the line of work on named linalg ops which require variadic operands and results.
Additionally, DRR is arguably not the right abstraction to expose compositions of such patterns that don't rely on SSA use-def semantics.
This revision abandons DRRs and exposes manually written C++ patterns.
Refactorings and cleanups are performed to uniformize APIs.
This refactoring will allow replacing the currently manually specified Linalg named ops.
A collateral victim of this refactoring is the `tileAndFuse` DRR, and its associated test, which will be revived at a later time.
Lastly, the following 2 tests do not add value and are altered:
- a dot_perm tile + interchange test does not test anything new and is removed
- a dot tile + lower to loops does not need 2-D tiling and is trimmed.
Add `CreateComplexOp`, `ReOp`, and `ImOp` to the standard dialect.
This is the first step to support complex numbers.
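A rough sketch of how these ops compose (assembly syntax approximate):
```
// Build a complex value from two f32 parts, then extract them again.
%c  = create_complex %re, %im : complex<f32>
%r  = re %c : complex<f32>
%i2 = im %c : complex<f32>
```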
Differential Revision: https://reviews.llvm.org/D79159
The current BufferPlacement implementation tries to find Alloc and Dealloc
operations in order to move them. However, this is a tight coupling to
standard-dialect ops which has been removed in this CL.
Differential Revision: https://reviews.llvm.org/D78993
This is useful for several reasons:
* In some situations the user can guarantee that thread-safety isn't necessary and doesn't want to pay the cost of synchronization, e.g., when parsing a very large module.
* For things like logging, threading is not desirable as the output is not guaranteed to be in a stable order.
This flag also subsumes the pass manager flag for multi-threading.
Differential Revision: https://reviews.llvm.org/D79266
In cmake, dependencies on generated files require some sophistication in the build system. At build time, files are parsed to determine which headers they depend on and these dependencies are injected into the build system. This works well with ninja, but has some constraints with the makefile generator. According to the cmake documentation, this only works reliably within the same directory.
This patch expands the usage of mlir-headers to include all generated headers and adds an mlir-generic-headers target which triggers generation of dialect-independent headers. These targets are used to express dependencies on generated headers. This is mostly handled in AddMLIR.cmake and only a few CMakeLists.txt files need to change.
Differential Revision: https://reviews.llvm.org/D79242
These libraries are distinct from other things in Analysis in that they
operate only on core IR concepts. This also simplifies dependencies
so that Dialect -> Analysis -> Parser -> IR. Previously, the parser depended
on portions of the Analysis directory as well, which sometimes
caused issues with the way the cmake makefile generator discovers
dependencies on generated files during compilation.
Differential Revision: https://reviews.llvm.org/D79240