The alignment attribute in the 'alloca' op treats the '0' value as 'unset'.
When parsing the custom form of the 'alloca' op, ignore the alignment attribute
if its value is '0' instead of actually creating it and producing a form that is
textually slightly different yet semantically equivalent in the output.
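A minimal before/after sketch (assuming the standard-dialect `alloca` custom form; the memref shape is illustrative):
```mlir
// Both forms now round-trip to the same text: alignment '0' is treated as unset.
%0 = alloca() {alignment = 0 : i64} : memref<8x64xf32>
%1 = alloca() : memref<8x64xf32>
```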
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D90179
Based on a discourse discussion, fix the doc string and remove examples with
incorrect semantics. Also fix the insert_map semantics by adding the missing operand
for the vector we are inserting into.
Differential Revision: https://reviews.llvm.org/D89563
This revision allows the fusion of the producer of input tensors in the consumer under a tiling transformation (which produces subtensors).
Many pieces are still missing (e.g. support init_tensors, better refactor LinalgStructuredOp interface support, try to merge implementations and reuse code) but this still allows getting started.
The greedy pass itself is just for testing purposes and will be extracted in a separate test pass.
Differential revision: https://reviews.llvm.org/D89491
This patch introduces a SPIR-V runner. The aim is to run a gpu
kernel on a CPU via GPU -> SPIRV -> LLVM conversions. This is a first
prototype, so more features will be added in due time.
- Overview
The runner follows a similar flow to the other in-tree runners. However,
having converted the kernel to SPIR-V, we encode the bind attributes of
global variables that represent kernel arguments. Then the SPIR-V module is
converted to LLVM. On the host side, we emulate passing the data to the device
by creating, in the main module, globals with the same symbolic names as in the
kernel module. These global variables are later linked with the ones from the
nested module. We copy data from the kernel arguments to the globals, call the
kernel function from the nested module and then copy the data back.
- Current state
At the moment, the runner is capable of running 2 modules, one nested in the
other. The kernel module must contain exactly one kernel function. Also,
the runner supports rank-1 integer memref types as arguments (to be scaled).
- Enhancement of JitRunner and ExecutionEngine
To translate nested modules to LLVM IR, JitRunner and ExecutionEngine were
altered to take an optional (defaulting to `nullptr`) function reference that
is a custom LLVM IR module builder. This allows customizing LLVM IR module
creation from MLIR modules.
Reviewed By: ftynse, mravishankar
Differential Revision: https://reviews.llvm.org/D86108
This patch introduces a pass for running
`mlir-spirv-cpu-runner` - LowerHostCodeToLLVMPass.
This pass emulates a `gpu.launch_func` call in the LLVM dialect and lowers
the host module code to LLVM. It removes the `gpu.module`, creates a
sequence of global variables that are later linked to the variables
in the kernel module, as well as a series of copies to/from
them to emulate the memory transfer to/from the host and the
device sides. It also converts the remaining Standard dialect into the
LLVM dialect, emitting C wrappers.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D86112
This dependency already existed indirectly, but is now more direct
since the registration relies on an inline function. This fixes the
link of the tools with BFD.
The current pattern for vector unrolling takes the native shape to
unroll to at pattern instantiation time, but the native shape might
differ based on the types of the operands. Introduce a
UnrollVectorOptions struct which allows using a function that will
return the native shape based on the operation. Move other options of
unrolling like `filterConstraints` into this struct.
Differential Revision: https://reviews.llvm.org/D89744
Add a folder for the case where the ExtractStridedSliceOp source comes from a chain
of InsertStridedSliceOps. Also add a folder for the trivial case where the
ExtractStridedSliceOp is a no-op.
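A rough sketch of the trivial no-op case (offsets/sizes/strides values are illustrative):
```mlir
// Extracting the whole vector is a no-op and folds to the source %src.
%0 = vector.extract_strided_slice %src
    {offsets = [0], sizes = [4], strides = [1]}
    : vector<4xf32> to vector<4xf32>
```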
Differential Revision: https://reviews.llvm.org/D89850
This patch provides C API for MLIR affine expression.
- Implement C API for methods of AffineExpr class.
- Implement C API for methods of derived classes (AffineBinaryOpExpr, AffineDimExpr, AffineSymbolExpr, and AffineConstantExpr).
Differential Revision: https://reviews.llvm.org/D89856
Added an optimization pass to convert heap-based allocs to stack-based allocas in
buffer placement. Added the corresponding test file.
Differential Revision: https://reviews.llvm.org/D89688
Before this change, we would run `maxIterations` if the first iteration changed the op.
After this change, we exit the loop as soon as an iteration hasn't changed the op.
Assuming that we have reached a fixed point when an iteration doesn't change the op, this doesn't affect correctness.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89981
This reverts commit 4986d5eaff with
proper patches to CMakeLists.txt:
- Add MLIRAsync as a dependency to MLIRAsyncToLLVM
- Add Coroutines as a dependency to MLIRExecutionEngine
Lower from Async dialect to LLVM by converting async regions attached to `async.execute` operations into LLVM coroutines (https://llvm.org/docs/Coroutines.html):
1. Outline all async regions to functions
2. Add LLVM coro intrinsics to mark coroutine begin/end
3. Use MLIR conversion framework to convert all remaining async types and ops to LLVM + Async runtime function calls
All `async.await` operations inside async regions are converted to coroutine suspension points. Await operations outside of a coroutine are converted to blocking wait operations.
Implement a simple runtime to support concurrent execution of coroutines.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D89292
Forward missing attributes when creating the new transfer op; otherwise the
builder would use default values.
Differential Revision: https://reviews.llvm.org/D89907
* Adds a new MlirOpPrintingFlags type and supporting accessors.
* Adds a new mlirOperationPrintWithFlags function.
* Adds a full featured python Operation.print method with all options and the ability to print directly to files/stdout in text or binary.
* Adds an Operation.get_asm which delegates to print and returns a str or bytes.
* Reworks Operation.__str__ to be based on get_asm.
Differential Revision: https://reviews.llvm.org/D89848
A "structural" type conversion is one where the underlying ops are
completely agnostic to the actual types involved and simply need to update
their types. An example of this is shape.assuming -- the shape.assuming op
and the corresponding shape.assuming_yield op need to update their types
according to the TypeConverter, but otherwise don't care what type
conversions are happening.
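For example (a sketch with illustrative values and types), only the result and yield types change:
```mlir
// Before the conversion: tensor types.
%0 = shape.assuming %witness -> (tensor<2xf32>) {
  shape.assuming_yield %t : tensor<2xf32>
}
// After a tensor-to-memref structural conversion: same structure, memref types.
%1 = shape.assuming %witness -> (memref<2xf32>) {
  shape.assuming_yield %m : memref<2xf32>
}
```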
Also, the previous conversion code would not correctly materialize
conversions for the shape.assuming_yield op. This should have caused a
verification failure, but shape.assuming's verifier wasn't calling
RegionBranchOpInterface::verifyTypes (which for reasons can't be called
automatically as part of the trait verification, and requires being
called manually). This patch also adds that verification.
Differential Revision: https://reviews.llvm.org/D89833
A "structural" type conversion is one where the underlying ops are
completely agnostic to the actual types involved and simply need to update
their types. An example of this is scf.if -- the scf.if op and the
corresponding scf.yield ops need to update their types according to the
TypeConverter, but otherwise don't care what type conversions are happening.
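As a sketch (illustrative values and types), only the result and yield types of `scf.if` are rewritten:
```mlir
// Before: scf.if yields tensors.
%0 = scf.if %cond -> (tensor<?xf32>) {
  scf.yield %a : tensor<?xf32>
} else {
  scf.yield %b : tensor<?xf32>
}
// After a tensor-to-memref structural conversion: same control flow, memref types.
%1 = scf.if %cond -> (memref<?xf32>) {
  scf.yield %am : memref<?xf32>
} else {
  scf.yield %bm : memref<?xf32>
}
```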
To test the structural type conversions, it is convenient to define a
bufferize pass for a dialect, which exercises them nicely.
Differential Revision: https://reviews.llvm.org/D89757
The documentation claims that an op with the trait FunctionLike has a
single region containing the blocks that correspond to the body of
the function. It then goes on to say that the absence of a region
corresponds to an external function when, in fact, this is represented
by a single empty region. This patch changes the wording in the
documentation to match the implementation.
Signed-off-by: Frej Drejhammar <frej.drejhammar@gmail.com>
Co-authored-by: Frej Drejhammar <frej.drejhammar@gmail.com>
Co-authored-by: Klas Segeljakt <klasseg@kth.se>
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D89868
Historically, custom builder specification in OpBuilder has been accepting the
formal parameter list for the builder method as a raw string containing C++.
While this worked well to connect the signature and the body, this became
problematic when ODS needs to manipulate the parameter list, e.g. to inject
OpBuilder or to trim default values when generating the definition. This has
also become inconsistent with other method declarations, in particular in
interface definitions.
Introduce the possibility to define OpBuilder formal parameters using a
TableGen dag similarly to other methods. Additionally, introduce a mechanism to
declare parameters with default values using an additional class. This
mechanism can be reused in other methods. The string-based builder signature
declaration is deprecated and will be removed after a transition period.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D89470
Docstrings for the `__str__` method in many classes were recycling the constant
string defined for `Type`, without those classes being types themselves. Use proper
docstrings instead. Since they are succinct, use string literals instead of
top-level constants to avoid further mistakes.
Differential Revision: https://reviews.llvm.org/D89780
The pybind class typedef for concrete attribute classes was erroneously
deriving all of them from PyAttribute instead of the provided base class. This
has not been triggering any error because only one level of the hierarchy is
currently exposed.
Differential Revision: https://reviews.llvm.org/D89779
Values are ubiquitous in the IR, in particular block argument and operation
results are Values. Define Python classes for BlockArgument, OpResult and their
common ancestor Value. Define pseudo-container classes for lists of block
arguments and operation results, and use these containers to access the
corresponding values in blocks and operations.
Differential Revision: https://reviews.llvm.org/D89778
The CfgTraits abstraction simplifies writing algorithms that are
generic over the type of CFG, and enables writing such algorithms
as regular non-template code that operates on opaque references
to CFG blocks and values.
Implementations of CfgTraits provide operations on the concrete
CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`.
CfgInterface is an abstract base class which provides operations
on opaque types CfgBlockRef and CfgValueRef. Those opaque types
encapsulate a `void *`, but the meaning depends on the concrete
CFG type. For example, MachineCfgTraits -- for use with MachineIR
in SSA form -- encodes a Register inside CfgValueRef. Converting
between concrete references and opaque/generic ones is done by
CfgTraits::{fromGeneric,toGeneric}. Convenience methods
CfgTraits::{un}wrap{Iterator,Range} are available as well.
Writing algorithms in terms of CfgInterface adds some overhead
(virtual method calls, plus in some cases it removes the
opportunity to inline iterators), but can be much more convenient
since generic algorithms can be written as non-templates.
This patch adds implementations of CfgTraits for all CFGs on
which dominator trees are calculated, so that the dominator
tree can be ported to this machinery. Only IrCfgTraits (LLVM IR)
and MachineCfgTraits (Machine IR in SSA form) are complete, the
other implementations are limited to the absolute minimum
required to make the upcoming dominator tree changes work.
v5:
- fix MachineCfgTraits::blockdef_iterator and allow it to iterate over
the instructions in a bundle
- use MachineBasicBlock::printName
v6:
- implement predecessors/successors for all CfgTraits implementations
- fix error in unwrapRange
- rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
that is consistent with {wrap,unwrap}{Iterator,Range}
- use getVRegDef instead of getUniqueVRegDef
v7:
- std::forward fix in wrapping_iterator
- fix typos
v8:
- cleanup operators on CfgOpaqueType
- address other review comments
Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d
Differential Revision: https://reviews.llvm.org/D83088
This still satisfies the constraints required by the affine dialect and
gives more flexibility in what iteration bounds can be used when
lowering to the GPU dialect.
Differential Revision: https://reviews.llvm.org/D89782
The Value hierarchy consists of BlockArgument and OpResult, both of which
derive Value. Introduce IsA functions and functions specific to each class,
similarly to other class hierarchies. Also, introduce functions for
pointer-comparison of Block and Operation that are necessary for testing and
are generally useful.
Reviewed By: stellaraccident, mehdi_amini
Differential Revision: https://reviews.llvm.org/D89714
* Interops with Python buffers/numpy arrays for creation.
* Also cleans up 'get' factory methods on some types to be consistent.
* Adds mlirAttributeGetType() to C-API to facilitate error handling and other uses.
* Punts on a lot of features of the ElementsAttribute hierarchy for now.
* Does not yet support bool or string attributes.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D89363
Now, convert-shape-to-std doesn't internally create memrefs, which was
previously a bit of a layering violation. The conversion to memrefs
should logically happen as part of bufferization.
Differential Revision: https://reviews.llvm.org/D89669
It's unfortunate that this requires adding a dependency on scf dialect
to std bufferization (and hence all of std transforms). This is a bit
perilous. We might want a lib/Transforms/Bufferize/ with a separate
bufferization library per dialect?
Differential Revision: https://reviews.llvm.org/D89667
Move the class to where all base classes are defined.
Also remove all the builders since they are defined in subclasses anyway.
Differential Revision: https://reviews.llvm.org/D89620
The current BufferPlacement transformation contains several concepts for
hoisting allocations. However, more advanced hoisting techniques should not be
integrated into the BufferPlacement transformation. Hence, this CL refactors the
current BufferPlacement pass into three separate pieces: BufferDeallocation and
BufferAllocation(Loop)Hoisting. Moreover, it extends the hoisting functionality
by allowing allocations to be moved out of loops.
Differential Revision: https://reviews.llvm.org/D87756
LLVM dialect has been defining Op arguments by deriving the `Arguments` ODS
class. This has arguably worse readability due to large indentation caused by
multiple derivations, and is inconsistent with other ODS files. Use the `let
arguments` form instead.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89560
Usage of nested parallel regions was not working correctly and was leading
to assertion failures. The fix contains the following changes:
1) Don't set the insertion point in the body callback.
2) Save the continuation IP in a stack and set the branch to
continuationIP at the terminator.
Reviewed By: SouraVX, jdoerfert, ftynse
Differential Revision: https://reviews.llvm.org/D88720
AllReduceLowering is currently the only GPU rewrite pattern, but more are coming. This is a preparation change.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D89370
Have the ODS TypeDef generator write the getChecked() definition.
Also add to TypeParamCommaFormatter a `JustParams` format and
refactor around that.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89438
This transforms the symbol lookups to O(1) from O(NM), greatly speeding up both passes. For a large MLIR module this shaved seconds off of the compilation time.
Differential Revision: https://reviews.llvm.org/D89522
The initial goal of this interface is to fix the current problems with verifying symbol user operations, but can extend beyond that in the future. The current problems with the verification of symbol uses are:
* Extremely inefficient:
Most current symbol users perform the symbol lookup using the slow O(N) string compare methods, which can lead to extremely long verification times in large modules.
* Invalid/breaks the constraints of the verification pass:
If the symbol reference is not flat (and even if it is flat in some cases), a verifier for an operation is not permitted to touch the referenced operation because it may be in the process of being mutated by a different thread within the pass manager.
The new SymbolUserOpInterface exposes a method `verifySymbolUses` that will be invoked from the parent symbol table to allow for verifying the constraints of any referenced symbols. This method is passed a `SymbolTableCollection` to allow for O(1) lookups of any necessary symbol operation.
Differential Revision: https://reviews.llvm.org/D89512
This revision contains two optimizations related to symbol checking:
* Optimize SymbolOpInterface to only check for a name attribute if the operation is an optional symbol.
This removes an otherwise unnecessary attribute lookup from a majority of symbols.
* Add a new SymbolTableCollection class to represent a collection of SymbolTables.
This allows for performing non-flat symbol lookups in O(1) time by caching SymbolTables for symbol table operations. This class is very useful for algorithms that operate on multiple symbol tables, either recursively or not.
Differential Revision: https://reviews.llvm.org/D89505
(Note: This is a reland of D82597)
This class allows for defining thread local objects that have a set non-static lifetime. The internals of the cache use a static thread_local map between the various different non-static objects and the desired value type. When a non-static object destructs, it simply nulls out the entry in the static map. This will leave an entry in the map, but erase any of the data for the associated value. The current use cases for this are in the MLIRContext, meaning that the number of items in the static map is ~1-2, which isn't costly enough to warrant the complexity of pruning. If a use case arises that requires pruning of the map, the functionality can be added.
This is especially useful in the context of MLIR for implementing thread-local caching of context level objects that would otherwise have very high lock contention. This revision adds a thread local cache in the MLIRContext for attributes, identifiers, and types to reduce some of the locking burden. This led to a speedup of several seconds when compiling a somewhat large mlir module.
Differential Revision: https://reviews.llvm.org/D89504
This trait simply adds a fold of f(f(x)) = f(x) when an operation is labelled as idempotent
Reviewed By: rriddle, andyly
Differential Revision: https://reviews.llvm.org/D89421
* Also fixes the const-ness of the various DenseElementsAttr construction functions.
* Both issues identified when trying to use the DenseElementsAttr functions.
Differential Revision: https://reviews.llvm.org/D89517
Added an underlying matcher for generic constant ops. This
included a rewrite of RewriterGen to make variable use
clearer.
Differential Revision: https://reviews.llvm.org/D89161
Adding unroll support for transfer read and transfer write operations. This
allows picking the ideal size for the memory access for a given target.
Differential Revision: https://reviews.llvm.org/D89289
The opposite of tensor_to_memref is tensor_load.
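A minimal sketch of the pair (illustrative types):
```mlir
// tensor -> memref; the pretty form spells out the memref type,
// since the tensor type can be inferred from it.
%m = tensor_to_memref %t : memref<4xf32>
// memref -> tensor
%t2 = tensor_load %m : memref<4xf32>
// tensor_load(tensor_to_memref(%t)) folds back to %t.
```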
- Add some basic tensor_load/tensor_to_memref folding.
- Add source/target materializations to BufferizeTypeConverter.
- Add an example std bufferization pattern/pass that shows how the
materializations work together (more std bufferization patterns to come
in subsequent commits).
- In coming commits, I'll document how to write composable
bufferization passes/patterns and update the other in-tree
bufferization passes to match this convention. The populate* functions
will of course continue to be exposed for power users.
The naming on tensor_load/tensor_to_memref and their pretty forms are
not very intuitive. I'm open to any suggestions here. One key
observation is that the memref type must always be the one specified in
the pretty form, since the tensor type can be inferred from the memref
type but not vice-versa.
With this, I've been able to replace all my custom bufferization type
converters in npcomp with BufferizeTypeConverter!
Part of the plan discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17
Differential Revision: https://reviews.llvm.org/D89437
Parsing of a scalar subview did not create the required static_offsets attribute.
This also adds support for folding scalar subviews away.
Differential Revision: https://reviews.llvm.org/D89467
Each hardware that supports SPV_C_CooperativeMatrixNV has a list of
configurations that are supported natively. Add an attribute to
`spv.target_env` to specify the supported configurations.
Reviewed By: antiagainst, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D89364
The current fusion on tensors fuses reshape ops with generic ops by
linearizing the indexing maps of the fused tensor in the generic
op. This has some limitations:
- It only works for static shapes
- The resulting indexing map has a linearization that could
potentially prevent fusion later on (for ex. tile + fuse).
Instead, try to fuse the reshape consumer (producer) with generic op
producer (consumer) by expanding the dimensionality of the generic op
when the reshape is expanding (folding). This approach conflicts with
the linearization approach. The expansion method is used instead of
the linearization method.
Further refactoring that changes the fusion on tensors to be a
collection of patterns.
Differential Revision: https://reviews.llvm.org/D89002
This CL allows user to specify the same name for the operands in the source pattern which implicitly enforces equality on operands with the same name.
E.g., Pat<(OpA $a, $b, $a) ... > would create a matching rule for checking equality for the first and the last operands. Equality of the operands is enforced at any depth, e.g., OpA ($a, $b, OpB($a, $c, OpC ($a))).
Example usage: Pat<(Reshape $arg0, (Shape $arg0)), (replaceWithValue $arg0)>
Note, this feature only covers operands but not attributes.
Current use cases are based on the operand equality and explicitly add the constraint into the pattern. Attribute equality will be worked out in a different CL.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D89254
The buffers are used as source or destination of transfer commands
so always add VK_BUFFER_USAGE_TRANSFER_{DST,SRC}_BIT to their usage
flags.
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
This patch adds a couple of missing LLVM IR dialect floating point types to
the legality check.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D89350
This revision adds a programmable codegen strategy from linalg based on staged rewrite patterns. Testing is exercised on a simple linalg.matmul op.
Differential Revision: https://reviews.llvm.org/D89374
* It reads as more of a TODO for the future and has long been obsoleted by later work.
* One of the authors of the referenced paper called this out as "weird stuff from two years ago" when reviewing the more recent TOSA RFC.
Differential Revision: https://reviews.llvm.org/D89329
This reverts commit 7271c1bcb9.
This broke the gcc-5 build:
/usr/include/c++/5/ext/new_allocator.h:120:4: error: no matching function for call to 'std::pair<const std::__cxx11::basic_string<char>, mlir::tblgen::SymbolInfoMap::SymbolInfo>::pair(llvm::StringRef&, mlir::tblgen::SymbolInfoMap::SymbolInfo)'
{ ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }
^
In file included from /usr/include/c++/5/utility:70:0,
from llvm/include/llvm/Support/type_traits.h:18,
from llvm/include/llvm/Support/Casting.h:18,
from mlir/include/mlir/Support/LLVM.h:24,
from mlir/include/mlir/TableGen/Pattern.h:17,
from mlir/lib/TableGen/Pattern.cpp:14:
/usr/include/c++/5/bits/stl_pair.h:206:9: note: candidate: template<class ... _Args1, long unsigned int ..._Indexes1, class ... _Args2, long unsigned int ..._Indexes2> std::pair<_T1, _T2>::pair(std::tuple<_Args1 ...>&, std::tuple<_Args2 ...>&, std::_Index_tuple<_Indexes1 ...>, std::_Index_tuple<_Indexes2 ...>)
pair(tuple<_Args1...>&, tuple<_Args2...>&,
^
Adds a TypeDef class to OpBase and backing generation code. Allows one
to define the Type, its parameters, and printer/parser methods in ODS.
Can generate the Type C++ class, accessors, storage class, per-parameter
custom allocators (for the storage constructor), and documentation.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D86904
This CL allows user to specify the same name for the operands in the source pattern which implicitly enforces equality on operands with the same name.
E.g., Pat<(OpA $a, $b, $a) ... > would create a matching rule for checking equality for the first and the last operands. Equality of the operands is enforced at any depth, e.g., OpA ($a, $b, OpB($a, $c, OpC ($a))).
Example usage: Pat<(Reshape $arg0, (Shape $arg0)), (replaceWithValue $arg0)>
Note, this feature only covers operands but not attributes.
Current use cases are based on the operand equality and explicitly add the constraint into the pattern. Attribute equality will be worked out in a different CL.
Differential Revision: https://reviews.llvm.org/D89254
This is the same diff as https://reviews.llvm.org/D88809/ except that the side-effect-free
check is removed for involution and a FIXME is added until the dependency
is resolved for shared builds. The old diff has more details on possible fixes.
Reviewed By: rriddle, andyly
Differential Revision: https://reviews.llvm.org/D89333
Update linalg-to-loops lowering for pooling operations to perform
padding of the input when specified by the corresponding attribute.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D88911
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
The dependency target "Core" of target "mlir-cuda-runner" does not exist.
Call Stack (most recent call first):
llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
The dependency target "LINK_COMPONENTS" of target "mlir-cuda-runner" does
not exist.
Call Stack (most recent call first):
llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
The dependency target "Support" of target "mlir-cuda-runner" does not
exist.
Call Stack (most recent call first):
llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
* Extends Context/Operation interning to cover Module as well.
* Implements Module.context, Attribute.context, Type.context, and Location.context back-references (facilitated testing and also on the TODO list).
* Adds method to create an empty Module.
* Discovered missing in npcomp.
Differential Revision: https://reviews.llvm.org/D89294
For some reason the variable `cumulativeSizeInBytes` in
`getCumulativeSizeInBytes` was actually storing the number of elements. I decided
to fix it and refactor the function a bit.
Differential Revision: https://reviews.llvm.org/D89336
The build of MLIR occasionally fails (especially on Windows) because there is missing dependency between MLIRLLVMIR and MLIROpenMPOpsIncGen.
1) LLVMDialect.cpp includes LLVMDialect.h
2) LLVMDialect.h includes OpenMPDialect.h
3) OpenMPDialect.h includes OpenMPOpsDialect.h.inc, OpenMPOpsEnums.h.inc and OpenMPOps.h.inc
The OpenMP .inc files are generated by MLIROpenMPOpsIncGen, so MLIRLLVMIR which builds LLVMDialect.cpp should depend on MLIROpenMPOpsIncGen
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D89275
TensorConstantOp bufferization currently uses the vector dialect to store constant data into memory.
Due to natural vector size and alignment properties, this is problematic with n>1-D vectors whose most minor dimension is not naturally aligned.
Instead, this revision linearizes the constant and introduces a linalg.reshape to go back to the desired shape.
Still, this is to be considered a workaround and a better longer-term solution will probably involve `llvm.global`.
Differential Revision: https://reviews.llvm.org/D89311
This combines two separate ops (D88972: `gpu.create_token`, D89043: `gpu.host_wait`) into one.
I do after all like the idea of combining the two ops, because it matches exactly the pattern we are
going to have in the other gpu ops that will implement the AsyncOpInterface (launch_func, copies, alloc):
If the op is async, we return a !gpu.async.token. Otherwise, we synchronize with the host and don't return a token.
The use cases for `gpu.wait async` and `gpu.wait` are further apart than those of e.g. `gpu.h2d async` and `gpu.h2d`,
but I like the consistent meaning of the `async` keyword in GPU ops.
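A sketch of the combined op in both forms (illustrative):
```mlir
// Async form: returns a !gpu.async.token that later async GPU ops can depend on.
%t = gpu.wait async
// Sync form: blocks the host until the ops producing the operand tokens have completed.
gpu.wait [%t]
```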
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D89160
This PR adds support for identified and recursive structs.
This includes: parsing, printing, serializing, and
deserializing such structs.
The following C struct:
```C
struct A {
  A* next;
};
```
which is translated to the following MLIR code:
```mlir
!spv.struct<A, (!spv.ptr<!spv.struct<A>, Generic>)>
```
would be represented in the SPIR-V module as:
```spirv
OpName %A "A"
OpTypeForwardPointer %APtr Generic
%A = OpTypeStruct %APtr
%APtr = OpTypePointer Generic %A
```
In particular the following changes are included:
- SPIR-V structs can now be either identified or literal
(i.e. non-identified).
- All structs now have their members surrounded by a ()-pair.
- For recursive references,
(1) an OpTypeForwardPointer instruction is emitted before
the OpTypeStruct instruction defining the recursive struct
(2) an OpTypePointer instruction is emitted after the
OpTypeStruct instruction which actually defines the recursive
pointer to struct type.
Reviewed By: antiagainst, rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D87206
* Links against libMLIR.so if the project is built for DYLIBs.
* Puts things in the right place in build and install time python/ trees so that RPaths line up.
* Adds install actions to install both the extension and sources.
* Copies py source files to the build directory to match (consistent layout between build/install time and one place to point a PYTHONPATH for tests and interactive use).
* Finally, "import mlir" from an installed LLVM just works.
Differential Revision: https://reviews.llvm.org/D89167
The TensorConstantOp bufferize conversion pattern has a bug that
makes it incorrect in the case of vectors whose alignment is not
the natural alignment. Circumvent it temporarily by using a power of 2.
Differential Revision: https://reviews.llvm.org/D89265
This revision reduces the number of places that specific information needs to be modified when adding new named Linalg ops.
Differential Revision: https://reviews.llvm.org/D89223
This revision introduces support for buffer allocation for any named linalg op.
To avoid template instantiating many ops, a new ConversionPattern is created to capture the LinalgOp interface.
Some APIs are updated to remain consistent with MLIR style:
`OwningRewritePatternList * -> OwningRewritePatternList &`
`BufferAssignmentTypeConverter * -> BufferAssignmentTypeConverter &`
Differential revision: https://reviews.llvm.org/D89226
The buffer placement preparation tests in
test/Transforms/buffer-placement-preparation* are using Linalg as a test
dialect which leads to confusion and "copy-pasta", i.e. Linalg is being
extended now and when TensorsToBuffers.cpp is changed, TestBufferPlacement is
sometimes kept in-sync, which should not be the case.
This has led to an unnoticed bug, because the tests were in a different directory and the patterns were slightly off.
Differential Revision: https://reviews.llvm.org/D89209
This patch introduces the acc.enter_data operation that represents an OpenACC Enter Data directive.
Operands and attributes are derived from clauses in the spec 2.6.6.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88941
This is required, or broadcasting with operands of different ranks will lead to
failures, as the select op requires both possible outputs and its output type to
be the same.
Differential Revision: https://reviews.llvm.org/D89134
The patch adds a canonicalization pattern that removes the unused results of scf.if operation. As a result, cse may remove unused computations in the then and else regions of the scf.if operation.
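A sketch of the rewrite (illustrative values; "test.use" stands for an arbitrary user):
```mlir
// Before: the second result is never used.
%r:2 = scf.if %cond -> (f32, f32) {
  scf.yield %a0, %a1 : f32, f32
} else {
  scf.yield %b0, %b1 : f32, f32
}
"test.use"(%r#0) : (f32) -> ()

// After canonicalization: only the used result remains, and cse can then
// remove any computation that only fed the dropped yield operands.
%r2 = scf.if %cond -> (f32) {
  scf.yield %a0 : f32
} else {
  scf.yield %b0 : f32
}
"test.use"(%r2) : (f32) -> ()
```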
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D89029
This patch introduces the acc.exit_data operation that represents an OpenACC Exit Data directive.
Operands and attributes are derived from clauses in the spec 2.6.6.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88969
There is an atomic_rmw and a generic_atomic_rmw operation.
The doc of the latter incorrectly referred to the former though.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89172
They are currently marked as unsupported when windows is part of the triple, but they actually fail when they are run on Windows, so they are unsupported on system-windows
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89169
This revision belongs to a series of patches that reduce reliance of Linalg transformations on templated rewrite and conversion patterns.
Instead, this uses a MatchAnyTag pattern for the vast majority of cases and dispatches internally.
Differential revision: https://reviews.llvm.org/D89133
This revision belongs to a series of patches that reduce reliance of Linalg transformations on templated rewrite and conversion patterns.
Instead, this uses a MatchAnyTag pattern for the vast majority of cases and dispatches internally.
Differential Revision: https://reviews.llvm.org/D89133
* Isolates the visibility controlled parts of its implementation to a detail namespace.
* Applies a struct level visibility attribute which applies to the static local within the get() functions.
* The prior version was not emitting a symbol for the static local "instance" fields when the user TU was compiled with -fvisibility=hidden.
Differential Revision: https://reviews.llvm.org/D89153
Without this, the PatternRewriting infrastructure does not know of modifications and
cannot properly legalize nor roll back changes.
Differential Revision: https://reviews.llvm.org/D89129
Async execute operation can take async arguments as dependencies.
Change the `async.execute` custom parser/printer format to use `%value as %unwrapped: !async.value<!type>` syntax.
Reviewed By: mehdi_amini, herhut
Differential Revision: https://reviews.llvm.org/D88601
Without this, legalization might not recursively handle child ops properly.
Additionally, this is required for pattern rewriting to properly rollback conversions.
Differential Revision: https://reviews.llvm.org/D89122
The updated version of kernel outlining did not handle cases correctly
where an operand of a candidate for sinking itself was defined by an operation
that is a sinking candidate. In such cases, it could happen that sunk
operations were inserted in the wrong order, breaking SSA properties.
Differential Revision: https://reviews.llvm.org/D89112
When attempting to compute a differential orderIndex we were calculating the
bailout condition correctly, but then an errant "+ 1" meant the orderIndex we
created was invalid.
Added test.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D89115
This reverts commit 1ceaffd95a.
The build is broken with -DBUILD_SHARED_LIBS=ON ; seems like a possible
layering issue to investigate:
tools/mlir/lib/IR/CMakeFiles/obj.MLIRIR.dir/Operation.cpp.o: In function `mlir::MemoryEffectOpInterface::hasNoEffect(mlir::Operation*)':
Operation.cpp:(.text._ZN4mlir23MemoryEffectOpInterface11hasNoEffectEPNS_9OperationE[_ZN4mlir23MemoryEffectOpInterface11hasNoEffectEPNS_9OperationE]+0x9c): undefined reference to `mlir::MemoryEffectOpInterface::getEffects(llvm::SmallVectorImpl<mlir::SideEffects::EffectInstance<mlir::MemoryEffects::Effect> >&)'
mlir-tblgen was incompatible with libLLVM, due to explicit linkage with
libLLVMSupport etc.
As it cannot link with libLLVM, make sure all the libs it uses are not using libLLVM
either.
As a side effect, also remove some explicit references to LLVM libs and use
components instead.
Differential Revision: https://reviews.llvm.org/D88846
This change allows folds to be done on a newly introduced involution trait rather than having to manually rewrite this optimization for every instance of an involution
Reviewed By: rriddle, andyly, stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D88809
* I believe this was done early on due to it being experimental/etc.
* Needed for dynamic linking in npcomp.
Differential Revision: https://reviews.llvm.org/D89081
When distributing a vector larger than the given multiplicity, we can
distribute it by block, where each id gets a chunk of consecutive elements
along the distributed dimension. This adds a test for this case and adds extra
checks to make sure we don't distribute for cases that are not a multiple of the multiplicity.
Differential Revision: https://reviews.llvm.org/D89061
The methods allow to check
- if an operation has dependencies,
- if there is a dependence from one operation to another.
Differential Revision: https://reviews.llvm.org/D88993
This revision also inserts an end-to-end test that lowers tensors to buffers all the way to executable code on CPU.
Differential revision: https://reviews.llvm.org/D88998
The simplest case is when the indexing maps are DimIds in every component. This covers cwise ops.
Also:
* Expose populateConvertLinalgOnTensorsToBuffersPatterns in Transforms.h
* Expose emitLoopRanges in Transforms.h
Differential Revision: https://reviews.llvm.org/D88781
Added a missing strides check to the verification method of rank-reducing subview
which enforces stride specification for the resulting type.
Differential Revision: https://reviews.llvm.org/D88879
Rationale:
More consistent with the other names. Also forward looking to reading
in other kinds of matrices. Also fixes lint issue on hard-coded %llu.
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D89005
* New functions: mlirOperationSetAttributeByName, mlirOperationRemoveAttributeByName
* Also adds some *IsNull checks and standardizes the rest to use "static inline" form, which makes them all non-opaque and not part of the ABI (which is desirable).
* Changes needed to resolve TODOs in npcomp PyTorch capture.
Differential Revision: https://reviews.llvm.org/D88946
Subtraction is a foundational arithmetic operation that is often used when computing, for example, data transfer sets or cache hits. Since the result of subtraction need not be a convex polytope, a new class `PresburgerSet` is introduced to represent unions of convex polytopes.
Reviewed By: ftynse, bondhugula
Differential Revision: https://reviews.llvm.org/D87068
The patch fixes the types used to access the elements of the kernel parameter structure from a pointer to the structure to a pointer to the actual parameter type.
Reviewed By: csigg
Differential Revision: https://reviews.llvm.org/D88959
Add basic support for registering diagnostic handlers with the context
(actually, the diagnostic engine contained in the context) and processing
diagnostic messages from the C API.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D88736
Setting up input data for benchmarks and integration tests can be tedious in
pure MLIR. With more sparse tensor work planned, this convenience library
simplifies reading sparse matrices in the popular Matrix Market Exchange
Format (see https://math.nist.gov/MatrixMarket). Note that this library
is *not* part of core MLIR. It is merely intended as a convenience library
for benchmarking and integration testing.
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D88856
Add a conversion pass from the Vector dialect to the SPIR-V dialect and add some simple
conversion patterns for vector.broadcast, vector.insert, vector.extract.
Differential Revision: https://reviews.llvm.org/D88761
This change replaces the container used for storing temporary
strings for generated code with std::list.
SmallVector may reallocate internal data, which will invalidate
references when more than one extended instruction set is
generated.
Reviewed By: mravishankar, antiagainst
Differential Revision: https://reviews.llvm.org/D88626
Combine ExtractOp with scalar result with BroadcastOp source. This is useful for
incrementally converting a degenerate vector of one element into a scalar.
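Roughly (illustrative types):
```mlir
%v = vector.broadcast %s : f32 to vector<4xf32>
%e = vector.extract %v[2] : vector<4xf32>
// %e folds to %s, the scalar source of the broadcast.
```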
Differential Revision: https://reviews.llvm.org/D88751
This revision adds init_tensors support to buffer allocation for Linalg on tensors.
Currently makes the assumption that the init_tensors fold onto the first output tensors.
This assumption is not currently enforced or cast in stone and requires experimenting with tiling linalg on tensors for ops **without reductions**.
Still this allows progress towards the end-to-end goal.
A pattern to convert `spv.CompositeInsert` and `spv.CompositeExtract`.
In LLVM, there are 2 ops that correspond to each instruction depending
on the container type. If the container type is a vector type, then
the result of conversion is `llvm.insertelement` or `llvm.extractelement`.
If the container type is an aggregate type (i.e. struct, array), the
result of conversion is `llvm.insertvalue` or `llvm.extractvalue`.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D88205
Adds support for SPIR-V composite specialization constants to spv._reference_of.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D88732
The previous code did the lowering to alloca, malloc, and aligned_malloc
in a single class with different code paths that are somewhat difficult to
follow.
This change moves the common code to a base class and has a separate
derived class per lowering target that contains the specifics.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88696
This canonicalization is the counterpart of MemRefCastOp -> LinalgOp but on tensors.
This is needed to properly canonicalize post linalg tiling on tensors.
Differential Revision: https://reviews.llvm.org/D88729
While affine maps are part of the builtin memref type, there is very
limited support for manipulating them in the standard dialect. Add
transpose to the set of ops to complement the existing view/subview ops.
This is a metadata transformation that encodes the transpose into the
strides of a memref.
I'm planning to use this when lowering operations on strided memrefs,
using the transpose to remove the stride without adding a dependency on
linalg dialect.
Differential Revision: https://reviews.llvm.org/D88651
This reverts commit e9b87f43bd.
There are issues with macros generating macros without an obvious simple fix
so I'm going to revert this and try something different.
This aligns the behavior with the standard call as well as the LLVM verifier.
Reviewed By: ftynse, dcaballe
Differential Revision: https://reviews.llvm.org/D88362
New projects (particularly out of tree) have a tendency to hijack the existing
llvm configuration options and build targets (add_llvm_library,
add_llvm_tool). This can lead to some confusion.
1) When querying a configuration variable, do we care about how LLVM was
configured, or how these options were configured for the out of tree project?
2) LLVM has lots of defaults, which are easy to miss
(e.g. LLVM_BUILD_TOOLS=ON). These options all need to be duplicated in the
CMakeLists.txt for the project.
In addition, with LLVM Incubators coming online, we need better ways for these
incubators to do things the "LLVM way" without a lot of futzing. Ideally, this
would happen in a way that eases importing into the LLVM monorepo when
projects mature.
This patch creates some generic infrastructure in llvm/cmake/modules and
refactors MLIR to use this infrastructure. This should expand to include
add_xxx_library, which is by far the most complicated bit of building a
project correctly, since it has to deal with lots of shared library
configuration bits. (MLIR currently hijacks the LLVM infrastructure for
building libMLIR.so, so this needs to get refactored anyway.)
Differential Revision: https://reviews.llvm.org/D85140
Class simplifies keeping track of the indentation while emitting. For every new line the current indentation is simply prefixed (if not at start of line, then it just emits as normal). Add a simple Region helper that makes it easy to have the C++ scope match the emitted scope.
Use this in op doc generator and rewrite generator.
This reverts revert commit be185b6a73 and addresses the shared lib failure by fixing up cmake files.
Differential Revision: https://reviews.llvm.org/D84107
Class simplifies keeping track of the indentation while emitting. For every new line the current indentation is simply prefixed (if not at start of line, then it just emits as normal). Add a simple Region helper that makes it easy to have the C++ scope match the emitted scope.
Use this in op doc generator and rewrite generator.
Differential Revision: https://reviews.llvm.org/D84107
This commit adds support for SPIR-V's composite specialization constants.
These are specialization constants which are composed of other spec
constants (whether scalar or composite), regular constants, or undef
values.
This commit adds support for parsing, printing, verification, and
(De)serialization.
A few TODOs are still in order:
- Supporting more types of constituents; currently, only scalar spec constants are supported.
- Extending `spv._reference_of` to support composite spec constants.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D88568
Add basic canonicalization patterns for the extractMap/insertMap to allow them
to be folded into Transfer ops.
Also mark transferRead as a memory read so that it can be removed by dead code elimination.
Differential Revision: https://reviews.llvm.org/D88622
Based on PyAttribute and PyConcreteAttribute classes, this patch implements the bindings of Float Attribute, Integer Attribute and Bool Attribute subclasses.
This patch also defines the `mlirFloatAttrDoubleGetChecked` C API which is bound with the `FloatAttr.get_typed` python method.
Differential Revision: https://reviews.llvm.org/D88531
Previously the actual types were not shown, which makes the message
difficult to grok in the context of long lowering chains. Also, it
appears that there were no actual tests for this.
Differential Revision: https://reviews.llvm.org/D88318
We hit an llvm_unreachable related to unranked memrefs for call ops
with scalar types. Removing the llvm_unreachable since the conversion
should gracefully bail out in the presence of unranked memrefs. Adding
tests to verify that.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88709
Instead of recursive helper method `topologicalSortImpl()`,
sort's implementation is moved to `topologicalSort()` function's
body directly. `llvm::ReversePostOrderTraversal` is used to create
a traversal of blocks in reverse post order.
Reviewed By: kiranchandramohan, rriddle
Differential Revision: https://reviews.llvm.org/D88544
This revision introduces a `subtensor` op, which is the counterpart of `subview` for a tensor operand. This also refactors the relevant pieces to allow reusing the `subview` implementation where appropriate.
This operation will be used to implement tiling for Linalg on tensors.
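A sketch of what this might look like (all-dynamic offsets/sizes/strides for simplicity; the exact pretty form may differ):
```mlir
// Extract a tensor slice, analogous to subview on memrefs.
%st = subtensor %t[%i, %j][%sz0, %sz1][%c1, %c1]
    : tensor<8x16xf32> to tensor<?x?xf32>
```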
The documentation for the NormalizeMemRefs pass and the associated MemRefsNormalizable
traits was confusing and not on the website. This update clarifies the language
around the difference between a MemRef Type, an operation that accesses the value of
MemRef Type, and better documents the limitations of the current implementation.
This patch also includes some basic debugging information for the pass so people
might have a chance of figuring out why it doesn't work on their code.
Differential Revision: https://reviews.llvm.org/D88532
```
LinalgTilingOptions &setTileSizes(ValueRange ts)
```
makes it all too easy to create stack-use-after-return errors.
In particular, c694588fc5 introduced one such issue.
Instead just take a copy in the lambda and be done with it.
The current implementation uses a fold expression to add all of the operations at once. This is really nice, but apparently the lifetime of each of the AbstractOperation instances is for the entire expression which may lead to a stack overflow for large numbers of operations. This splits the method in two to allow for the lifetime of the AbstractOperation to be properly scoped.
The pattern is structured similar to other patterns like
LinalgTilingPattern. The fusion pattern takes options that allow you
to fuse with producers of multiple operands at once.
- The pattern fuses only at the level that is known to be legal, i.e
if a reduction loop in the consumer is tiled, then fusion should
happen "before" this loop. Some refactoring of the fusion code is
needed to fuse only where it is legal.
- Since the fusion on buffers uses the LinalgDependenceGraph, which is
not mutable in place, the fusion pattern keeps the original
operations in the IR, but they are tagged with a marker that can later
be used to find the original operations.
This change also fixes an issue with tiling and
distribution/interchange where, if the tile size of a loop was 0, it
wasn't accounted for in these.
Differential Revision: https://reviews.llvm.org/D88435
This is the first of several steps to support distributing large vectors. This
adds instructions extract_map and insert_map that allow us to do incremental
lowering. Right now the transformation only applies to simple pointwise operations
with a vector size matching the multiplicity of the IDs used to distribute the
vector.
This can be used to distribute large vectors to loops or SPMD.
Differential Revision: https://reviews.llvm.org/D88341
Switch to a dummy op in the test dialect so we can remove the -allow-unregistered-dialect
on ops.mlir and invalid.mlir. Change after comment on D88272.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D88587
while folding tensor_reshape op.
While folding reshapes that introduce unit extent dims, the logic to
compute the reassociation maps can be generalized to handle some
corner cases, for example, when the folded shape still has unit-extent
dims but corresponds to folded unit extent dims of the expanded shape.
Differential Revision: https://reviews.llvm.org/D88521
AffineMapAttr is already part of base, it's just impossible to refer to
it from ODS without pulling in the definition from Affine dialect.
Differential Revision: https://reviews.llvm.org/D88555
The current setup for conv op vectorization does not enable the user to specify tile
sizes or the dimensions for vectorization. In this commit we change that by
adding tile sizes as pass arguments. Every dimension with a corresponding tile
size > 1 is automatically vectorized.
Differential Revision: https://reviews.llvm.org/D88533
This commit adds support for subviews which enable reducing the resulting rank
by dropping static dimensions of size 1.
Differential Revision: https://reviews.llvm.org/D88534
Added support for different function control
in serialization and deserialization.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D88280
Also add a verifier pass to ExecutionEngine.
It's hard to come up with a test case, since mlir-opt always adds location info after parsing it (?)
Differential Revision: https://reviews.llvm.org/D88135
This patch adds support for the 'return' and 'call' ops to the bare-ptr
calling convention. These changes also align the bare-ptr calling
convention code with the latest changes in the default calling convention
and reduce the amount of customization code needed.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D87724
* By providing stable, C-accessible definitions for bridging MLIR Python<->C APIs, we eliminate inter-extension dependencies (i.e. they can all share a diamond dependency on the MLIR C-API).
* Just provides accessors for context and module right now.
* Needed in NPComp in ~a week or so for high level Torch APIs.
Differential Revision: https://reviews.llvm.org/D88426
This patch introduces the acc.shutdown operation that represents an OpenACC shutdown directive.
Clauses are derived from the spec 2.14.2
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88272
This patch introduces the init operation that represents the init executable directive
from the OpenACC 3.0 specifications.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88254
This patch introduces the wait operation that represents the OpenACC wait directive.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88125
- Add a minimalist C API for mlir::Dialect.
- Allow one to query the context about registered and loaded dialects.
- Add API for loading dialects.
- Provide functions to register the Standard dialect.
When used naively, this will require registering each dialect separately. When
we have more than one exposed, we can add variadic macros that expand to
individual calls.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D88162
This patch introduces the update operation that represents the OpenACC update directive.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88102
Manually-defined named ops do not currently support `init_tensors` or return values and may never support them. Add an extra interface to the StructuredOpInterface so that we can still write op-agnostic transformations based on StructuredOpInterface.
This is an NFC extension in preparation for tiling on tensors.
Differential Revision: https://reviews.llvm.org/D88481
This revision changes the signatures of helper function that Linalg uses to create loops so that they can also take iterArgs.
iterArgs are asserted empty to ensure no functional change.
This is a mechanical change in preparation of tiling on linalg on tensors to avoid polluting the implementation with an NFC change.
Differential Revision: https://reviews.llvm.org/D88480
The previous implementation did not support sinking simple expressions. In particular,
it is often beneficial to sink dim operations.
Differential Revision: https://reviews.llvm.org/D88439
Summary:
========
Bugzilla Ticket No: Bug 46884 [https://bugs.llvm.org/show_bug.cgi?id=46884]
Flush op assembly syntax was ambiguous:
Consider the below test case:
The flush operation does not have any arguments.
But the next statement's token, i.e. "%2", is read as the argument of the flush operation, and the translator then issues an error.
***************************************************************
$ cat -n flush.mlir
1 llvm.func @_QQmain(%arg0: !llvm.i32) {
2 %0 = llvm.mlir.constant(1 : i64) : !llvm.i64
3 %1 = llvm.alloca %0 x !llvm.i32 {in_type = i32, name = "a"} : (!llvm.i64) -> !llvm.ptr<i32>
4 omp.flush
5 %2 = llvm.load %1 : !llvm.ptr<i32>
6 llvm.return
7 }
$ mlir-translate -mlir-to-llvmir flush.mlir
flush.mlir:5:6: error: expected ':'
%2 = llvm.load %1 : !llvm.ptr<i32>
^
***************************************************************
Solution:
=========
Introduced begin ( `(` ) and end ( `)` ) tokens to determine the begin and end of the variadic arguments.
The patch includes code changes and testcase modifications.
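With the delimiters, the two situations are no longer ambiguous (a sketch based on the description above):
```mlir
omp.flush                          // no arguments; the following statement is not consumed
omp.flush (%1 : !llvm.ptr<i32>)    // explicit variadic argument list
```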
Reviewed By: Valentin Clement, Mehdi AMINI
Differential Revision: https://reviews.llvm.org/D88376
Add a basic verifier for the data operation following the restriction from the standard.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88334
The OmpDialect is in practice optional during translation to LLVM IR: the code is tolerant
to have a "nullptr" when not present / needed.
The dependency still exists on the export to LLVMIR.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88351
Adding missing code that should have been part of "D85869: Utility to
vectorize loop nest using strategy."
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D88346
- use select-ops to make the lowering simpler
- change style of FileCheck variables names to be consistent
- change some variable names in the code to be more explicit
Differential Revision: https://reviews.llvm.org/D88258
Recently, restrictions on vector reductions were relaxed to accept
signless integers of any width and floating-point types. This CL relaxes
the restriction even more by including unsigned and signed integers.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D88442
(1) simplify integer printing logic by always using 64-bit print
(2) add index support (since vector<16xindex> is planned to be added)
(3) adjust naming convention print_x -> printX
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D88436
Add operands to represent the if and deviceptr clauses. The default clause is represented with
an attribute.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88331
This patch removes the printer/parser for the acc.data operation since its syntax
fits nicely with the assembly format. It reduces the maintenance for this op.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88330
This patch removes the detach and delete operands. Those operands represent the detach
and delete clauses that will appear in another operation, acc.exit_data.
Reviewed By: kiranktp, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88326
Normalizing memrefs failed when a caller of symbolic use in a function
cannot be cast to `CallOp`. This patch avoids the failure by checking
the result of the casting. If the caller cannot be cast to `CallOp`,
it is skipped.
Differential Revision: https://reviews.llvm.org/D87746
- Eliminate incorrect |
- Eliminate memspace0 as the memory spaces currently are integer literals and memory
space 0 is not explicitly printed.
Differential Revision: https://reviews.llvm.org/D88171