This change exposes printer flags in AsmState and AsmStateImpl. All functions
receiving AsmState as a parameter now use the flags from the AsmState instead of
taking an additional OpPrintingFlags parameter.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D119870
This allow user to register a callback that can annotate operations
during software pipelining. This allows user potential annotate op to
know what part of the pipeline they correspond to.
Differential Revision: https://reviews.llvm.org/D119866
Without results, there is no getType injected and so generating one in prefixed form doesn't result in any failures during C++ compilation.
Differential Revision: https://reviews.llvm.org/D119871
This test shows that when access patterns do not match (e.g. transposing
a row-wise sparse matrix into another row-wise sparse matrix), a conversion
operation in between can enable codegen (i.e. avoid cycle in iteration graph).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D119864
Optional parameters with `defaultValue` set will be populated with that value if they aren't encountered during parsing. Moreover, parameters equal to their default values are elided when printing.
Depends on D118210
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118544
This change is needed when lowering alloca()-using code on targets
such as ROCDL that represent private scratch space as a separate
address space.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119775
Casting between scalable vectors and fixed-length vectors doesn't make
sense. If one of the operands is scalable, the other has to be scalable
to be able to guarantee they have the same shape at runtime.
Differential Revision: https://reviews.llvm.org/D119568
This patch changes the syntax of omp.atomic.update to allow the other
dialects to modify the variable with appropriate operations in the
region.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119522
Add verifier for gpu.alloc op to verify if the dimension operand counts
and symbol operand counts are same as their memref counterparts.
Differential Revision: https://reviews.llvm.org/D117427
Support ALLOW filters and DENY filters. This is needed for compatibility with existing code that specifies more complex op filters.
Differential Revision: https://reviews.llvm.org/D119820
Also, it seems Khronos has changed html spec format so small adjustment to script was needed.
Base op parsing is also probably broken.
Differential Revision: https://reviews.llvm.org/D119678
The only method to create a true dense tensor (i.e un-annotated) in MLIR-PyTACO
is through the from_array method. However, the annotated all dense tensors are
also implemented as true dense tensor currently. The PR fixes the
implementation to support annotated all dense sparse tensors.
Extend the tensor init method to support the construction of a tensor without
any sparsity annotation.
Change the tensor to_file method to only support writing unpacked sparse
tensors to file through the MLIR sparse tensor dialect.
Add unit tests for true dense tensors and all dense sparse tensors.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D119500
This patch changes the argument from template-IRBuilder to IRBuilderBase
thus allowing us to write less code while getting the location from a
builder.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119717
* While annoying, this is the only way to get C++ exception handling out of the happy path for normal iteration.
* Implements sq_length and sq_item for the sequence protocol (used for iteration, including list() construction).
* Implements mp_subscript for general use (i.e. foo[1] and foo[1:1]).
* For constructing a `list(op.results)`, this reduces the time from ~4-5us to ~1.5us on my machine (give or take measurement overhead) and eliminates C++ exceptions, which is a worthy goal in itself.
* Compared to a baseline of similar construction of a three-integer list, which takes 450ns (might just be measuring function call overhead).
* See issue discussed on the pybind side: https://github.com/pybind/pybind11/issues/2842
Differential Revision: https://reviews.llvm.org/D119691
Rationale:
empty line between main include for this file
moved include that actually defines code into right section
Note that this revision started as breaking up ops/attrs even more
(for bug https://github.com/llvm/llvm-project/issues/52748), but due
the the connection in Dialect.initalize(), this cannot be split further).
All heavy lifting refactoring was already done by River in previous cleanup.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D119617
Adds a pointer type to EmitC. The emission of pointers is so far only
possible by using the `emitc.opaque` type
Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D119337
Adapt the region builder signature to hand in the attributes of the created ops. The revision is a preparation step the support named ops that need access to the operation attributes during op creation.
Depends On D119692
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D119693
Group and reorder the classed defined by comprehension.py and add type annotations.
Depends On D119126
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D119692
Index attributes had no default value, which means the attribute values had to be set on the operation. This revision adds a default parameter to `IndexAttrDef`. After the change, every index attribute has to define a default value. For example, we may define the following strides attribute:
```
```
When using the operation the default stride is used if the strides attribute is not set. The mechanism is implemented using `DefaultValuedAttr`.
Additionally, the revision uses the naming index attribute instead of attribute more consistently, which is a preparation for follow up revisions that will introduce function attributes.
Depends On D119125
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D119126
This pass doesn't have any limitations specific to FuncOp and it will be useful to be able to run it on other ops (e.g. gpu.func).
Differential Revision: https://reviews.llvm.org/D119662
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.
If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.
The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.
Reviewed By: jdoerfert, arsenm, kpyzhov
Differential Revision: https://reviews.llvm.org/D119216
When lowering to memrefCopy call, the size for i1 type was calculated as 0.
Instead of using getTypeSizeInBits() and dividing by 8, we should just use getTypeSize().
Differential Revision: https://reviews.llvm.org/D119540
The lowering creates llvm.insertvalue with the rank value, so it needs to use
index type instead of 64 bit integer type. Otherwise, we get an error:
llvm.insertvalue' op Type mismatch: cannot insert 'i64' into '!llvm.struct<(i32, ptr<i8>)>'
Differential Revision: https://reviews.llvm.org/D119534
This patch simply adds an optional garbage collector attribute to LLVMFuncOp which maps 1:1 to the "gc" property of functions in LLVM.
Differential Revision: https://reviews.llvm.org/D119492
This allows operations to control the block ids used by the printer in nested regions.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D115849
Previously, OpDSL did not support rank polymorphism, which required a separate implementation of linalg.fill. This revision extends OpDSL to support rank polymorphism for a limited class of operations that access only scalars and tensors of rank zero. At operation instantiation time, it scales these scalar computations to multi-dimensional pointwise computations by replacing the empty indexing maps with identity index maps. The revision does not change the DSL itself, instead it adapts the Python emitter and the YAML generator to generate different indexing maps and and iterators depending on the rank of the first output.
Additionally, the revision introduces a `linalg.fill_tensor` operation that in a future revision shall replace the current handwritten `linalg.fill` operation. `linalg.fill_tensor` is thus only temporarily available and will be renamed to `linalg.fill`.
Reviewed By: nicolasvasilache, stellaraccident
Differential Revision: https://reviews.llvm.org/D119003
Add new operations to the gpu dialect to represent device side
asynchronous copies. This also add the lowering of those operations to
nvvm dialect.
Those ops are meant to be low level and map directly to llvm dialects
like nvvm or rocdl.
We can further add higher level of abstraction by building on top of
those operations.
This has been discuss here:
https://discourse.llvm.org/t/modeling-gpu-async-copy-ampere-feature/4924
Differential Revision: https://reviews.llvm.org/D119191