The support for this has been added by 946311b893
but then ignored by bc22b5c9a2.
This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:
```mlir
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
for (Operation *op : ...) {
auto specific = dyn_cast<function_traits<FnTy>::template arg_t<0>>(op);
if (specific)
lambda(specific);
}
}
```
that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.
Differential Revision: https://reviews.llvm.org/D125543
Reviewed by: rriddle
This is a followup to D125431, to keep from confusing the machinery that generates diffs (since combining these two changes into one would obfuscate the changes actually made in the previous differential).
Depends On D125431
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D125432
This commit enables proper highlighting when inner statements are
outside of a constraint/pattern/etc. This shouldn't really happen in
actual code, but can happen in documentation (which uses the same
syntax grammar).
In addition to reducing code repetition, this also helps ensure that the various API functions follow the naming convention of mlir::sparse_tensor::primaryTypeFunctionSuffix (e.g., due to typos in the repetitious code).
Depends On D125428
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D125431
This follows the same general structure of the MLIR and PDLL language
servers. This commits adds the basic functionality for setting up the server,
and initially only supports providing diagnostics. Followon commits will
build out more comprehensive behavior.
Realistically this should eventually live in llvm/, but building in MLIR is an easier
initial step given that:
* All of the necessary LSP functionality is already here
* It allows for proving out useful language features (e.g. compilation databases)
without affecting wider scale tablegen users
* MLIR has a vscode extension that can immediately take advantage of it
Differential Revision: https://reviews.llvm.org/D125440
This enables the compiler to perform devirtualization. And benchmarks
indicate devirtualization can sometimes give considerable speedup.
Depends On D122061
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D125428
In the overwhelmingly majority of cases only one dialect is generated at a time
anyways, and this restriction more easily catches user error when multiple
dialects might be generated. We hit this semi-recently with the PDL dialect,
and circt+other downstream users are also actively hitting this as well.
Differential Revision: https://reviews.llvm.org/D125651
Op registration mechanism does not allow for ops with the same name to be
re-registered. This is okay to avoid name conflicts and debug
double-registration, but may be problematic for dialect extensions that may get
registered several times (unlike dialects that are deduplicated in the
registry). When registering ops through the Transform dialect extension
mechanism, check first if the ops are already registered and only complain in
the case of repeated registration with the same name but different TypeID.
Differential Revision: https://reviews.llvm.org/D125554
An attribute without a type builder followed by a colon in an assembly format is potentially ambiguous because the parser will read ahead to parse the colon-type and pass this as the type argument to the attribute's constructor.
However, the previous verifier that checks for this ambiguity erroneously produces an error in the case of
```
let assemblyFormat = "( `(` $attr `)` )? `:`";
```
This patch fixes the bug by implementing a checker that correctly handles all edge cases, including very strange assembly formats like:
```
let assemblyFormat = "( `(` $attr ) : (`>`)? attr-dict (`>` $a^) : (`<`)? `:`";
```
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125445
This was carry over from LLVM IR where the alias definition can
be ambiguous, but MLIR type aliases have no such problems.
Having the `type` keyword is superfluous and doesn't add anything.
This commit drops it, which also nicely aligns with the syntax for
attribute aliases (which doesn't have a keyword).
Differential Revision: https://reviews.llvm.org/D125501
This patch adds a topological sort utility and pass. A topological sort reorders
the operations in a block without SSA dominance such that, as much as possible,
users of values come after their producers.
The utility function sorts topologically the operation range in a given block
with an optional user-provided callback that can be used to virtually break cycles.
The toposort pass itself recursively sorts graph regions under the target op.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D125063
The attribute self type parameter is currently treated like any other attribute parameter in the assembly format. The self type parameter should be handled by the operation parser and printer and play no role in the generated parsers and printers of attributes.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125724
This is the first implementation of complex (f64 and f32) support
in the sparse compiler, with complex add/mul as first operations.
Note that various features are still TBD, such as other ops, and
reading in complex values from file. Also, note that the
std::complex<float> had a bit of an ABI issue when passed as
single argument. It is still TBD if better solutions are possible.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D125596
We were custom counting per bit for the clz instruction. Math dialect
now has an intrinsic to do this in one instruction. Migrated to this
instruction and fixed a minor bug math-to-llvm for the intrinsic.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D125592
In this instance, the trailing return type does not improve readability
as it repeats what is returned in the same line.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125697
This changes replaces the `fully-dynamic-layout-maps` options (which was badly named) with two new options:
* `unknown-type-conversion` controls the layout maps on buffer types for which no layout map can be inferred.
* `function-boundary-type-conversion` controls the layout maps on buffer types inside of function signatures.
Differential Revision: https://reviews.llvm.org/D125615
Instead of recomputing memref types from tensor types, try to infer them when possible. This results in more precise layout maps.
Differential Revision: https://reviews.llvm.org/D125614
Erase gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.
Reviewed By: bondhugula, csigg
Differential Revision: https://reviews.llvm.org/D124257
The warning caused build errors on a couple flang testers that are
building with -Werror. The diagnostic change makes the generated
error correct.
This is a followup to https://reviews.llvm.org/D125549
Differential Revision: https://reviews.llvm.org/D125587
There are a lot of cases where we accidentally ignored the result of some
parsing hook. Mark ParseResult as LLVM_NODISCARD just like ParseResult is.
This exposed some stuff to clean up, so do.
Differential Revision: https://reviews.llvm.org/D125549
In d4555698f8, the name of nano precision timer function has changed from `nano_time` to `nanoTime`, but benchmarks were not updated to reflect that. This change addresses the discrepancy.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D125217
This patch fixes a bug in areIdsUnique where it ignores the [start, end] range.
No test case is added since there are no use cases through IR from where it
can be tested, and it is hard to create a unittest since we do not currently
have Values in unittests.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D124735
This pass is to handle computationally complex operations like
tensor.pad which are not simply lowered to the exact same operation in
the memref dialect.
Differential Revision: https://reviews.llvm.org/D125384
Due to an apparent bug in the Doxygen version <1.8.16 used to generate
documentation for MLIR, parts of the navigation (specifically, the lists
of inherited methods for classes) are unusable due to dynsections.js
missing from the output generated by Doxygen. Setting this flag makes
Doxygen always produce the file.
Implements a floating-point sign operator (using the new semi-ring ops)
that accomodates +/-Inf and +/-NaN in consistent way.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D125494
The current grammar is really crusty, only supports a handful of
cases, and is also out-of-date after various refactorings. This commit
refactors the textmate grammar to handle significantly more cases,
and now provides proper coloring for a majority of cases (including
dialect attributes, operations, types, etc.)
Differential Revision: https://reviews.llvm.org/D125458
We shouldn't be making assumptions about the result of llvm::getTypeName,
which may have different results for anonymous namespaces depending
on the platform.
This commit refactors the current pass manager support to allow for
operation agnostic pass managers. This allows for a series of passes
to be executed on any viable pass manager root operation, instead
of one specific operation type. Op-agnostic/generic pass managers
only allow for adding op-agnostic passes.
These types of pass managers are extremely useful when constructing
pass pipelines that can apply to many different types of operations,
e.g., the default inliner simplification pipeline. With the advent of
interface/trait passes, this support can be used to define FunctionOpInterface
pass managers, or other pass managers that effectively operate on
specific interfaces/traits/etc (see #52916 for an example).
Differential Revision: https://reviews.llvm.org/D123536
This patch references code for translating memref.reinterpret_cast ops
to add translation rules for memref.reshape ops that have a static shape
argument. Since reshape ops don't have offsets, sizes, or strides, this
patch simply sets the allocated and aligned pointers of the MemRef
descriptor.
Reviewed By: ftynse, cathyzhyi
Differential Revision: https://reviews.llvm.org/D125039
Instead of requiring the client to compute the "isSplat" bit,
compute it internally. This makes the logic more consistent
and defines away a lot of "elements.size()==1" in the clients.
This addresses Issue #55185
Differential Revision: https://reviews.llvm.org/D125447
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.
This also add an integration test running on GPU warp. The same tests can be
later re-used with different comment lines to tests distribution
transformations.
This is mostly from @springerm contribution.
Differential Revision: https://reviews.llvm.org/D125430
Complex nested in other types is perfectly fine, just nested structs
aren't supported. Instead of checking whether there's nesting just check
whether the struct we're dealing with is a complex number.
Differential Revision: https://reviews.llvm.org/D125381
1. Call copy constructor of the base class
2. Assign value of the option directly
Reviewed By: dcaballe, rriddle
Differential Revision: https://reviews.llvm.org/D125101
D125214 split off a MLIRExecutionEngineUtils library that is used
by MLIRGPUTransforms. However, currently the entire ExecutionEngine
directory is skipped if the LLVM_NATIVE_ARCH target is not available.
Move the check for LLVM_NATIVE_ARCH, such that MLIRExecutionEngineUtils
always gets built, and only the JIT-related libraries are omitted
without native arch.
Differential Revision: https://reviews.llvm.org/D125357
This change integrates the BufferResultsToOutParamsPass into One-Shot Module Bufferization. This improves memory management (deallocation) when buffers are returned from a function.
Note: This currently only works with statically-sized tensors. The generated code is not very efficient yet and there are opportunities for improvment (fewer copies). By default, this new functionality is deactivated.
Differential Revision: https://reviews.llvm.org/D125376
Bufferization has an optional filter to exclude certain ops from analysis+bufferization. There were a few remaining places in the codebase where the filter was not checked.
Differential Revision: https://reviews.llvm.org/D125356
When a custom operation is unknown and does not have a dialect prefix, we currently
emit an error using the name of the operation with the default dialect prefix. This
leads to a confusing error message, especially when operations get moved between dialects.
For example, `func` was recently moved out of `builtin` and to the `func` dialect. The current
error message we get is:
```
func @foo()
^ custom op 'builtin.func' is unknown
```
This could lead users to believe that there is supposed to be a `builtin.func`,
because there used to be. This commit adds a better error message that does
not assume that the operation is supposed to be in the default dialect:
```
func @foo()
^ custom op 'func' is unknown (tried 'builtin.func' as well)
```
Differential Revision: https://reviews.llvm.org/D125351
`linalg.generic` ops have canonicalizers that either remove arguments
not used in the payload, or redundant arguments. Combine these and
enhance the canonicalization to also remove results that have no use.
This is effectively dead code elimination for Linalg ops.
Differential Revision: https://reviews.llvm.org/D123632
We can simplify the code needed to implement dyn_cast/cast/isa support for MLIR operations with documented interfaces via the CastInfo structures. This will also provide an example of how to use CastInfo.
Depends on D123901
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124963
Using "replaceUsesOfWith" is incorrect because the same initializer value may appear multiple times.
For example, if the epilogue is needed when this loop is unrolled
```
%x:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
```
then both epilogue's arguments will be incorrectly renamed to use the same result index (note #1 in both cases):
```
%x_unrolled:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
%x_epilogue:2 = scf.for ... iter_args(%arg1 = %x_unrolled#1, %arg2 = %x_unrolled#1) {
...
}
```
This is a full audit of emitError calls, I took the opportunity
to remove extranous parens and fix a couple cases where we'd
generate multiple diagnostics for the same error.
Differential Revision: https://reviews.llvm.org/D125355
Now that TableGen no longer relies on global Record state, we can allow
for the client to own the RecordKeeper and SourceMgr. Given that TableGen
internally still relies on the global llvm::SrcMgr, this method unfortunately
still isn't thread-safe.
Differential Revision: https://reviews.llvm.org/D125277
Change the parsing logic to use StringRef instead of lower level
char* logic. Also, if emitting a diagnostic on the first token
in the file, we make sure to use that position instead of the
very start of the file.
Differential Revision: https://reviews.llvm.org/D125353
Before dump, Insetad of switching to generic form silently after
verification failure. Print some debug logs to help identify why an op
may be printed in a different way.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125136
Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.
Differential Revision: https://reviews.llvm.org/D125244
The current implementation of `cloneWithNewYields` has a few issues
- It clones the loop body of the original loop to create a new
loop. This is very expensive.
- It performs `erase` operations which are incompatible when this
method is called from within a pattern rewrite. All erases need to
go through `PatternRewriter`.
To address these a new utility method `replaceLoopWithNewYields` is added
which
- moves the operations from the original loop into the new loop.
- replaces all uses of the original loop with the corresponding
results of the new loop
- use a call back to allow caller to generate the new yield values.
- the original loop is modified to just yield the basic block
arguments corresponding to the iter_args of the loop. This
represents a no-op loop. The loop itself is dead (since all its uses
are replaced), but is not removed. The caller is expected to erase
the op. Consequently, this method can be called from within a
`matchAndRewrite` method of a `PatternRewriter`.
The `cloneWithNewYields` could be replaces with
`replaceLoopWithNewYields`, but that seems to trigger a failure during
walks, potentially due to the operations being moved. That is left as
a TODO.
Differential Revision: https://reviews.llvm.org/D125147
This ensures that attributes such as the index bitwidth propagate
correctly to the AMDGPUToROCDL patterns.
Differential Revision: https://reviews.llvm.org/D125320
This patch updates calls to AnalysisState::getBuffer() so that we return
early with a failure if the call does not succeed.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D125251
By analogy with the NVGPU dialect, introduce an AMDGPU dialect for
AMD-specific intrinsic wrappers.
The dialect initially includes wrappers around the raw buffer intrinsics.
On AMD GPUs, a memref can be converted to a "buffer descriptor" that
allows more precise control of memory access, such as by allowing for
out of bounds loads/stores to be replaced by 0/ignored without adding
additional conditional logic, which is important for performance.
The repository currently contains a limited conversion from
transfer_read/transfer_write to Mubuf intrinsics, which are an older,
deprecated intrinsic for the same functionality.
The new amdgpu.raw_buffer_* ops allow these operations to be used
explicitly and for including metadata such as whether the target
chipset is an RDNA chip or not (which impacts the interpretation of
some bits in the buffer descriptor), while still maintaining an
MLIR-like interface.
(This change also exposes the floating-point atomic add intrinsic.)
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D122765
A typical problem with missing a token is that the missing
token is at the end of a line. The problem with this is that
the error message gets reported on the start of the following
line (which is where the next / invalid token is) which can
be confusing.
Handle this by noticing this case and backing up to the end of
the previous line.
Differential Revision: https://reviews.llvm.org/D125295
Building libMLIR.so currently fails with:
> /usr/bin/ld: /tmp/ccNzulEA.ltrans39.ltrans.o: in function `(anonymous namespace)::SerializeToHsacoPass::optimizeLlvm(llvm::Module&, llvm::TargetMachine&)':
> /builddir/build/BUILD/llvm-project-15.0.0.src/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp:328: undefined reference to `mlir::makeOptimizingTransformer(unsigned int, unsigned int, llvm::TargetMachine*)'
This is because MLIRGPUTransforms depends on MLIRExecutionEngine in
61bb2e4ea8/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp (L328),
but MLIRExecutionEngine is marked as excluded from libMLIR.so.
However, this code doesn't require the full execution engine: It
only performs middle-end optimization, and does not need any of
the JIT/codegen infrastructure. As such, split off a separate
library MLIRExecutionEngineUtils, which only contains that part
and is not excluded from libMLIR.so.
Fixes https://github.com/llvm/llvm-project/issues/54242.
Differential Revision: https://reviews.llvm.org/D125214
Currently, building mlir with the python bindings enabled on Windows in Debug is broken because pybind11, python and cmake don't like to play together. This change normalizes how the three interact, so that the builds can now run and succeed.
The main issue is that python and cmake both make assumptions about which libraries are needed in a Windows build based on the flavor.
- cmake assumes that a debug (or a debug-like) flavor of the build will always require pythonX_d.lib and provides no option/hint to tell it to use a different library. cmake does find both the debug and release versions, but then uses the debug library.
- python (specifically pyconfig.h and by extension python.h) hardcodes the dependency on pythonX_d.lib or pythonX.lib depending on whether `_DEBUG` is defined. This is NOT transparent - it does not show up anywhere in the build logs until the link step fails with `pythonX_d.lib is missing` (or `pythonX.lib is missing`)
- pybind11 tries to "fix" this by implementing a workaround - unless Py_DEBUG is defined, `_DEBUG` is explicitly undefined right before including python headers. This also requires some windows headers to be included differently, so while clever, this is a non-trivial workaround.
mlir itself includes the pybind11 headers (which contain the workaround) AS WELL AS python.h, essentially always requiring both pythonX.lib and pythonX_d.lib for linking. cmake explicitly only adds one or the other, so the build fails.
This change does a couple of things:
- In the cmake files, explicitly add the release version of the python library on Windows builds regardless of flavor. Since Py_DEBUG is not defined, pybind11 will always require release and it will be satisfied
- To satisfy python as well, this change removes any explicit inclusions of Python.h on Windows instead relying on the fact that pybind11 headers will bring in what is needed
There are a few additional things that we could do but I rejected as unnecessary at this time:
- define Py_DEBUG based on the CMAKE_BUILD_TYPE - this will *mostly* work, we'd have to think through multiconfig generators like VS, but it's possible. There doesn't seem to be a need to link against debug python at the moment, so I chose not to overcomplicate the build and always default to release
- similar to above, but define Py_DEBUG based on the CMAKE_BUILD_TYPE *as well as* the presence of the debug python library (`Python3_LIBRARY_DEBUG`). Similar to above, this seems unnecessary right now. I think it's slightly better than above because most people don't actually have the debug version of python installed, so this would prevent breaks in that case.
- similar to the two above, but add a cmake variable to control the logic
- implement the pybind11 workaround directly in mlir (specifically in Interop.h) so that Python.h can still be included directly. This seems prone to error and a pain to maintain in lock step with pybind11
- reorganize how the pybind11 headers are included and place at least one of them in Interop.h directly, so that the header has all of its dependencies included as was the original intention. I decided against this because it really doesn't need pybind11 logic and it's always included after pybind11 is, so we don't necessarily need the python includes
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D125284
Merge the documentation of the definition of extensible dialects
with the definition of dialects.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125200
Add attribute to be able to generate the intrinsic version of async copy
generating a copy with l1 bypass. This correspond to
cp.async.cg.shared.global in ptx.
Differential Revision: https://reviews.llvm.org/D125241
There are a couple of issues with the python bindings on Windows:
- `create_symlink` requires special permissions on Windows - using `copy_if_different` instead allows the build to complete and then be usable
- the path to the `python_executable` is likely to contain spaces if python is installed in Program Files. llvm's python substitution adds extra quotes in order to account for this case, but mlir's own python substitution does not
- the location of the shared libraries is different on windows
- if the type is not specified for numpy arrays, they appear to be treated as strings
I've implemented the smallest possible changes for each of these in the patch, but I would actually prefer a slightly more comprehensive fix for the python_executable and the shared libraries.
For the python substitution, I think it makes sense to leverage the existing %python instead of adding %PYTHON and instead add a new variable for the case when preloading is needed. This would also make it clearer which tests are which and should be skipped on platforms where the preloading won't work.
For the shared libraries, I think it would make sense to pass the correct path and extension (possibly even the names) to the python script since these are known by lit and don't have to be hardcoded in the test at all.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D125122
This patch fixed the padding size calculation for Conv2d ops when the stride > 1. It contains the changes below:
- Use addBound to add constraint for AffineApplyOp in getUpperBoundForIndex. So the result value can be mapped and retrieved later.
- Fixed the bound from AffineMinOp by adding as a closed bound. Originally the bound was added as an open upper bound, which results in the incorrect bounds when we multiply the values. For example:
```
%0 = affine.min affine_map<()[s0] -> (4, -s0 + 11)>()[iv0]
%1 = affine.apply affine_map<()[s0] -> (s0 * 2)>()[%0]
If we add the affine.min as an open bound, addBound will internally transform it into the close bound "%0 <= 3". The following sliceBounds will derive the bound of %1 as "%1 <= 6" and return the open bound "%1 < 7", while the correct bound should be "%1 <= 8".
```
- In addition to addBound, I also changed sliceBounds to support returning closed upper bound, since for the size computation, we usually care about the closed bounds.
- Change the getUpperBoundForIndex to favor constant bounds when required. The sliceBounds will return a tighter but non-constant bounds, which can't be used for padding. The constantRequired option requires getUpperBoundForIndex to get the constant bounds when possible.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124821
This patch augments the `tensor-bufferize` pass by adding a conversion
rule to translate ReshapeOp from the `tensor` dialect to the `memref`
dialect, in addition to adding a unit test to validate the translation.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D125031
libm doesn't have overloads for the small types, so promote them to a
bigger type and use the f32 function.
Differential Revision: https://reviews.llvm.org/D125093
Adds missing logic in the lowering from NvGPU to NVVM to support fp32
(in an accumulator operand) and tf32 (in multiplicand operand) types.
Fixes logic in one of the helper functions for converting the result
of a mma.sync operation with multiple 8x256bit output tiles, which is
the case for f32 outputs.
Differential Revision: https://reviews.llvm.org/D124533
All llvm-project fuzzers use this library to parse command-line arguments.
Many of them don't deal with LLVM IR or modules in any way. Bundling those
functions in one library forces build dependencies that don't need to be there.
Among other things, this means check-clang-pseudo no longer depends on most of
LLVM.
Differential Revision: https://reviews.llvm.org/D125081
This was leftover from when the standard dialect was destroyed, and
when FuncOp moved to the func dialect. Now that these transitions
have settled a bit we can drop these.
Most updates were handled using a simple regex: replace `^( *)func` with `$1func.func`
Differential Revision: https://reviews.llvm.org/D124146
The LLVM ThreadPool recently got the addition of the concept of
ThreadPoolTaskGroup: this is a way to "partition" the threadpool
into a group of tasks and enable nested parallelism through this
grouping at every level of nesting.
We make use of this feature in MLIR threading abstraction to fix a long
lasting TODO and enable nested parallelism.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124902
https://reviews.llvm.org/D124075 causes MLIR to no longer build
when using make rather than ninja, due to a tablegen-generated
header being used before it is created.
It seems that this is related to the use of LLVM_ENABLE_OBJLIB when
using add_tablgen with a non-Ninja/Xcode generator. In that case an
intermediate objlib target is generated.
This patch fixes the issue by a) declaring dependencies in
add_tablegen for mlir-pdll and b) making sure those dependencies
are added to the objlib target.
Differential Revision: https://reviews.llvm.org/D125010
Ops that are created during the bufferization were not analyzed (when run with One-Shot Bufferize), and users should instead create memref ops directly.
Futhermore, this fixes an issue where an op was erased (and put on the `erasedOps` list), but subsequently a new tensor op was created at the same memory location. This op was then not bufferized. Disallowing the creation of new tensor ops simplifies the bufferization and fixes such issues.
Differential Revision: https://reviews.llvm.org/D125017
This follows the same implementation strategy as scf::ForOp and common functionality is extracted into helper functions.
This implementation works well in cases where each yielded value (from either body/condition region) is equivalent to the corresponding bbArg of the parent block. In that case, each OpResult of the loop may be aliasing with the corresponding OpOperand of the loop (and with no other OpOperand).
In the absence of said equivalence relationship, new buffer copies must be inserted, so that the aliasing OpOperand/OpResult contract of scf::WhileOp is honored. In essence, by yielding a newly allocated buffer, we can enforce the specified may-alias relationship. (Newly allocated buffers cannot alias with any OpOperands of the loop.)
Differential Revision: https://reviews.llvm.org/D124929
A large DenseElementsAttr of i1could trigger a bug in printer/parser roundtrip.
Ex. A DenseElementsAttr of i1 with 200 elements will print as Hex format of length 400 before the fix. However, when parsing the printed text, an error will be triggered. After fix, the printed length will be 50.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D122925
Although we now have semi-rings to deal with arbitrary ops,
it is still good to convey zero-preserving semantics of
ops to the sparse compiler.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D125043
The fallback attribute parse path is parsing a Type attribute, but this results
in a really unintuitive error message: `expected non-function type`, which
doesn't really hint at tall that we were trying to parse an attribute. This
commit fixes this by trying to optionally parse a type, and on failure
emitting an error that we were expecting an attribute.
Differential Revision: https://reviews.llvm.org/D124870
The names of the functions that are supposed to be exported do not match the implementations. This is due in part to cac7aabbd8.
This change makes the implementations and declarations match and adds a couple missing declarations.
The new names follow the pattern of the existing `verify` functions where the prefix is maintained as `_mlir_ciface_` but the suffix follows the new naming convention.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124891
The NVVM dialect test coverage for all possible type/shape combinations
in the `nvvm.mma.sync` op is mostly complete. However, there were tests
missing for TF32 datatype support. This change adds tests for the one
relevant shape/type combination. This uncovered a small bug in the op
verifier, which this change also fixes.
Differential Revision: https://reviews.llvm.org/D124975
The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands.
Differential Revision: https://reviews.llvm.org/D124934
Inside processInstruction, we assign the translated mlir::Value to a
reference previously taken from the corresponding entry in instMap.
However, instMap (a DenseMap) might resize after the entry reference was
taken, rendering the assignment useless since it's assigning to a
dangling reference. Here is a (pseudo) snippet that shows the concept:
```
// inst has type llvm::Instruction *
Value &v = instMap[inst];
...
// op is one of the operands of inst, has type llvm::Value *
processValue(op);
// instMap resizes inside processValue
...
translatedValue = b.createOp<Foo>(...);
// v is already a dangling reference at this point!
// The following assignment is bogus.
v = translatedValue;
```
Nevertheless, after we stop caching llvm::Constant into instMap, there
is only one case that can cause processValue to resize instMap: If the
operand is a llvm::ConstantExpr. In which case we will insert the
derived llvm::Instruction into instMap.
To trigger instMap to resize, which is a DenseMap, the threshold depends
on the ratio between # of map entries and # of (hash) buckets. More specifically,
it resizes if (# of map entries / # of buckets) >= 0.75.
In this case # of map entries is equal to # of LLVM instructions, and # of
buckets is the power-of-two upperbound of # of map entries. Thus, eventually
in the attaching test case (test/Target/LLVMIR/Import/incorrect-instmap-assignment.ll),
we picked 96 and 128 for the # of map entries and # of buckets, respectively.
(We can't pick numbers that are too small since DenseMap used inlined
storage for small number of entries). Therefore, the ConstantExpr in the
said test case (i.e. a GEP) is the 96-th llvm::Value cached into the
instMap, triggering the issue we're discussing here on its enclosing
instruction (i.e. a load).
This patch fixes this issue by calling `operator[]` everytime we need to
update an entry.
Differential Revision: https://reviews.llvm.org/D124627
Support int8, int16, int32 and int32. Also fix source code format in mlir_pytaco_utils.py.
Add tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D124925
This commit relaxes the rules around ops that define a value but do not specify the tensor's contents. (The only such op at the moment is init_tensor.)
When such a tensor is written in a loop, it should not cause out-of-place bufferization.
Differential Revision: https://reviews.llvm.org/D124849
This will enable our usual set of element types in external
environments, such as PyTACO support.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D124875
There are only a couple of warnings when compiling with VS on Windows. This fixes the last remaining warnings so that we can enable LLVM_ENABLE_WERROR on the mlir windows bot.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124862
Adding lowering for Unary and Binary required several changes due to
their unique nature of containing custom code for different "regions"
of the sparse structure being operated on. Along with a Kind, a pointer
to the Operation is passed along to be merged once the lattice
structure is figured out.
The original operation is maintained, as it is required for subsequent
lattice decisions. However, sparse_tensor.binary has some branches
are considered as fully handled and therefore are marked with as
kBinaryBranch to distinguish them.
A unique aspect of the custom code is that sometimes the desired result
is no result at all -- i.e. a user wants overlapping sparse entries to
become empty in the output. The solution to this is to return an
uninitialized Value(), which is checked and handled elsewhere in the
code and results in nothing being written to the output tensor for that
case.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D123057
It is very wrong if the ranges can't be infered. It's also checked in
verifyStructuredOpInterface, so we don't need the Optional return type.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D124596
Add the mechanism for TransformState extensions to update the mapping between
Transform IR values and Payload IR operations held by the state. The mechanism
is intentionally restrictive, similarly to how results of the transform op are
handled.
Introduce test ops that exercise a simple extension that maintains information
across the application of multiple transform ops.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D124778
This patch add supports for translating FCmp and more kinds of FP
constants in addition to 32 & 64-bit ones. However, we can't express
ppc_fp128 constants right now because the semantics for its underlying
APFloat is `S_PPCDoubleDouble` but mlir::FloatType doesn't support such
semantics right now.
Differential Revision: https://reviews.llvm.org/D124630
When CRunnerUtils included together with MLIR IR headers, it can lead to compilation errors.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D124744
This patch restricts the value of `if` clause expression to an I1 value.
It also restricts the value of `num_threads` clause expression to an I32
value.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D124142
This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:
```mlir
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
for (Operation *op : ...) {
auto specific = dyn_cast<function_traits<FnTy>::template arg_t<0>>(op);
if (specific)
lambda(specific);
}
}
```
that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124675
The current implementation uses a discrete "pdl_interp.inferred_types"
operation, which acts as a "fake" handle to a type range. This op is
used as a signal to pdl_interp.create_operation that types should be
inferred. This is terribly awkward and clunky though:
* This op doesn't have a byte code representation, and its conversion
to bytecode kind of assumes that it is only used in a certain way. The
current lowering is also broken and seemingly untested.
* Given that this is a different operation, it gives off the assumption
that it can be used multiple times, or that after the first use
the value contains the inferred types. This isn't the case though,
the resultant type range can never actually be used as a type range.
This commit refactors the representation by removing the discrete
InferredTypesOp, and instead adds a UnitAttr to
pdl_interp.CreateOperation that signals when the created operations
should infer their types. This leads to a much much cleaner abstraction,
a more optimal bytecode lowering, and also allows for better error
handling and diagnostics when a created operation doesn't actually
support type inferrence.
Differential Revision: https://reviews.llvm.org/D124587
MLIR has a common pattern for "arguments" that uses syntax
like `%x : i32 {attrs} loc("sourceloc")` which is implemented
in adhoc ways throughout the codebase. The approach this uses
is verbose (because it is implemented with parallel arrays) and
inconsistent (e.g. lots of things drop source location info).
Solve this by introducing OpAsmParser::Argument and make addRegion
(which sets up BlockArguments for the region) take it. Convert the
world to propagating this down. This means that we correctly
capture and propagate source location information in a lot more
cases (e.g. see the affine.for testcase example), and it also
simplifies much code.
Differential Revision: https://reviews.llvm.org/D124649
We weren't properly returning the result of the constraint,
which leads to errors when actually trying to use the generated
C++.
Differential Revision: https://reviews.llvm.org/D124586
We currently aren't handling this properly, and in the case
of a string block just crash. This commit adds proper error handling
and detection for eof.
Differential Revision: https://reviews.llvm.org/D124585
SourceMgr generally uses 1-based locations, whereas the LSP is zero based.
This commit corrects this conversion and also enhances the conversion from SMLoc
to SMRange to support string tokens.
Differential Revision: https://reviews.llvm.org/D124584
We currently emit an error during verification if a pdl.operation with non-inferrable
results is used within a rewrite. This allows for catching some errors during compile
time, but is slightly broken. For one, the verification at the PDL level assumes that
all dialects have been loaded, which is true at run time, but may not be true when
the PDL is generated (such as via PDLL). This commit fixes this by not emitting the
error if the operation isn't registered, i.e. it uses the `mightHave` variant of trait/interface
methods.
Secondly, we currently don't verify when a pdl.operation has no explicit results, but the
operation being created is known to expect at least one. This commit adds a heuristic
error to detect these cases when possible and fail. We can't always capture when the user
made an error, but we can capture the most common case where the user expected an
operation to infer its result types (when it actually isn't possible).
Differential Revision: https://reviews.llvm.org/D124583
pdl.attribute currently has a syntax ambiguity that leads to the incorrect parsing
of pdl.attribute operations with locations that don't also have a constant value. For example:
```
pdl.attribute loc("foo")
```
The above IR is treated as being a pdl.attribute with a constant value containing the location,
`loc("foo")`, which is incorrect. This commit changes the syntax to use `= <constant-value>` to
clearly distinguish when the constant value is present, as opposed to just trying to parse an attribute.
Differential Revision: https://reviews.llvm.org/D124582
This allows for inferring the result types of operations in certain situations by using the type of
an operand. This commit allowed for automatically supporting type inference for many more
operations with no additional effort, e.g. nearly all Arithmetic operations now support
result type inferrence with no additional changes.
Differential Revision: https://reviews.llvm.org/D124581
This allows for using attribute types in result type inference for use with
InferTypeOpInterface. This was a TODO before, but it isn't much
additional work to properly support this. After this commit,
arith::ConstantOp can now have its InferTypeOpInterface implementation automatically
generated.
Differential Revision: https://reviews.llvm.org/D124580
I would ideally like to eliminate 'requiredOperandCount' as a bit of
verification that should be in the client side, but it is much more
widely used than I expected. Just tidy some pieces up around it given
we can't drop it immediately.
NFC.
Differential Revision: https://reviews.llvm.org/D124629
tree-sitter grammar file that tries to closely matches LangRef (it could use
some tweaking and cleanup, but kept fairly basic). Also updated LangRef in
places where found some issues while doing the nearly direct transcription.
This only adds a grammar file, not all the other parts (npm etc) that
accompanies it. Those I'll propose for separate repo like we do for vscode
extension.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124352
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).
Unfortunately the implementation has two problems:
1) It didn't actually check for the result number and reject
it. parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
operand parsing. This also was functionally identical, but
also had some subtle differences (e.g. the parseOptional
stuff had a different result type).
I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand. This keeps the
codepaths unified and adds the missing error checks.
Differential Revision: https://reviews.llvm.org/D124470
This adds a cast operation that allows to perform an explicit type
conversion. The cast op is emitted as a C-style cast. It can be applied
to integer, float, index and EmitC types.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D123514
Fordbids to express pointer via the `!emitc.opaque` type. Point the user
to use the `!emitc.ptr` type instead.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D124002
Per SPIR-V validation rules, explict layout decorations are only
needed for StorageBuffer, PhysicalStorageBuffer, Uniform, and
PushConstant storage classes. (And even that is for Shader
capabilities). So we don't need such decorations on the rest.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124543
Current index value generation uses fixed-length vector ops, this patch
adds an alterantive codegen path compatible with scalable vectors by
using `LLVM::StepVectorOp`.
Differential Revision: https://reviews.llvm.org/D124454
By using a shared index pool, we reduce the footprint of each "Element"
in the COO scheme and, in addition, reduce the overhead of allocating
indices (trading many allocations of vectors for allocations in a single
vector only). When the capacity is known, this means *all* allocation
can be done in advance.
This is a big win. For example, reading matrix SK-2005, with dimensions
50,636,154 x 50,636,154 and 1,949,412,601 nonzero elements improves
as follows (time in ms), or about 3.5x faster overall
```
SK-2005 before after speedup
---------------------------------------------
read 305,086.65 180,318.12 1.69
sort 2,836,096.23 510,492.87 5.56
pack 364,485.67 312,009.96 1.17
---------------------------------------------
TOTAL 3,505,668.56 1,002,820.95 3.50
```
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D124502
Constants in MLIR are not globally unique, unlike that in LLVM IR.
Therefore, reusing previous-translated constants might cause the user
operations not being dominated by the constant (because the
previous-translated ones can be placed in arbitrary place)
This indeed misses some opportunities where we actually can reuse a
previous-translated constants, but verbosity is not our first priority
here.
Differential Revision: https://reviews.llvm.org/D124404
More specifically, the llvm::Instruction generated by
llvm::ConstantExpr::getAsInstruction. Such Instruction will be deleted
right away, but it's possible that when getAsInstruction is called
again, it will create a new Instruction that has the same address with
the one we just deleted. Thus, we shouldn't keep it in the `instMap` to
avoid a conflicting index that triggers an assertion in
processInstruction.
Differential Revision: https://reviews.llvm.org/D124402
And move importer test files from `test/Target/LLVMIR` into
`test/Target/LLVMIR/Import`.
We simply translate struct-type ConstantAggregate(Zero) into a
serious of `llvm.insertvalue` operations against a `llvm.undef` root.
Note that this doesn't affect the original logics on translating
vector/array-type ConstantAggregate values.
Differential Revision: https://reviews.llvm.org/D124399
This is necessary to handle conversions of operations defined at runtime in extensible dialects.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124353
Only supports addition and multiplication for now; other cases
to be implemented.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124380
`index` type is converted to `i32` in SPIR-V. This is fine to
support for all signed/unsigned ops.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124451
Depends on D104534
Add support for extensible dialects, which are dialects that can be
extended at runtime with new operations and types.
These operations and types cannot at the moment implement traits
or interfaces.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D104554
This allows for providing completion results for include directive
file paths by searching the set of include directories for the current
file.
Differential Revision: https://reviews.llvm.org/D124112
This allows for navigating to included files on click, and also provides hover
information about the include file (similarly to clangd).
Differential Revision: https://reviews.llvm.org/D124077
The compilation database acts in a similar way to the compilation database
(compile_commands.json) used by clang-tidy, i.e. it provides additional
information about the compilation of project files to help the language
server. The main piece of information provided by the PDLL compilation
database in this commit is the set of include directories used when processing
the input .pdll file. This allows for the server to properly process .pdll files
that use includes anchored by the include directories set up in the build system.
The structure of the textual form of a compilation database is a yaml file
containing documents of the following form:
```
--- !FileInfo:
filepath: <string> - Absolute file path of the file.
includes: <string> - Semi-colon delimited list of include directories.
```
This commit also adds support to cmake for automatically generating
a `pdll_compile_commands.yml` file at the top-level of the build
directory.
Differential Revision: https://reviews.llvm.org/D124076
This essentially sets up mlir-pdll to function in a similar manner to mlir-tblgen. Aside
from the boilerplate of configuring CMake and setting up a basic initial test, two new
options are added to mlir-pdll to mirror options provided by tblgen:
* -d
This option generates a dependency file (i.e. a set of build time dependencies) while
processing the input file.
* --write-if-changed
This option only writes to the output file if the data would have changed, which for
the build system prevents unnecesarry rebuilds if the file was touched but not actually
changed.
Differential Revision: https://reviews.llvm.org/D124075
In the case of anonymous defs this may return the name of the base def class,
which can lead to two different defs with the same name (which hits an assert).
This commit adds a new `getUniqueDefName` method that returns a unique name
for the constraint.
Differential Revision: https://reviews.llvm.org/D124074
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems:
1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block.
2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator.
3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again.
4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch.
With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback.
Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block.
Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D118409
This change borrows the ideas from `computeExpanded/CollapsedLayoutMap`
and computes the dynamic strides at runtime for the memref descriptors.
Differential Revision: https://reviews.llvm.org/D124001
This commit adds the visitNonControlFlowArguments method to
DataFlowAnalysis, allowing analyses to provide lattice values for the
arguments to a RegionSuccessor block that aren't directly tied to an
op's inputs. For example, integer range interface can use this method
to infer bounds for the step values in loops.
This method has a default implementation that keeps the old behavior
of assigning a pessimistic fixedpoint state to all such arguments.
Reviewed By: Mogball, rriddle
Differential Revision: https://reviews.llvm.org/D124021
This diff causes mlir-tblgen to generate code for an additional builder for an
operation argument with a return type that can be inferred *AND* an attribute in
the argument list can be "unwrapped." (Previously, the unwrapped build function
was only generated for builders with explicit return types in separate or
aggregate form.) As an example, this builder might be used by code that creates
operations that implement the `SameOperandsAndResultType` interface. A test case
was created.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D124043
This diff allows the EnumAttr class to be used for bit enum attributes (in
addition to previously supported integer enum attributes). While integer
and bit enum attributes share many common implementation aspects, parsing
bit enum values requires a separate implementation. This is accomplished
by creating empty parser and printer strings in the EnumAttrInfo record,
and having derived classes (specific to bit and integer enums) override with
an appropriate parser/printer string.
To support existing bit enums that may use a vertical bar separator, the
parser is modified to support the | token.
Tests were added for bit enums alongside integer enums.
Future diffs for fastmath attributes in the arithmetic dialect will use these
changes.
(resubmission of earlier abaondoned diff, updated to reflect subsequent changes
in the repository)
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D123880
This diff introduces a tablegen field for bit enum attributes
(`printBitEnumPrimaryGroups`) to control printing when the enum uses "group"
cases. An example would be an implementation that uses a `fastmath` enum value
as an alias for individual fastmath flags. The proposed field would allow
printing of simply `fast` for the enum value, instead of the more verbose list
that would include `fast` as well as the individual flags (e.g. `reassoc,nnan,
ninf,nsz,arcp,contract,afn,fast`).
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D123871
The verifier of llvm.mlir.addressof did not properly account for opaque pointers, that is, the pointer type not having an element type equal to the type of the referenced global or function. This patch fixes that by skipping the test for the element type if the pointer is opaque.
Differential Revision: https://reviews.llvm.org/D124333
After https://reviews.llvm.org/D119743 added the `AutomaticAllocationScope`
trait to loop-like constructs, the vector transfer full/partial splitting pass
started inserting allocations for temporaries within the closest loop rather
than the closest function (or other allocation scope such as `async.execute`).
While this is correct as long as the lowered code takes care of automatic
deallocation at the end of each iteration of the loop, this interferes with
downstream optimizations that expect `alloca`s to be at the function level.
Step over loops when looking for the closest allocation scope in vector
transfer full/partial splitting pass thus restoring the original behavior.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124366
This is likely preferable to having it crash if one were to specify an opaque pointer type, and the actual element type is unused either way.
Differential Revision: https://reviews.llvm.org/D124334
The SparseTensor passes currently use opaque numbers for the CLI, despite using an enum internally. This patch exposes the enums instead of numbered items that are matched back to the enum.
Fixes GitHub issue #53389
Reviewed by: aartbik, mehdi_amini
Differential Revision: https://reviews.llvm.org/D123876
Run `one-shot-bufferize` instead of `linalg-comprehensive-module-bufferize` and move some test cases to their respective dialects.
Differential Revision: https://reviews.llvm.org/D124323
Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.
Differential Revision: https://reviews.llvm.org/D124298
By generating in the .h file, we were forcing dialects to include
a lot of additional header files because:
* Fields of the dialect, e.g. std::unique_ptr<>, were unable to use
forward declarations.
* Dependent dialects are loaded in the constructor, requiring the
full definition of each dependent dialect (which, depending on
the file structure of the dialect, may include the operations).
By generating in the .cpp we get much faster builds, and also
better align with the rest of the code base.
Fixes#55044
Differential Revision: https://reviews.llvm.org/D124297
As a fallback mechanism, if no entry was supplied for a given address space, the size or alignment for a pointer type with the default address space is returned instead.
This code currently crashes with opaque pointers, as it tries to construct a typed pointer type from the opaque pointer type, leading to a null pointer dereference when fetching the element type.
This patch fixes the issue by handling the opaque pointer cases explicitly.
Differential Revision: https://reviews.llvm.org/D124290
Using opaque pointers in function signatures leads to an attempt to recursively convert all types, including sub types in LLVM types. In the case of LLVM pointers, it may not have a subtype aka element type if it is opaque which would then lead to a null pointer dereference.
Differential Revision: https://reviews.llvm.org/D124291
This change fixes `CollapsedLayoutMap` for cases where the collapsed
dims are size 1. The cases where inner most dims are size 1 and
noncontiguous can be represented by the strided form and therefore can
be allowed. For such cases, the new stride should be of the next entry
in an association whose dimension is not size 1. If the next entry is
dynamic, it's not possible to decide which stride to use at compilation
time and the stride is set to dynamic.
Differential Revision: https://reviews.llvm.org/D124137