Rename the check prefixes to HOIST21 and HOIST32 to clarify the different flag configurations.
Depends On D114438
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114442
Instead of checking for unexpected operations (any operation with a region except for scf::For and `padTensorOp` or operations with a memory effect) while cloning the packing loop nest perform the checks early. Update `dropNonIndexDependencies` to check for unexpected operations. Additionally, check all of these operations have index type operands only.
Depends On D114428
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114438
Limit hoist padding to pad tensor ops that depend only on a constant value. Supporting arbitrary padding values that depend on computations part of the backward slice to hoist require complex analysis to ensure the computation can be hoisted.
Depends On D114420
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114428
Adapt hoist padding to filter the backward slice before cloning the packing loop nest. The filtering removes all operations that are not used to index the hoisted pad tensor op and its extract slice op. The filtering is needed to support the more complex loop nests created after fusion. For example, fusing the producer of an output operand can added linalg ops and pad tensor ops to the backward slice. These operations have regions and currently prevent hoisting.
The following example demonstrates the effect of the newly introduced `dropNonIndexDependencies` method that filters the backward slice:
```
%source = linalg.fill(%cst, %arg0)
scf.for %i
%unrelated = linalg.fill(%cst, %arg1) // not used to index %source!
scf.for %j (%arg2 = %unrelated)
scf.for %k // not used to index %source!
%ubi = affine.min #map(%i)
%ubj = affine.min #map(%j)
%slice = tensor.extract_slice %source [%i, %j] [%ubi, %ubj]
%padded_slice = linalg.pad_tensor %slice
```
dropNonIndexDependencies(%padded_slice, %slice)
removes [scf.for %k, linalg.fill(%cst, %arg1)] from backwardSlice.
Depends On D114175
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114420
This fixes "textDocument/prepareCallHierarchy" in clangd for ObjC methods. Details at https://github.com/clangd/vscode-clangd/issues/247.
clangd uses Decl::isFunctionOrFunctionTemplate to check if the decl given in a prepareCallHierarchy request is eligible for prepareCallHierarchy. We change to use isFunctionOrMethod which includes functions and ObjC methods.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D114058
The changes in D113888 / 32b6c17b29 altered the memory size of a
masked store, as it will store an unknown number of bytes not the full
vector size. We can have situations where the masked stores is legalized
and then turned to a normal store, as the mask is known to be all ones.
This creates a store with an unknown size MMO that was hitting this
assert.
The store created can be given a better size in a followup patch. This
currently adjusts the assert to handle unknown sizes.
From GCC's manpage:
-fplugin-arg-name-key=value
Define an argument called key with a value of value for the
plugin called name.
Since we don't have a key-value pair similar to gcc's plugin_argument
struct, simply accept key=value here anyway and pass it along as-is to
plugins.
This translates to the already existing '-plugin-arg-pluginname arg'
that clang cc1 accepts.
There is an ambiguity here because in clang, both the plugin name
as well as the option name can contain dashes, so when e.g. passing
-fplugin-arg-foo-bar-foo
it is not clear whether the plugin is foo-bar and the option is foo,
or the plugin is foo and the option is bar-foo. GCC solves this by
interpreting all dashes as part of the option name. So dashes can't be
part of the plugin name in this case.
Differential Revision: https://reviews.llvm.org/D113250
Add a helper function to ControlFlowInterfaces for checking if two ops
are in mutually exclusive regions according to RegionBranchOpInterface.
Utilize this new helper in Linalg ComprehensiveBufferize. This makes the
analysis independent of the SCF dialect and generalizes it to other ops
that implement RegionBranchOpInterface.
Differential Revision: https://reviews.llvm.org/D114220
https://bugs.llvm.org/show_bug.cgi?id=47936
Using the MultiLine setting for BraceWrapping.AfterControlStatement appears to disable AllowShortFunctionsOnASingleLine, even in cases without any control statements
Reviewed By: HazardyKnusperkeks, curdeius
Differential Revision: https://reviews.llvm.org/D114521
Rename test/python/dialects/math.py -> math_dialect.py to avoid a
collision with a Python standard package of the same name. These test
scripts are run by path and are not part of a package. Python apparently
implicitly adds the containing directory to its PYTHONPATH. As such,
test scripts with common names run the risk of conflicting with global
names and resolution of an import for the latter happens to the former.
Differential Revision: https://reviews.llvm.org/D114568
* Implement `FlatAffineConstraints::getConstantBound(EQ)`.
* Inject a simpler constraint for loops that have at most 1 iteration.
* Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`.
Differential Revision: https://reviews.llvm.org/D114138
This change is NFC. There were two issues when passing/reading upper bounds into/from FlatAffineConstraints that negate each other, so the bug was not apparent. However, it made debugging harder because some constraints in the FlatAffineConstraints were off by one when dumping all constraints.
Differential Revision: https://reviews.llvm.org/D114137
For synthesizing an op's implementation of the generated interface
from {Min|Max}Version, we need to define an `initializer` and
`mergeAction`. The `initializer` specifies the initial version,
and `mergeAction` specifies how version specifications from
different parts of the op should be merged to generate a final
version requirements.
Previously we use the specified version enum as the type for both
the initializer and thus the final return type. This means we need
to perform `static_cast` over some hopefully not used number (`~0u`)
as the initializer. This is quite opaque and sort of not guaranteed
to work. Also, there are ops that have an enum attribute where some
values declare version requirements (e.g., enumerant `B` requires
v1.1+) but some not (e.g., enumerant `A` requires nothing). Then a
concrete op instance with `A` will still declare it implements the
version interface (because interface implementation is static for
an op) but actually theirs no requirements for version.
So this commit changes to use an more explicit `llvm::Optional`
to wrap around the returned version enum. This should make it
more clear.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D108312
According to the C++ standard, the stored pointer and the stored deleter
should be value-initialized.
Differential Revision: https://reviews.llvm.org/D113612
Introduced/discussed in https://reviews.llvm.org/D38719
The header validation logic was also explicitly building the DWARFUnits
to validate. But then other calls, like "Units.getUnitForOffset" creates
the DWARFUnits again in the DWARFContext proper - so, let's avoid
creating the DWARFUnits twice by walking the DWARFContext's units rather
than building a new list explicitly.
This does reduce some verifier power - it means that any unit with a
header parsing failure won't get further validation, whereas the
verifier-created units were getting some further validation despite
invalid headers. I don't think this is a great loss/seems "right" in
some ways to me that if the header's invalid we should stop there.
Exposing the raw DWARFUnitVectors from DWARFContext feels a bit
sub-optimal, but gave simple access to the getUnitForOffset to keep the
rest of the code fairly similar.
In 1fa27f2a10, we made <filesystem>'s iterator types model concepts
from <ranges>, but we forgot to add the appropriate availability
annotations. This broke back-deployment to platforms that don't have
<filesystem> for which we have availability annotations.
For some reason, this wasn't caught by our back-deployment CI.
I believe this is due to the fact that we use a slightly older
compiler in the CI, and perhaps that compiler does not honour
our `#pragma clang attribute push` properly.
Differential Revision: https://reviews.llvm.org/D114456
This canonicalization breaks the ability to discard checks in some cases.
Add a command line option to disable it. This option is on by default,
so the change is NFC.
See for details:
https://reviews.llvm.org/D112895#3149487
Compiler has an analysis for perfect diamond matching but it does not
support nodes with main/alternate opcodes. The problem is that the
scalars themselves are different and might not match directly with other
nodes, but operands and main/alternate opcodes might match and compiler
might reuse some previously emitted vector instructions. Need to include
this analysis in the cost model and actual vector instructions emission
process.
Differential Revision: https://reviews.llvm.org/D114101
The 14.31.30818 toolset has the following in the `stdatomic.h`:
~~~
#ifndef __cplusplus
#error <stdatomic.h> is not yet supported when compiling as C, but this is planned for a future release.
#endif
~~~
This results in clang failing to build existing code which relied on
`stdatomic.h` in C mode on Windows. Simply fallback to the clang header
until that header is available as a complete implementation.
Building cfi with recent clang on a 64-bit system results in the
following warnings:
compiler-rt/lib/cfi/cfi.cpp:233:64: warning: format specifies type 'void *' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
VReport(1, "Can not handle: symtab > strtab (%p > %zx)\n", symtab, strtab);
~~ ^~~~~~
%lu
compiler-rt/lib/sanitizer_common/sanitizer_common.h:231:46: note: expanded from macro 'VReport'
if ((uptr)Verbosity() >= (level)) Report(__VA_ARGS__); \
^~~~~~~~~~~
compiler-rt/lib/cfi/cfi.cpp:253:59: warning: format specifies type 'void *' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
VReport(1, "Can not handle: symtab %p, strtab %zx\n", symtab, strtab);
~~ ^~~~~~
%lu
compiler-rt/lib/sanitizer_common/sanitizer_common.h:231:46: note: expanded from macro 'VReport'
if ((uptr)Verbosity() >= (level)) Report(__VA_ARGS__); \
^~~~~~~~~~~
Since `__sanitizer::uptr` has the same size as `size_t`, consistently
use `%z` as a printf specifier.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D114466
Almost all of the time, call instructions don't actually lead to SP being
different after they return. An exception is win32's _chkstk, which which
implements stack probes. We need to recognise that as modifying SP, so
that copies of the value are tracked as distinct vla pointers.
This patch adds a target frame-lowering hook to see whether stack probe
functions will modify the stack pointer, store that in an internal flag,
and if it's true then scan CALL instructions to see whether they're a
stack probe. If they are, recognise them as defining a new stack-pointer
value.
The added test exercises this behaviour: two calls to _chkstk should be
considered as producing two different values.
Differential Revision: https://reviews.llvm.org/D114443
The padding tests previously contained the tile loops. This revision removes the tile loops since padding itself does not consider the loops. Instead the induction variables are passed in as function arguments which promotes them to symbols in the affine expressions. Note that the pad-and-hoist.mlir test still exercises padding in the context of the full loop nest.
Depends On D114175
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114227
Add the makeComposedPadHighOp method which creates a new PadTensorOp if necessary. If the source to pad is actually the result of a sequence of padded LinalgOps, the method checks if padding is needed or if we can use the padded result of the padded LinalgOp sequence directly.
Example:
```
%0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1]
%1 = linalg.pad_tensor %0 low[0, 0] high[...] { linalg.yield %cst }
%2 = linalg.matmul ins(...) outs(%1)
%3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1]
```
when padding %3 return %2 instead of introducing
```
%4 = linalg.pad_tensor %3 low[0, 0] high[...] { linalg.yield %cst }
```
Depends On D114161
Reviewed By: nicolasvasilache, pifon2a
Differential Revision: https://reviews.llvm.org/D114175
Change the failure condition of padOperandToSmallestStaticBoundingBox to never fail if the operand is already statically sized.
In particular:
- if the padding value computation fails -> return failure if the operand shape is dynamic and success if it is static.
- if there is no extract slice op -> return failure if the operand shape is dynamic and success if it is static.
The latter change prevents padding from failure if the output operand passed by iteration argument is statically sized since in this case the extract / insert slice pairs are removed by canonicalization.
Depends On D114153
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114161
The clang portion of c933c2eb33 was missed as I made
some kind of mistake squashing the commits with git.
This patch just adds those.
The original review: https://reviews.llvm.org/D114088