With the exceptions of AttrOrTypeParameter and DerivedAttr, all of MLIR consistently uses $_ctxt as the substitute variable for the MLIRContext in TableGen C++ code.
Usually this does not matter unless one where to reuse some code in multiple fields but it is still needlessly inconsistent and prone to error.
This patch fixes that by consistently using _$ctxt everywhere.
Differential Revision: https://reviews.llvm.org/D129153
We use LLD to perform AMDGPU linking. This linker accepts some arguments
through the `-plugin-opt` facilities. These options match what `Clang`
will output when given the same input.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D128923
Depends on D128068.
Added a new test code that fails an assertion in the baseline.
That is because `getAPSIntType` works only with integral types.
Differential Revision: https://reviews.llvm.org/D126779
In `RegionStore::getBinding` we call `evalCast` unconditionally to align
the stored value's type to the one that is being queried. However, the
stored type might be the same, so we may end up having redundant
`SymbolCasts` emitted.
The solution is to check whether the `to` and `from` type are the same
in `makeNonLoc`.
Note, we can't just do type equivalence check at the beginning of `evalCast`
because when `evalCast` is called from `getBinding` then the original type
(`OriginalTy`) is not set, so one operand is missing for the comparison. In
`evalCastSubKind(nonloc::SymbolVal)` when the original type is not set,
we get the `from` type via `SymbolVal::getType()`.
Differential Revision: https://reviews.llvm.org/D128068
HLSL vector types are ext_vector types, but they are also exposed via a
template syntax `vector<T, #>`. This is morally equavalent to the code:
```c++
template <typename T, int Size>
using vector = T __attribute__((ext_vector_type(Size)))
```
The problem is that templates aren't supported before HLSL 2021, and
type aliases still aren't supported in HLSL.
To resolve this (and other issues where HLSL can't represent its own
types), we rely on an external AST & Sema source being registered for
HLSL code.
This patch adds the HLSLExternalSemaSource and registers the vector
type alias.
Depends on D127802
Differential Revision: https://reviews.llvm.org/D128012
- Applying CallOpInterface on CallOp and InvokeOp.
- Applying CallableInterface on LLVMFuncOp.
We're testing the changes using CallGraph, which uses both interfaces.
Differential Revision: https://reviews.llvm.org/D129026
This in an extension to the code added in D123911 which added vector
combine folding of shuffle-select patterns, attempting to reduce the
total amount of shuffling required in patterns like:
%x = shuffle %i1, %i2
%y = shuffle %i1, %i2
%a = binop %x, %y
%b = binop %x, %y
shuffle %a, %b, selectmask
This patch extends the handing of shuffles that are dependent on one
another, which can arise from the SLP vectorizer, as-in:
%x = shuffle %i1, %i2
%y = shuffle %x
The input shuffles can also be emitted, in which case they are treated
like identity shuffles. This patch also attempts to calculate a better
ordering of input shuffles, which can help getting lower cost input
shuffles, pushing complex shuffles further down the tree.
This is a recommit with some additional checks for supported forms and
out-of-bounds mask elements, with some extra tests.
Differential Revision: https://reviews.llvm.org/D128732
"attachment" was a temporary name chosen for the information attached to a
variable in a PresburgerSpace. After the disambiguation of "variables" and
"identifiers" in PresburgerSpace, we use the word "identifiers" for this
information, since this information is used to "identify" these variables.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D128751
This reverts commit 4e545bdb35.
The newly added test is the third infinite combine loop caused by
this change. In this case, it's a combination of the branch to
common dest and jump threading folds that keeps peeling off loop
iterations.
The core problem here is that we ideally would not thread over
loop backedges, both because it is potentially non-profitable
(it may break canonical loop structure) and because it may result
in these kinds of loops. Unfortunately, due to the lack of a
dominator tree in SimplifyCFG, there is no good way to prevent
this. While we have LoopHeaders, this is an optional structure and
we don't do a good job of keeping it up to date. It would be fine
for a profitability check, but is not suitable for a correctness
check.
So for now I'm just giving up here, as I don't see a good way to
robustly prevent infinite combine loops.
Fixes https://github.com/llvm/llvm-project/issues/56203.
The result shape of a rank-reducing subview cannot be inferred in the general case. Just the result rank is not enough. The only thing that we can infer is the layout map.
This change also improves the bufferization patterns of tensor.extract_slice and tensor.insert_slice to fully support rank-reducing operations.
Differential Revision: https://reviews.llvm.org/D129144
Modifies the GCNDPPCombine pass to enable DPP formation for the new DPP
instruction in gfx11, namely VOP3 encoded instructions with DPP and VOPC
with DPP.
Depends on D128656
Reviewed By: #amdgpu, rampitec
Differential Revision: https://reviews.llvm.org/D128682
- Extend the GLR parser to allow conditional reduction based on the
guard functions;
- Implement two simple guards (contextual-override/final) for cxx.bnf;
- layering: clangPseudoCXX depends on clangPseudo (as the guard function need
to access the TokenStream);
Differential Revision: https://reviews.llvm.org/D127448
This removes creation of udiv/sdiv/urem/srem constant expressions,
in preparation for their removal. I've added a
ConstantExpr::isDesirableBinOp() predicate to determine whether
an expression should be created for a certain operator.
With this patch, div/rem expressions can still be created through
explicit IR/bitcode, forbidding them entirely will be the next step.
Differential Revision: https://reviews.llvm.org/D128820
Treat `std::nullptr_t` as a regular scalar type to avoid tripping
assertions when analyzing code that uses `std::nullptr_t`.
Differential Revision: https://reviews.llvm.org/D129097
We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a legal VOPD, namely that the matching operands
(e.g. src0x and src0y, src1x and src1y) must be in different register
banks. We add a PostRA scheduler
mutation to put possible VOPD components back-to-back.
Depends on D128442, D128270
Reviewed By: #amdgpu, rampitec
Differential Revision: https://reviews.llvm.org/D128656
When trying to prove an implied condition on a phi by proving it
for all incoming values, we need to be careful about values coming
from a backedge, as these may refer to a previous loop iteration.
A variant of this issue was fixed in D101829, but the dominance
condition used there isn't quite right: It checks that the value
dominates the incoming block, which doesn't exclude backedges
(values defined in a loop will usually dominate the loop latch,
which is the incoming block of the backedge).
Instead, we should be checking for domination of the phi block.
Any values defined inside the loop will not dominate the loop
header phi.
Fixes https://github.com/llvm/llvm-project/issues/56242.
Differential Revision: https://reviews.llvm.org/D128640
This patch was pushed for calixte@mozilla.com
- this function (Windows only) is called when gcda are dumped on disk;
- according to its documentation, it's only useful in case of hard failures, this is highly improbable;
- it drastically decreases the time in the tests and consequently it avoids timeouts when we use slow disks.
Differential Revision: https://reviews.llvm.org/D129128
Tell the matcher what we are looking for instead of matching everything
and then discarding the result if doesn't fit.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D128171
If there are multiple predecessors that have the same condition
value (and thus same "real destination"), these were previously
handled by copying the threaded block for each predecessor.
Instead, we can reuse one block for all of them. This makes the
behavior of SimplifyCFG's jump threading match that of the
actual JumpThreading pass.
This also avoids the infinite combine loop reported in:
https://reviews.llvm.org/D124159#3624387
When doing CTU analysis setup you pre-compile .cpp to .ast and then
you run clang-extdef-mapping on the .cpp file as well. This is a
pretty slow process since we have to recompile the file each time.
With this patch you can now run clang-extdef-mapping directly on
the .ast file. That saves a lot of time.
I tried this on llvm/lib/AsmParser/Parser.cpp and running
extdef-mapping on the .cpp file took 5.4s on my machine.
While running it on the .ast file it took 2s.
This can save a lot of time for the setup phase of CTU analysis.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D128704
This patch extends the affine parser to allow affine constraints with `<=`.
This is useful in writing unittests for Presburger library and test in general.
The internal storage and printing of IntegerSet is still in the original format.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D129046
This test has some race condition which is making it hang on LLDB
Arm/AArch64 Linux buildbot. I am marking it as skipped until we
investigate whats going wrong.
Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.
Reviewed By: efocht
Differential Revision: https://reviews.llvm.org/D129034