The closing parenthesis needs to be consumed when a NaN
with parenthesized (ignored) information is read on the
real input path that preprocesses input characters before
passing them to the decimal-to-binary converter.
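A minimal sketch of the parsing shape involved (hypothetical code for illustration, not the actual flang runtime implementation):
```
// Hypothetical sketch: after reading "NAN", skip an optional
// parenthesized payload *including* the closing parenthesis.
void SkipNaNPayload(const char *&p) {
  if (*p == '(') {
    ++p;       // consume '('
    while (*p && *p != ')')
      ++p;     // ignore the payload characters
    if (*p == ')')
      ++p;     // the fix: consume ')' too, so it is not forwarded
               // to the decimal-to-binary converter
  }
}
```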
Differential Revision: https://reviews.llvm.org/D125048
Add an attribute to allow generating the intrinsic version of async copy,
producing a copy that bypasses L1. This corresponds to
cp.async.cg.shared.global in PTX.
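For reference, a sketch of what such an L1-bypassing copy looks like when written by hand in CUDA C++ with inline PTX (an assumed illustration, not the code the attribute generates):
```
// Per the PTX ISA, cp.async.cg only supports 16-byte copies and bypasses
// L1, while cp.async.ca (which caches in L1) also allows 4 and 8 bytes.
__device__ void async_copy_cg(void *smem_dst, const void *gmem_src) {
  unsigned smem_addr =
      static_cast<unsigned>(__cvta_generic_to_shared(smem_dst));
  asm volatile("cp.async.cg.shared.global [%0], [%1], 16;\n"
               :
               : "r"(smem_addr), "l"(gmem_src));
}
```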
Differential Revision: https://reviews.llvm.org/D125241
Fix the warning
warning: 'polly::ScopViewer' has virtual functions but non-virtual destructor [-Wnon-virtual-dtor]
for this and several other classes by adding virtual destructors.
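The fix has the usual shape (an illustrative sketch, not the exact Polly code):
```
struct ScopViewerLike {            // stand-in for polly::ScopViewer et al.
  virtual void processScop() {}    // has virtual functions...
  virtual ~ScopViewerLike() = default;  // ...so add a virtual destructor,
                                        // silencing -Wnon-virtual-dtor
};
```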
Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D123677
In supporting opaque pointers we need to re-materialize typed pointers
in bitcode emission. Because of how the value-enumerator pre-allocates
types and instructions, we need to insert some no-op bitcasts in the
places where we'll need bitcasts for the pointer types.
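Roughly the shape of the workaround (an assumed sketch using the public Instruction API, not the exact patch code):
```
#include "llvm/IR/Instructions.h"

// Create a same-type ("no-op") bitcast so the ValueEnumerator reserves a
// slot that bitcode emission can later use for the typed pointer. Note
// that IRBuilder::CreateBitCast would fold a same-type cast away, so the
// instruction is constructed directly.
llvm::Value *insertNoopBitcast(llvm::Value *V, llvm::Instruction *InsertPt) {
  return new llvm::BitCastInst(V, V->getType(), V->getName() + ".noop",
                               InsertPt);
}
```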
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D122269
There are a couple of issues with the Python bindings on Windows:
- `create_symlink` requires special permissions on Windows; using `copy_if_different` instead allows the build to complete and then be usable
- the path to the `python_executable` is likely to contain spaces if Python is installed in Program Files. LLVM's Python substitution adds extra quotes in order to account for this case, but MLIR's own Python substitution does not
- the location of the shared libraries is different on Windows
- if the type is not specified for numpy arrays, they appear to be treated as strings
I've implemented the smallest possible changes for each of these in the patch, but I would actually prefer a slightly more comprehensive fix for the python_executable and the shared libraries.
For the Python substitution, I think it makes sense to leverage the existing %python instead of adding %PYTHON, and to add a new variable for the case when preloading is needed. This would also make it clearer which tests are which and which should be skipped on platforms where the preloading won't work.
For the shared libraries, I think it would make sense to pass the correct path and extension (possibly even the names) to the python script since these are known by lit and don't have to be hardcoded in the test at all.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D125122
This patch clarifies the semantics of ADDCARRY/SUBCARRY, specifically
stating that both the incoming and outgoing carries are active high.
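In other words (a small illustrative model, not DAG code): with active-high carries, an incoming carry of 1 adds one, and the outgoing carry is 1 exactly when the addition wraps.
```
#include <cstdint>
#include <utility>

// Model of ADDCARRY on 32 bits with both carries active high,
// i.e. 1 means "carry present".
std::pair<uint32_t, uint32_t> addcarry(uint32_t a, uint32_t b,
                                       uint32_t carry_in) {
  uint64_t wide = uint64_t(a) + uint64_t(b) + carry_in;
  return {uint32_t(wide), uint32_t(wide >> 32)};  // {sum, carry_out}
}
```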
Differential Revision: https://reviews.llvm.org/D125130
The DAG node for the Test Data Class is defined using i64 as the second parameter.
However, the code to lower is_fpclass uses `i32` as the type. This only works because no
type check is generated in the DAG matcher.
This PR changes the type of the mask constant to `i64`.
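A sketch of the shape of the fix (assumed, not the literal patch code):
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Build the Test Data Class mask constant with the type the DAG node
// declares (i64) instead of i32.
SDValue getTDCMaskValue(SelectionDAG &DAG, const SDLoc &DL, uint64_t TDCMask) {
  return DAG.getConstant(TDCMask, DL, MVT::i64);  // was MVT::i32
}
```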
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D125230
Matches the error message we emit with `-opt -O# --passes=foo`.
Otherwise we crash later on.
Makes #55320 much less confusing.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D125196
We can try to vectorize a number of stores smaller than MinVecRegSize
/ scalar_value_size if it is allowed by the target. This gives an extra
opportunity for vectorization.
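For instance (a made-up example of the kind of pattern this enables; profitability is still up to the target):
```
// Two adjacent 32-bit stores can now become a single <2 x float> store
// even though 64 bits is below MinVecRegSize.
void storePair(float *p, float a, float b) {
  p[0] = a;
  p[1] = b;
}
```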
Fixes PR54985.
Differential Revision: https://reviews.llvm.org/D124284
It should be useful for clang-fuzzer itself, though my own motivation is
to use this in fuzzing clang-pseudo (clang-tools-extra/pseudo/fuzzer).
Differential Revision: https://reviews.llvm.org/D125166
These are aimed at a possible miscompile spotted in the vmv.s.x/f mutation case, but it appears this is a latent bug. Or at least, I haven't been able to construct a case with compatible policy flags via intrinsics.
These masked instructions can be converted to unmasked forms by the
post-process DAG step.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125239
This patch fixes the padding size calculation for Conv2d ops when the stride > 1. It contains the changes below:
- Use addBound to add constraint for AffineApplyOp in getUpperBoundForIndex. So the result value can be mapped and retrieved later.
- Fixed the bound from AffineMinOp by adding it as a closed bound. Originally the bound was added as an open upper bound, which results in incorrect bounds when we multiply the values. For example:
```
%0 = affine.min affine_map<()[s0] -> (4, -s0 + 11)>()[%iv0]
%1 = affine.apply affine_map<()[s0] -> (s0 * 2)>()[%0]
```
If we add the affine.min as an open bound, addBound will internally transform it into the closed bound "%0 <= 3". The following sliceBounds will then derive the bound of %1 as "%1 <= 6" and return the open bound "%1 < 7", while the correct bound should be "%1 <= 8".
- In addition to addBound, I also changed sliceBounds to support returning closed upper bounds, since for the size computation we usually care about closed bounds.
- Changed getUpperBoundForIndex to favor constant bounds when required. sliceBounds may return a tighter but non-constant bound, which can't be used for padding. The constantRequired option makes getUpperBoundForIndex return a constant bound when possible.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124821
This patch augments the `tensor-bufferize` pass by adding a conversion
rule to translate ReshapeOp from the `tensor` dialect to the `memref`
dialect, in addition to adding a unit test to validate the translation.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D125031
Adding the `-ignorelist` option that may eventually replace `-blacklist`.
With this patch `sancov` accepts both options.
Reviewed By: quinnp
Differential Revision: https://reviews.llvm.org/D113514
This flag is added by clang::driver::tools::addLTOOptions() and was causing
errors for me when building the llvm-test-suite repository with LTO and
-DTEST_SUITE_COLLECT_STATS=ON. This replaces the --stats-file= option
added in 1c04b52b25 since the flag is only
used for LTO and should therefore be in the -plugin-opt= namespace.
Additionally, this commit fixes the `REQUIRES: asserts` that was added in
948d05324a150a5a24e93bad07c9090d5b8bd129: the feature was never defined in
the lld test suite so it effectively disabled the test.
Reviewed By: MaskRay, MTC
Differential Revision: https://reviews.llvm.org/D124105
Need to use the actual index instead of the tree entry position, since the
insert index may be different from 0. This means that we vectorized part
of the buildvector starting from an insertelement instruction other than
the first one, for some reason.
Fold %x umin_seq %y to %x if %x ule %y. This also subsumes the
special handling for constant operands: if %y is constant, this
folds to umin via implied-poison reasoning; and if %x is constant,
then either %x is non-zero and it folds to umin, or it is known
zero, in which case it is ule anything.
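Concretely, a scalar model of the sequential form (a sketch under the usual definition; poison shielding aside, it matches umin except that x == 0 short-circuits without evaluating y):
```
#include <algorithm>

unsigned umin_seq(unsigned x, unsigned y) {
  return x == 0 ? 0 : std::min(x, y);
}
// If x ule y then umin(x, y) == x, and for x == 0 the result is 0 == x,
// so x umin_seq y folds to x whenever x ule y holds.
```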
Ensures an -Wenum-conversion warning happens when one of the enums is
signed and the other is unsigned. Also adds a test file to verify these
warnings.
Previously this warning would not happen because the -Wsign-conversion
code would emit a diagnostic and then return, never reaching the
-Wenum-conversion checks.
For example:
```
enum PE { P = -1 };
enum NE { N };
enum NE conv(enum PE E) { return E; }
```
Before, this would only create a diagnostic with -Wsign-conversion and
never with -Wenum-conversion. Now it creates a diagnostic for both
-Wsign-conversion and -Wenum-conversion.
I could have changed it to warn only on -Wenum-conversion, as I
initially did. But in light of PR35200 (or GitHub Issue 316268), I let
both checks run so that the sign conversion can also generate a warning.
We know we're going to overwrite it anyway.
It'd be a bit of work to coordinate not generating it at all, but setting this
flag avoids generating ~10k of the 13k-character string.
Differential Revision: https://reviews.llvm.org/D125180
This patch restores the symmetry between how operator new and operator delete
are handled by also inlining the content of operator delete when possible.
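As an assumed illustration of the effect (made-up code, not from the patch):
```
#include <new>

// With this change the analyzer can inline the body of a user-provided
// operator delete, mirroring what it already did for operator new.
struct S {
  static void *operator new(std::size_t n) { return ::operator new(n); }
  static void operator delete(void *p) { ::operator delete(p); }
};

void roundTrip() {
  S *s = new S;  // operator new's body was already inlined when possible
  delete s;      // operator delete's body is now inlined symmetrically
}
```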
Patch by Fred Tingaud.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D124845
Like regular assignment, compound assignment operators can be assumed to
write to their left-hand side operand. So we strengthen the requirements
there. (Previously only the default read access had been required.)
Just like operator->, operator->* can also be assumed to dereference the
left-hand side argument, so we require read access to the pointee. This
will generate new warnings if the left-hand side has a pt_guarded_by
attribute. This overload is rarely used, but it was trivial to add, so
why not. (Supporting the builtin operator requires changes to the TIL.)
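For example, a sketch of the strengthened compound-assignment requirement (the macro stand-ins follow the -Wthread-safety docs; the classes below are made up):
```
#define CAPABILITY(x) __attribute__((capability(x)))
#define GUARDED_BY(x) __attribute__((guarded_by(x)))

class CAPABILITY("mutex") Mutex {};

class Counter {
public:
  Counter &operator+=(int);  // compound assignment: now assumed to write
                             // to its left-hand side operand
};

Mutex mu;
Counter c GUARDED_BY(mu);

void bump() {
  c += 1;  // now warns: writing 'c' requires holding 'mu' exclusively
           // (previously only the default read access was required)
}
```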
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D124966
This is unnecessary work.
With this change, we skip generating and lexing ~10k of predefines twice.
A dumb benchmark of building a preamble for an empty file in a loop shows:
- before: 1.90ms/run
- after: 1.36ms/run
So this should be worth 0.5ms for each AST build and code completion.
There can be a functional difference, but it's very minor.
If the preamble contains e.g. `#ifndef __llvm__ ... #endif`, then previously we
would not take the branch. After this change we will take it (single-file mode
takes all branches with unknown conditions) and so gather different directives.
However I think this is negligible:
- this is already true of non-builtin macros (from included headers).
We've had no complaints.
- this affects the baseline and the modified preamble in the same way, so it
  only makes a difference transiently while code guarded by such an #ifdef is being edited
Differential Revision: https://reviews.llvm.org/D125179
This includes a fix for the libc++ issue I ran across with friend
declarations not properly being identified as overloads.
This reverts commit 45c07db31c.
The ifdef is not required in the header; common::int128_t is always
defined. The function declaration must be available in lowering
regardless of host int128_t support.
Differential Revision: https://reviews.llvm.org/D125211
This fixes the first of several cases where the state computed in phases 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about the contents of the VL and VTYPE registers at block boundaries.
In this particular case, we recognize that for the first vsetvli in a block, that if the AVL is a phi of GPR results from previous vsetvlis and the VTYPE field matches, we can avoid emitting a vsetvli as the register contents don't change. Unfortunately, the abstract state does change and that update was lost.
As noted in the test change, this can actually improve results by preserving information until later state transitions in the block. However, this minor codegen improvement is not the motivation for the patch. The motivation is to avoid a case where we break a key internal correctness invariant.
Differential Revision: https://reviews.llvm.org/D125133
With the demangler parenthesizing 'a >> b' inside template parameters,
because of C++11's parsing of >> there, we don't really need to add spaces
between adjacent template arg closing '>' chars. In 2022, that just
looks odd.
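For example (illustrative output only):
```
// Demangled template arguments, illustratively:
//   before: "std::vector<std::vector<int> >"
//   after:  "std::vector<std::vector<int>>"
// An actual right shift in an argument stays unambiguous because the
// demangler parenthesizes it, e.g. "Foo<(1 >> 2)>".
```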
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D123134
As suggested in 02f8519502, this uses the
isAnyConstantBuildVector method in lieu of separate
isBuildVectorOfConstantSDNodes calls. It should
otherwise be an NFC.
Fold %x umin_seq %y to %x umin %y if %x cannot be zero. They only
differ in semantics for %x==0.
More generally, %x *_seq %y folds to %x * %y if %x cannot be the
saturation value (though currently we only have umin_seq).
D117829 added the generic `__builtin_reduce_mul`, which we can use to replace the x86-specific integer mul reduction builtins. Internally these were already mapping to the same intrinsic, so no test changes are required.
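For example, a small usage sketch of the generic form (`v4si` is a made-up typedef):
```
typedef int v4si __attribute__((vector_size(16)));

// Horizontal integer multiply-reduction via the target-independent
// builtin, in place of the x86-specific reduction builtins.
int mulReduce(v4si v) { return __builtin_reduce_mul(v); }
```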
Differential Revision: https://reviews.llvm.org/D125222
When extracting the first lane of a predicate created using the
llvm.get.active.lane.mask intrinsic, it should give the same codegen as
when the predicate is created using the llvm.aarch64.sve.whilelo
intrinsic, since get.active.lane.mask is lowered to whilelo. This patch
ensures the codegen is the same by recognizing
llvm.get.active.lane.mask as a flag-setting operation in this case.
Differential Revision: https://reviews.llvm.org/D125215
Previously the EXPECT_AVAILABLE macros would rebuild the code at each marked
point, by expanding the cases textually.
There were often lots of marked points, and it's nice to have lots!
This reduces total unittest time by ~10% on my machine.
I did have to sacrifice a little apply() coverage in AddUsingTests (it was
calling expandCases directly, which was otherwise unused), but we have
EXPECT_AVAILABLE tests covering that, so I don't think there's real risk here.
Differential Revision: https://reviews.llvm.org/D125109
This is a clever cross-cutting sanity test for clang's arg parsing, I suppose.
But clangd creates thousands of invocations, ~all with identical trivial
arguments, and problems with these would be caught by clang's tests.
This overhead accounts for 10% of total unittest time!
Differential Revision: https://reviews.llvm.org/D125169