This will make osx-specific changes easier. Because Apple Silicon Macs also run on aarch64, they were easy to confuse with iOS.
rdar://75302812
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D100157
The checks did not work in __config, since no header defining
`_NEWLIB_VERSION` was included before. This patch moves the two
checks for newlib to the headers that actually need it - and after
they already include relevant headers.
Differential Revision: https://reviews.llvm.org/D79888
SROA can handle invariant group intrinsics, so let the inliner know that
for better heuristics when the intrinsics are present.
This fixes size issues in a couple files when turning on
-fstrict-vtable-pointers in Chrome.
Reviewed By: rnk, mtrofin
Differential Revision: https://reviews.llvm.org/D100249
This patch removes all uses of `std::iterator`, which was deprecated in C++17.
While this isn't currently an issue while compiling LLVM, it's useful for those using LLVM as a library.
For some reason there're a few places that were seemingly able to use `std` functions unqualified, which no longer works after this patch. I've updated those places, but I'm not really sure why it worked in the first place.
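As a rough illustration of the kind of rewrite involved (a hypothetical iterator, not one taken from LLVM), the five member types that `std::iterator` used to supply are spelled out directly. A likely explanation for the unqualified `std` calls is argument-dependent lookup: inheriting from `std::iterator` made `std` an associated namespace of the iterator type, and dropping the base class removes that.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <type_traits>

// Hypothetical iterator, for illustration only. Before the change it might
// have been declared as:
//   class CountingIterator
//       : public std::iterator<std::input_iterator_tag, int> { ... };
class CountingIterator {
public:
  // The five member types that std::iterator used to provide.
  using iterator_category = std::input_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int *;
  using reference = const int &;

  explicit CountingIterator(int Start) : Current(Start) {}

  reference operator*() const { return Current; }
  CountingIterator &operator++() {
    ++Current;
    return *this;
  }
  CountingIterator operator++(int) {
    CountingIterator Tmp = *this;
    ++Current;
    return Tmp;
  }
  bool operator==(const CountingIterator &RHS) const {
    return Current == RHS.Current;
  }
  bool operator!=(const CountingIterator &RHS) const { return !(*this == RHS); }

private:
  int Current;
};

// Sums the half-open range [Begin, End). The call is qualified with std::
// because, without the std::iterator base, ADL no longer finds it.
inline int sumRange(int Begin, int End) {
  return std::accumulate(CountingIterator(Begin), CountingIterator(End), 0);
}
```

For example, `sumRange(1, 5)` adds 1 + 2 + 3 + 4.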
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D67586
This patch adds more optimized codegen for the above SETCC forms,
by matching the '.vi' vector forms when the immediate is a 5-bit signed
immediate plus 1. The immediate can be decremented and the corresponding
SET[U]LE or SET[U]GT forms can be matched.
This work was left as a TODO from D94168.
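A minimal sketch of the immediate adjustment described above, written as standalone C++ rather than the actual selection code. The names `CondCode`, `ViMatch` and `matchDecrementedImm` are hypothetical, and the exclusion of a zero immediate for the unsigned forms is an assumption made here so that the decrement cannot wrap around.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the condition codes involved.
enum class CondCode { SETLT, SETGE, SETULT, SETUGE, SETLE, SETGT, SETULE, SETUGT };

struct ViMatch {
  bool Matched;     // true if the compare can use a '.vi' form
  CondCode CC;      // rewritten condition code
  std::int64_t Imm; // rewritten (decremented) immediate
};

// A 5-bit signed immediate covers [-16, 15].
constexpr bool isSimm5(std::int64_t V) { return V >= -16 && V <= 15; }

// Rewrites 'x < Imm' as 'x <= Imm - 1' and 'x >= Imm' as 'x > Imm - 1'
// when Imm is "simm5 plus 1", i.e. Imm - 1 fits in 5 signed bits.
inline ViMatch matchDecrementedImm(CondCode CC, std::int64_t Imm) {
  const ViMatch NoMatch{false, CC, Imm};
  if (Imm < -15 || Imm > 16)
    return NoMatch;
  assert(isSimm5(Imm - 1));
  switch (CC) {
  case CondCode::SETLT:
    return {true, CondCode::SETLE, Imm - 1};
  case CondCode::SETGE:
    return {true, CondCode::SETGT, Imm - 1};
  case CondCode::SETULT:
  case CondCode::SETUGE:
    // Assumption: 'x <u 0' must not become 'x <=u -1', which is always true.
    if (Imm == 0)
      return NoMatch;
    return {true,
            CC == CondCode::SETULT ? CondCode::SETULE : CondCode::SETUGT,
            Imm - 1};
  default:
    return NoMatch;
  }
}
```

With this sketch, `SETLT` against 16 becomes `SETLE` against 15, while 17 is rejected because 16 does not fit in a 5-bit signed immediate.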
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100096
Symbols are now supported in the integer emptiness check. Remove some outdated assertions checking that there are no symbols.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D100327
This CL introduces a generic attribute (called "encoding") on tensors.
The attribute currently does not carry any concrete information, but the type
system already correctly determines that tensor<8xi1,123> != tensor<8xi1,321>.
The attribute will be given meaning through an interface in subsequent CLs.
See ongoing discussion on discourse:
[RFC] Introduce a sparse tensor type to core MLIR
https://llvm.discourse.group/t/rfc-introduce-a-sparse-tensor-type-to-core-mlir/2944
A sparse tensor will look something like this:
```
// named alias with all properties we hold dear:
#CSR = {
// individual named attributes
}
// actual sparse tensor type:
tensor<?x?xf64, #CSR>
```
I see the following rough 5 step plan going forward:
(1) introduce this format attribute in this CL, currently still empty
(2) introduce attribute interface that gives it "meaning", focused on sparse in first phase
(3) rewrite sparse compiler to use new type, remove linalg interface and "glue"
(4) teach passes to deal with new attribute, by rejecting/asserting on non-empty attribute as simplest solution, or doing meaningful rewrite in the longer run
(5) add FE support, document, test, publicize new features, extend "format" meaning to other domains if useful
Reviewed By: stellaraccident, bondhugula
Differential Revision: https://reviews.llvm.org/D99548
We will soon be adding non-AVX512 operations to MLIR, such as AVX's rsqrt. In https://reviews.llvm.org/D99818 several possibilities were discussed, namely to (1) add non-AVX512 ops to the AVX512 dialect, (2) add more dialects (e.g. an AVX dialect for AVX rsqrt), and (3) expand the scope of the AVX512 dialect to include these SIMD x86 ops, renaming it to something more accurate such as X86Vector.
Consensus was reached on option (3), which this patch implements.
Reviewed By: aartbik, ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100119
Fusing a constant with a linalg.generic operation can result in the
fused operation being illegal since the loop bound computation
fails. Avoid such fusions.
Differential Revision: https://reviews.llvm.org/D100272
Mark the test as unsupported to bring the bot online. Could probably be
permanently fixed by using one of the workarounds already present in
compiler-rt.
(this was originally part of https://reviews.llvm.org/D96281 and has been split off into its own patch)
If a macro is used within a function, the code inside the macro
doesn't make the code less readable. Instead, for a reader a macro is
more like a function that is called. Thus the code inside a macro
shouldn't increase the complexity of the function in which it is called.
Thus the flag 'IgnoreMacros' is added. If set to 'true' code inside
macros isn't considered during analysis.
This isn't perfect, as now the code of a macro isn't considered at all,
even if it has a high cognitive complexity itself. It might be better if
a macro is considered in the analysis like a function and gets its own
cognitive complexity. Implementing such an analysis seems to be very
complex (if possible at all with the given AST), so we give the user the
option to either ignore macros completely or to let the expanded code
count to the calling function's complexity.
See the code example from vgeof (originally added as note in https://reviews.llvm.org/D96281)
bool doStuff(myClass* objectPtr){
    if(objectPtr == nullptr){
        LOG_WARNING("empty object");
        return false;
    }
    if(objectPtr->getAttribute() == nullptr){
        LOG_WARNING("empty object");
        return false;
    }
    use(objectPtr->getAttribute());
}
The LOG_WARNING macro itself might have a high complexity, but it does not make
the function more complex to understand, just as e.g. a 'printf' would not.
By default 'IgnoreMacros' is set to 'false', which is the original behavior of the check.
Reviewed By: lebedev.ri, alexfh
Differential Revision: https://reviews.llvm.org/D98070
With clang 11 on macOS we were getting this error:
```
flang/runtime/random.cpp:61:30: error: non-constant-expression cannot be narrowed from type 'unsigned long long' to 'runtime::GeneratedWord' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
GeneratedWord word{(generator() - generator.min()) & rangeMask};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
flang/runtime/random.cpp:99:5: note: in instantiation of function template specialization 'runtime::Generate<double, 53>' requested here
Generate<CppTypeFor<TypeCategory::Real, 8>, 53>(harvest);
^
```
Changing the type of `rangeMask` fixes it.
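A reduced sketch of the problem, assuming the fix gives the mask the same width as `GeneratedWord` (the actual change to `rangeMask` is not spelled out here). `rawBits` is a hypothetical stand-in for the generator call.

```cpp
#include <cstdint>

// Mirrors the alias named in the diagnostic ('unsigned int').
using GeneratedWord = std::uint32_t;

// Hypothetical stand-in for 'generator() - generator.min()'.
inline std::uint32_t rawBits(std::uint32_t Seed) {
  return Seed * 1664525u + 1013904223u;
}

inline GeneratedWord maskedWord(std::uint32_t Seed) {
  // With a 64-bit mask the '&' yields a 64-bit value, and brace
  // initialization rejects the implicit narrowing:
  //   constexpr std::uint64_t rangeMask{0x00ffffffu};
  //   GeneratedWord word{rawBits(Seed) & rangeMask}; // -Wc++11-narrowing
  //
  // With a mask of the destination type, the expression already has type
  // 'unsigned int' and no narrowing takes place.
  constexpr GeneratedWord rangeMask{0x00ffffffu};
  GeneratedWord word{rawBits(Seed) & rangeMask};
  return word;
}
```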
Differential Revision: https://reviews.llvm.org/D100320
D24453 enabled libcall simplification for ARM PCS. This may cause
caller/callee calling convention mismatches in some situations such as
LTO. This patch makes instcombine aware that differences between
compatible calling conventions are benign (not emitting the undef idiom).
Differential Revision: https://reviews.llvm.org/D99773
These [[nodiscard]] annotations are added as a conforming extension;
it's unclear whether the paper will actually be adopted and make them
mandatory, but they do seem like good ideas regardless.
https://isocpp.org/files/papers/D2351R0.pdf
This patch implements the paper's effect on:
- std::to_integer, std::to_underlying
- std::forward, std::move, std::move_if_noexcept
- std::as_const
- std::identity
The paper also affects (but libc++ does not yet have an implementation of):
- std::bit_cast
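As an illustration of what the annotation buys (using a hypothetical `asConstSketch` helper rather than the libc++ implementation), a call whose result is dropped has no effect and is something compilers can now flag:

```cpp
#include <type_traits>

// Hypothetical analogue of std::as_const, annotated the same way.
template <class T>
[[nodiscard]] constexpr const T &asConstSketch(T &Value) noexcept {
  return Value;
}

inline int readThroughConst(int &Value) {
  // asConstSketch(Value);  // result discarded: compilers typically warn here
  const int &View = asConstSketch(Value); // result used: no diagnostic
  return View;
}
```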
Differential Revision: https://reviews.llvm.org/D99895
`__decay_copy` is used by `std::thread`'s constructor to copy its arguments
into the new thread. If `__decay_copy` claims to be noexcept, but then
copying the argument does actually throw, we'd call std::terminate instead
of passing this test. (And I've verified that adding an unconditional `noexcept`
to `__decay_copy` does indeed fail this test.)
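A minimal sketch of the property being tested, using a hypothetical `decayCopySketch` rather than libc++'s internal helper: the `noexcept` specification has to follow the copy being made, so that a throwing copy propagates as an exception instead of terminating.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for __decay_copy with a conditional noexcept.
template <class T>
typename std::decay<T>::type decayCopySketch(T &&Value) noexcept(
    std::is_nothrow_constructible<typename std::decay<T>::type, T>::value) {
  return std::forward<T>(Value);
}

// A type whose copy constructor throws, like the argument in the test.
struct ThrowingCopy {
  ThrowingCopy() = default;
  ThrowingCopy(const ThrowingCopy &) { throw 1; }
};

// Returns true if copying through decayCopySketch throws as expected.
// With an unconditional noexcept this would call std::terminate instead.
inline bool copyThrows() {
  ThrowingCopy Source;
  try {
    (void)decayCopySketch(Source);
  } catch (int) {
    return true;
  }
  return false;
}
```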
Differential Revision: https://reviews.llvm.org/D100277
If we run passes before lowering llvm.expect intrinsics to metadata,
then those passes have no way to act on the hints provided by llvm.expect.
SimplifyCFG is the known offender, and we made it smarter about profile
metadata in D98898.
In the motivating example from https://llvm.org/PR49336 , this means we
were ignoring the recommended method for a programmer to tell the compiler
that a compare+branch is expensive. This change appears to solve that case -
the metadata survives to the backend, the compare order is as expected in IR,
and the backend does not do anything to reverse it.
We make the same change to the old pass manager to keep things synchronized.
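For context, this is roughly what such a source-level hint looks like. `__builtin_expect` is a GCC/Clang builtin, and Clang represents it as `llvm.expect` in IR when optimizing; the `parseDigit` function is a made-up example, not the code from the bug report.

```cpp
// Hypothetical example of a branch annotated as unlikely.
inline int parseDigit(char C) {
  // The second argument is the expected value of the condition: the error
  // path is marked as rarely taken.
  if (__builtin_expect(C < '0' || C > '9', 0))
    return -1;
  return C - '0';
}
```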
Differential Revision: https://reviews.llvm.org/D100213
Add a number of intrinsics which natively lower to MVE operations to the
lane interleaving pass, allowing it to efficiently interleave the lanes
of chunks of operations containing these intrinsics.
Differential Revision: https://reviews.llvm.org/D97293
After this patch, we can use `--param std=c++20` even if the compiler only
supports -std=c++2a. The test suite will handle that for us. The only Lit
feature that isn't fully baked will always be the "in development" one,
since we don't know exactly what year the standard will be ratified in.
This is another take on https://reviews.llvm.org/D99789.
Differential Revision: https://reviews.llvm.org/D100210
Fixes the issues noted in PR48768, where the and/or/xor instruction had been promoted to avoid i8/i16 partial-dependencies, but the test against zero had not.
We can almost certainly relax this fold to work for any truncation, although it breaks a number of existing folds (notably movmsk folds, which tend to rely on the truncate to determine the demanded bits/elts in the source vector).
There is a reverse combine in TargetLowering.SimplifySetCC so we must wait until after legalization before attempting this.
Users can reset any external index set by previous fragments by
putting a `None` for the external block, e.g:
```
Index:
External: None
```
Differential Revision: https://reviews.llvm.org/D100106
The previous code calculated the first ldtilecfg by dominating all AMX registers' def. This may result in the ldtilecfg being inserted into a loop.
This patch tries to calculate the nearest point where all shapes of AMX registers are reachable.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D99010
FP16 to FP32 converts can be handled in MVE lane interleaving, much like
the sext/zext lowering we do. This expands the pass with fpext and
fptrunc handling, and basic fp operations allowing more efficient
lowering of fp vectors.
Differential Revision: https://reviews.llvm.org/D97292
The patch makes two updates to the arm-block-placement pass:
- Handle arbitrarily nested loops
- Extend the search (for t2WhileLoopStartLR) to the predecessor of the
preHeader.
Differential Revision: https://reviews.llvm.org/D99649
The `linalg.index` operation provides access to the iteration indexes of immediately enclosing linalg operations. It takes a dimension `dim` attribute and returns the iteration index in the given dimension. Having `linalg.index` allows us to unify `linalg.generic` and `linalg.indexed_generic` and also enables index access in named operations.
Differential Revision: https://reviews.llvm.org/D100292
This reverts commit cca9b5985c.
Buildbot reported an error for CodeGen/AArch64/machine-combiner-fmul-dup.mir:
*** Bad machine code: Virtual register killed in block, but needed live out. ***
- function: indexed_2s
- basic block: %bb.0 entry (0x640fee8)
Virtual register %7 is used after the block.
*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function: indexed_2s
- v. register: %7
LLVM ERROR: Found 2 machine code errors.
This patch adds DUP+FMUL => FMUL_indexed pattern to InstCombiner.
FMUL_indexed is normally selected during instruction selection, but it
does not work in cases when VDUP and VMUL are in different basic
blocks.
Differential Revision: https://reviews.llvm.org/D99662
That code has been unused since its check-in in 2010 (and I believe it would leak
memory when called as it releases the passed unique_ptr), so let's delete it.
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D100212
When LLDB's DWARF parser is parsing the member DIEs of a struct/class it
currently fully resolves the types of static member variables in a class before
adding the respective `VarDecl` to the record.
For record types fully resolving the type will also parse the member DIEs of the
respective class. The other way of resolving is just 'forward' resolving the type
which will try to load only the minimum amount of information about the type
(for records that would only be the name/kind of the type). Usually we always
resolve types on-demand so it's rarely useful to speculatively fully resolve
them on the first use.
This patch makes LLDB only 'forward' resolve the types of static
members. This stops LLDB from unnecessarily loading debug information
to parse a type that may not be needed later, and it also avoids a crash where
the parsed type might in turn reference the surrounding class that is currently
being parsed.
The new test case demonstrates the crash that might happen. The crash happens
with the following steps:
1. We parse class `ToLayout` and its members.
2. We parse the static class member and fully resolve its type
(`DependsOnParam2<ToLayout>`).
3. That type has a non-static class member `DependsOnParam1<ToLayout>` for which
LLDB will try to calculate the size.
4. The layout (and size) of `DependsOnParam1<ToLayout>` in turn depends on the
`ToLayout` size/layout.
5. Clang will calculate the record layout/size for `ToLayout` even though we are
currently parsing it and it's missing its non-static member.
The created record layout is missing the offset for the yet unparsed non-static member. If we
later try to get the offset we end up hitting different asserts. Most common is
the one in `TypeSystemClang::DumpValue` where it checks that the record layout
has offsets for the current FieldDecl.
```
assert(field_idx < record_layout.getFieldCount());
```
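The shape of the types in the steps above can be sketched roughly as follows. This is a reconstruction from the description, not the test file itself; the member names and the exact way the layout dependency is expressed are assumptions.

```cpp
// Layout and size depend directly on the template parameter (step 4).
template <typename T> struct DependsOnParam1 {
  T Member;
};

// Has a non-static member of type DependsOnParam1<T> (step 3).
template <typename T> struct DependsOnParam2 {
  DependsOnParam1<T> Member;
};

struct ToLayout {
  // Static member: declaring it does not require a complete type, so only
  // a forward resolution of DependsOnParam2<ToLayout> is needed here (step 2).
  static DependsOnParam2<ToLayout> Depends;
  // Non-static member that is still unparsed when the static member's type
  // is fully resolved too early (step 5).
  int NonStatic;
};

DependsOnParam2<ToLayout> ToLayout::Depends;
```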
Fixed rdar://67910011
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D100180