If the GEP instruction contanins only constants as its arguments,
then it should be recognized as a constant. For now, there was
also added a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation") which is off
by default. Once I gather needed data of the effectiveness of
this simplification, the flag will be deleted.
Reviewers: apilipenko, davidxl, mtrofin
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D81026
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visiual analysis of
inliner's work.
Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D81024
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".
Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev
Reviewed By: mtrofin
Differential revision: https://reviews.llvm.org/D81743
When using `-warnings-as-errors`, If there are any warnings promoted to errors, clang-tidy exits with the number of warnings. This really isn't needed and can cause issues when the number of warnings doesn't fit into 8 bits as POSIX terminals aren't designed to handle more than that.
This addresses https://bugs.llvm.org/show_bug.cgi?id=46305.
Bug originally added in D15528
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D81953
This reverts commit 5a95be22d2.
It causes GCC 5.3 to segfault:
In file included from /work/llvm.monorepo/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp:357:0: lib/Target/AArch64/AArch64GenGlobalISel.inc:189:17: in constexpr expansion of ‘llvm::LLT::scalar(16u)’
lib/Target/AArch64/AArch64GenGlobalISel.inc:205:1: internal compiler error: Segmentation fault
Summary:
The OpenMP loops are normalized and transformed into the loops from 0 to
max number of iterations. In some cases, original scheme may lead to
overflow during calculation of number of iterations. If it is unknown,
if we can end up with overflow or not (the bounds are not constant and
we cannot define if there is an overflow), cast original type to the
unsigned.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, openmp-commits, cfe-commits, caomhin
Tags: #clang, #openmp
Differential Revision: https://reviews.llvm.org/D81881
"error: 'get' is deprecated: The base class version of get with the scalable
argument defaulted to false is deprecated."
Changed VectorType::get() -> FixedVectorType::get().
Summary:
SYCL and OpenMP prohibits thread local storage in device code,
so this commit ensures that error is emitted for device code and not
emitted for host code when host target supports it.
Reviewers: jdoerfert, erichkeane, bader
Reviewed By: jdoerfert, erichkeane
Subscribers: guansong, riccibruno, ABataev, yaxunl, ebevhan, Anastasia, sstefan1, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81641
In more complicated loops we can easily hit the complexity limits of
loop strength reduction. If we do and filtering occurs, it's all too
easy to remove the wrong formulae for post-inc preferring accesses due
to it attempting to maximise register re-use. The patch adds an
alternative filtering step when the target is preferring postinc to pick
postinc formulae instead, hopefully lowering the complexity to below the
limit so that aggressive filtering is not needed.
There is also a change in here to stop considering existing addrecs as
free under postinc. We should already be modelling them as a reg so
don't want it to cause us to get the cost wrong. (I'm not sure that code
makes sense in general, but there are X86 tests specifically for it
where it seems to be helping so have left it around for the standard
non-post-inc case).
Differential Revision: https://reviews.llvm.org/D80273
`Elf_GnuHash_Impl` has the following method:
```
ArrayRef<Elf_Word> values(unsigned DynamicSymCount) const {
return ArrayRef<Elf_Word>(buckets().end(), DynamicSymCount - symndx);
}
```
When DynamicSymCount is less than symndx we return an array with the huge broken size.
This patch fixes the issue and adds an assert. This assert helped to fix an issue
in one of the test cases.
Differential revision: https://reviews.llvm.org/D81937
`printGnuHashTable` contains the code to check the GNU hash table.
This patch splits it to `getGnuHashTableChains` helper
(and reorders slightly to reduce).
Differential revision: https://reviews.llvm.org/D81928
Spills of VCC (SGPR64) will fail with new SGPR spill code,
because super register is not correctly resolved.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D81224
I originally reverted the patch because it was causing performance
issues, but now I think it's just enabling simplify-cfg to do
something that I don't want instead :)
Sorry for the noise.
This reverts commit 3e39760f8e.
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.
Differential revision: https://reviews.llvm.org/D81996
This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised
loops if the intrinsic is supported by the backend, which is checked by
querying TargetTransform hook emitGetActiveLaneMask.
This intrinsic creates a mask representing active and inactive vector lanes,
which is used by the masked load/store instructions that are created for
tail-folded loops. The semantics of @llvm.get.active.mask are described here in
LangRef:
https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics
This intrinsic is also used to provide a hint to the backend. That is, the
second argument of the intrinsic represents the back-edge taken count of the
loop. For MVE, for example, we use that to set up tail-predication, which is a
new form of predication in MVE for vector loops that implicitely predicates the
last vector loop iteration by implicitely setting active/inactive lanes, i.e.
the tail loop is predicated. In order to set up a tail-predicated vector loop,
we need to know the number of data elements processed by the vector loop, which
corresponds the the tripcount of the scalar loop, which we can now reconstruct
using @llvm.get.active.mask.
Differential Revision: https://reviews.llvm.org/D79100
llvm::getHeatColor becomes a problem when maxFreq = 0 -> freq = 0 =>
log2(double(freq)) / log2(maxFreq) -> log2(0.) / log2(0.) which
results in illegal instruction on some architectures.
Problematic revision: https://reviews.llvm.org/D77172
Currently load instructions are added to the cache for invariant pointer
group dependencies, but only pointer values are removed currently. That
leads to dangling AssertingVHs in the test case below, where we delete a
load from an invariant pointer group. We should also remove the entries
from the cache.
Fixes PR46054.
Reviewers: efriedma, hfinkel, asbirlea
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D81726
This is a natural extension of the previous changes to use the Cursor
class independently in the standard and extended opcode paths, and in
turn allows delaying error handling until the entire line has been
printed in verbose mode, removing interleaved output in some cases.
Reviewed by: MaskRay, JDevlieghere
Differential Revision: https://reviews.llvm.org/D81562
Summary:
This code is going to be used in StackSafety.
This patch is file move with minimal changes. Identifiers
will be fixed in the followup patch.
Reviewers: eugenis, pcc
Reviewed By: eugenis
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81831
Summary:
In the patch D62907 the PPC CTRLoops pass has been replaced by Generic
Hardware Loop pass, and it has imported some new intrinsic for Generic
Hardware Loop.
The old intrinsic used in PPC CTRLoops int_ppc_mtctr and
int_ppc_is_decremented_ctr_nonzero is been replaced by
int_set_loop_iterations and loop_decrement.
This patch is to remove above unused two instrinsic.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D81539
We weren't re-entering template scopes in the right order, causing this
to break self-host with -fdelayed-template-parsing.
This reverts commit 237c2a23b6.
The functions sys::ExcecuteAndWait and sys::Wait now have additional
argument of type pointer to structure, which is filled with process
execution statistics upon process termination. These are total and user
execution times and peak memory consumption. By default this argument is
nullptr so existing users of these function must not change behavior.
Differential Revision: https://reviews.llvm.org/D78901
This diagnostic (which defaults to an error, added in
95833f33bd) was intended to clearly
point out cases where the C++ ABI won't match the Microsoft C++ ABI,
for cases when this is enabled via a pragma over a region of code.
The MSVC compatible struct layout feature can also be enabled via a
compiler option (-mms-bitfields). If enabled that way, one essentially
can't compile any C++ code unless also building with
-Wno-incompatible-ms-struct (which GCC doesn't support, and projects
developed with GCC aren't setting).
For the MinGW target, it's expected that the C++ ABI won't match
the MSVC one, if this option is used for getting the struct
layout to match MSVC.
Differential Revision: https://reviews.llvm.org/D81794