Previously any load (global, local or constant) feeding into a
global load or store would be counted as an indirect access. This
patch only counts global loads feeding into a global load or store.
The rationale is that the latency for global loads is generally
much larger than the other kinds.
As a side effect this makes it easier to write small kernels test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.
Differential Revision: https://reviews.llvm.org/D122804
During skeleton construction for the epilogue vector loop, generic
helpers use getOrCreateTripCount, which will re-expand the trip count
computation. Instead, re-use the TripCount created during main loop
vectorization.
* Add instantiation tests to ItaniumDemangleTest, to make sure all
match functions provide constructor arguments to the provided functor.
* Fix the Node constructors that lost const qualification on arguments.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D122665
This patch modifies the name "integerRelations" and "relation" to refer to the
disjuncts in PresburgerRelation to "disjunct(s)". This is done to be
consistent with the rest of the interface.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D122892
In order to add a unit test, we need to expose the node names beyond
ItaniumDemangle.h. This breaks them out into a def file.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D122739
Do import the definition of objects from a foreign translation unit if that's type is const and trivial.
Differential Revision: https://reviews.llvm.org/D122805
Previously, when an input set had a duplicate division, the duplicates might
be removed by a call to mergeLocalIds due to being detected as being duplicate
for the first time. The subtraction implementation cannot handle existing
locals being removed, so this would lead to unexpected behaviour. Resolve this
by removing all the duplicates up front.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122826
Update the hardware CRC32 logic in scudo to support using `-mcrc32`
instead of `-msse4.2`. The CRC32 intrinsics use the former flag
in the newer compiler versions, e.g. in clang since 12fa608af4.
With these versions of clang, passing `-msse4.2` is insufficient
to enable the instructions and causes build failures when `-march` does
not enable CRC32 implicitly:
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.cpp:20:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'computeHardwareCRC32' that is compiled without support for 'crc32'
return CRC32_INTRINSIC(Crc, Data);
^
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.h:27:27: note: expanded from macro 'CRC32_INTRINSIC'
# define CRC32_INTRINSIC FIRST_32_SECOND_64(_mm_crc32_u32, _mm_crc32_u64)
^
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/../sanitizer_common/sanitizer_platform.h:132:36: note: expanded from macro 'FIRST_32_SECOND_64'
# define FIRST_32_SECOND_64(a, b) (a)
^
1 error generated.
For backwards compatibility, use `-mcrc32` when available and fall back
to `-msse4.2`. The `<smmintrin.h>` header remains in use as it still
works and is compatible with GCC, while clang's `<crc32intrin.h>`
is not.
Use __builtin_ia32*() rather than _mm_crc32*() when using `-mcrc32`
to preserve compatibility with GCC. _mm_crc32*() are aliases
to __builtin_ia32*() in both compilers but GCC requires `-msse4.2`
for the former, while both use `-mcrc32` for the latter.
Originally reported in https://bugs.gentoo.org/835870.
Differential Revision: https://reviews.llvm.org/D122789
LexSimplex cannot be made to support symbols for symbolic lexmin; this requires
a second class. In preparation for upstreaming support for symbolic lexmin,
keep the part of LexSimplex that are specific to non-symbolic lexmin in LexSimplex
and move the parts that are required to a common class LexSimplexBase for both to
inherit from.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D122828
If the frame pointer is an argument of the original pointer (which
happens with opaque pointers), then we currently first replace the
argument with undef, which will prevent later replacement of the
old frame pointer with the new one.
Fix this by replacing arguments with some dummy instructions first,
and then replacing those with undef later. This gives us a chance
to replace the frame pointer before it becomes undef.
Fixes https://github.com/llvm/llvm-project/issues/54523.
Differential Revision: https://reviews.llvm.org/D122375
This isn't expected to reduce compilation times as 'max-iters' is set to
one by default, but it helps with recursive functions that require higher
iteration counts.
Differential Revision: https://reviews.llvm.org/D122819
This patch modifies IntegerPolyhedron, IntegerRelation, PresburgerRelation,
PresburgerSet, PWMAFunction, constructors to take PresburgerSpace instead of
dimensions. This allows information present in PresburgerSpace to be carried
better and allows for a general interface.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D122842
This patch supports ordered clause specified without parameter in
worksharing-loop directive in the OpenMPIRBuilder and lowering MLIR to
LLVM IR.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114940
At present, we are generating wrong code for C++20 modules entities which
should have internal linkage. This is because we are assigning
'ModuleInternalLinkage' unconditionally to such entities. However this mode
is only applicable to the modules-ts.
This change makes the special linkage mode conditional on fmodules-ts and
adds a unit test to verify that we generate the correct linkage.
Currently, static variables and functions in module purview are emitted into
object files as external. On some platforms, lambdas are emitted as global
weak defintions (on Windows this causes a mangler crash).
Differential Revision: https://reviews.llvm.org/D122413
This reverts commit 09b53121c3.
Breaks the build with GCC 11.2 on x86_64:
In file included from /home/npopov/repos/llvm-project/compiler-rt/lib/scudo/scudo_crc32.h:27,
from /home/npopov/repos/llvm-project/compiler-rt/lib/scudo/scudo_crc32.cpp:14:
/usr/lib/gcc/x86_64-redhat-linux/11/include/smmintrin.h: In function ‘__sanitizer::u32 __scudo::computeHardwareCRC32(__sanitizer::u32, __sanitizer::uptr)’:
/usr/lib/gcc/x86_64-redhat-linux/11/include/smmintrin.h:846:1: error: inlining failed in call to ‘always_inline’ ‘long long unsigned int _mm_crc32_u64(long long unsigned int, long long unsigned int)’: target specific option mismatch
846 | _mm_crc32_u64 (unsigned long long __C, unsigned long long __V)
The pointer.volatile.pass.cpp test was already marked as XFAIL for
mingw-dll (for reasons explained in the comment above it).
The same issue also appears in clang-cl-dll when built with newer
CMake versions. (It didn't appear with older versions of CMake, as
CMake built the library with the clang-cl flag `-std:c++latest` when
we've requested C++ 20 - which practically built it in c++2b mode with
current clang versions. With current versions of CMake, it passes
`-std:c++20` instead.)
As it succeeds/fails dependent on factors we don't
directly control, mark it as UNSUPPORTED instead of XFAIL.
Differential Revision: https://reviews.llvm.org/D122718
Many callers of SendPacket() in RNBRemote.cpp have a local std::string
object, call c_str() on it to pass a c-string, which is then copied into
a std::string temporary object.
Also free JSONGenerator objects once we've formatted them into
ostringstream and don't need the objects any longer, to reduce max
memory use in debugserver.
Differential Revision: https://reviews.llvm.org/D122848
rdar://91117263
This shows that pushing constant to the right in a commutative op leads
to `applyPatternsAndFoldGreedily` to converge without applying all the
patterns.
Differential Revision: https://reviews.llvm.org/D122870
This reverts commit 59bbc7a085.
This exposes an issue breaking the contract of
`applyPatternsAndFoldGreedily` where we "converge" without applying
remaining patterns.
This approach is used by AArch64/RISCV to make frame-setup/frame-destroy
instructions contiguous instead of being interleaved by CFI instructions. Code
checking `MBBI->getFlag(MachineInstr::FrameSetup) || MBBI->isCFIInstruction()`
can be simplified to just check FrameSetup.
This helps locate all CFI instructions in the prologue, which can be handy to use
.cfi_remember_state/.cfi_restore_state to decrease unwind table size (D114545).
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122541
lexically contains a mention of the pack.
Systematically distinguish between syntactic and semantic references to
packs, especially when propagating dependence from a type into an
expression. We should consult the type-as-written when computing
syntactic dependence and should consult the semantic type when computing
semantic dependence.
Fixes#54402.
A context can be created by invoking the `getFunctionProfileForContext` function in two ways:
- by using a probe and its calling context.
- by removing the leaf frame from an existing contexts. The first way is used when generating a function profile for a given LBR range, and the input `WasLeafInlined` is computed depending on the actually probe in the LBR range. The second way is used when using the entry count of an inlinee function profile to update its inliner callsite count, so `WasLeafInlined` is unknown for the inliner frame.
The two invocations can happen in different order on different platforms, since the lbr ranges are stored in an unordered_map, and we are making sure `ContextWasInlined` is always set correctly.
This should fix the random test failure introduced by D121655
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D122844