These fields will be used to choose/influence patterns for
SPIR-V code generation.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D87106
This patch moves the tests for the old MemDepAnalysis based DSE
implementation to the MemDepAnalysis subdirectory and updates them to
pass -enable-dse-memoryssa=false.
This is in preparation for the switch to MemorySSA-backed DSE.
This refactors the standalone-translate executable to use mlirTranslateMain() declared in Translation.h and further applies D87129.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D87131
Drops the include on InitAllDialects.h, as dialects are now initialized in the translation passes.
Differential Revision: https://reviews.llvm.org/D87129
TestCPP11EnumTypes is one of the most expensive tests on my system and takes
around 35 seconds to run. A relatively large amount of that time is actually
doing CPU intensive work it seems (and not waiting on timeouts like other
slow tests).
The main issue is that this test repeatedly compiles the same source files
with different compiler defines. The test is also including standard library
headers, so it will also build all system modules with the gmodules debug
info variant. This leads to the problem that this test ends up compiling all
system Clang modules 8 times (once for each subtest with a unique define). As
the system modules are quite large, this test spends most of its runtime just
recompiling all system modules on macOS.
There is also the small issue that this test starts and stops the test
process a few hundred times.
This rewrites the test to instead just use a macro to instantiate all the
enum types in a single source and uses global variables to test the values
(which means there is no more need to continue/stop or even start a process).
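A minimal sketch of the macro approach described above, not the actual test
source. The macro name, the enum and variable names, and the chosen underlying
types are all hypothetical:

```cpp
// Hypothetical macro: declares a scoped enum with the given underlying type
// and a global variable holding one of its values, so a debugger can read
// the value without running or stepping through any code.
#define DEFINE_ENUM_TEST(suffix, type)                         \
  enum class Enum_##suffix : type { Case1 = 1, Case2 = 2 };    \
  Enum_##suffix var_##suffix = Enum_##suffix::Case2;

// One translation unit instantiates every variant, so the source (and any
// modules it pulls in) is compiled once instead of once per define.
DEFINE_ENUM_TEST(uc, unsigned char)
DEFINE_ENUM_TEST(i, int)
DEFINE_ENUM_TEST(ull, unsigned long long)
```

Each instantiation yields a distinct type and a distinct global, so the values
can be inspected by name.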
I kept running all the debug info variants (even though it doesn't seem really
relevant) to keep this as NFC as possible.
This reduced the test runtime to around 1.5 seconds on my system (or in relative
numbers, the runtime of this test decreases by 95%).
While parsing LateParsedTemplates, Clang assumes that the Global DeclID matches
with the Local DeclID of a Decl. This is not the case when we have multiple
dependent modules, each having their own LateParsedTemplate section. In such a
case, a Local/Global DeclID confusion occurs which leads to improper casting of
FunctionDecl's.
This commit creates a vector that maps the LateParsedTemplate section of each
module to its module file, thereby resolving the Global/Local DeclID
confusion.
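The idea can be sketched as follows. This is a simplified model, not Clang's
code: ModuleFile, Section, BaseDeclID and resolveGlobalIDs are hypothetical
names, and the "local ID plus per-module base" translation is an assumption
made only for illustration:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a loaded module file.
struct ModuleFile {
  uint32_t BaseDeclID; // assumed offset that turns a local ID into a global ID
};

// One late-parsed-template section, kept together with the module it came from.
struct Section {
  const ModuleFile *Owner;
  std::vector<uint32_t> LocalIDs;
};

// Translate every local ID using the module that owns its section, instead of
// assuming that local and global IDs coincide.
std::vector<uint32_t> resolveGlobalIDs(const std::vector<Section> &Sections) {
  std::vector<uint32_t> Global;
  for (const Section &S : Sections)
    for (uint32_t Local : S.LocalIDs)
      Global.push_back(S.Owner->BaseDeclID + Local);
  return Global;
}
```

With two modules that both use local ID 1, the results stay distinct because
each ID is resolved against its own module.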
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D86514
Reduce to forward declaration, add the Register.h include that we still needed, move CCState::ensureMaxAlignment into CallingConvLower.cpp as it was the only function that needed the full definition of MachineFunction.
Fix a few implicit dependencies further down.
Extends lowerShuffleAsLanePermuteAndPermute to search for opportunities to use vpermq (64-bit cross-lane shuffle) and vpermd (32-bit cross-lane shuffle) to get elements into the correct lane, in addition to the 128-bit full-lane permutes it previously searched for.
This is especially helpful in cross-lane byte shuffles, where the alternative tends to be "vpshufb both lanes separately and blend them with a vpblendvb", which is very expensive, especially on Haswell where vpblendvb uses the same execution port as all the shuffles.
Addresses PR47262
Patch By: @TellowKrinkle (TellowKrinkle)
Differential Revision: https://reviews.llvm.org/D86429
This adds a simple tablegen pattern for folding predicate_cast(load)
into vldr p0, providing the alignment and offset are correct.
Differential Revision: https://reviews.llvm.org/D86702
We have the `RelSymbol<ELFT>` struct and can use it instead
of `std::pair<const Elf_Sym *, std::string>` in a few methods.
This is a bit cleaner.
Differential revision: https://reviews.llvm.org/D87092
The "restrict" keyword is illegal in C++, however, many libc
implementations use the "__restrict" compiler intrinsic in function
prototypes. The "__restrict" keyword qualifies a type as a restricted type
even in C++.
In case of any non-C99 languages, we don't want to match based on the
restrict qualifier because we cannot know if the given libc implementation
qualifies the parameter type or not.
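For illustration, a libc-style prototype using "__restrict" compiles as C++
with GCC, Clang and MSVC, which support it as an extension. The function name
copy_bytes is hypothetical and not taken from any libc:

```cpp
#include <cstddef>

// "restrict" is not a C++ keyword, but "__restrict" is accepted as a compiler
// extension and qualifies the pointer types below as restricted.
void *copy_bytes(void *__restrict dst, const void *__restrict src,
                 std::size_t n) {
  unsigned char *d = static_cast<unsigned char *>(dst);
  const unsigned char *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i]; // caller promises that dst and src do not overlap
  return dst;
}
```

Another libc may declare the same function without the qualifier, which is why
matching on it is unreliable outside C99.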
Differential Revision: https://reviews.llvm.org/D87097
This change implements pragma STDC FENV_ROUND, which is introduced by
the extension to the standard (TS 18661-1). The pragma is implemented only
in the frontend; it sets the appropriate state of FPOptions stored in Sema. Use
of these bits in constant evaluation and/or the code generator is not in the
scope of this change.
The parser issues a warning about an unsupported pragma when it encounters
pragma STDC FENV_ROUND; however, it performs syntax checks and updates Sema
state as if the pragma were supported.
The primary purpose of the partial implementation is to facilitate
development of non-default floating point environment support. Previously a
developer could not set a non-default rounding mode in sources, which made
preparing tests for, say, constant evaluation substantially more complicated.
Differential Revision: https://reviews.llvm.org/D86921
This is one of the most expensive tests and runs for nearly half a minute on
my machine. Besides this test just doing a lot of work by iterating 15k times on
one ValueObject (which seems to be the point), it also runs this for every
debug info variant which doesn't seem relevant to just iterating ValueObject.
This marks it as no_debug_info_test to only run one debug info variation
and cut down the runtime to around 7 seconds on my machine.
I have fixed up some more ElementCount/TypeSize related warnings in
the following tests:
CodeGen/AArch64/sve-split-extract-elt.ll
CodeGen/AArch64/sve-split-insert-elt.ll
In SelectionDAG::CreateStackTemporary we were relying upon the implicit
cast from TypeSize -> uint64_t when calling MachineFrameInfo::CreateStackObject.
I've fixed this by passing in the known minimum size instead, which I
believe is fine because the associated stack id indicates whether this
is a scalable object or not.
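A rough model of that reasoning, using hypothetical stand-in types rather than
the real TypeSize and MachineFrameInfo interfaces:

```cpp
#include <cstdint>

// Hypothetical stand-in for a possibly scalable size: the real size is
// KnownMin multiplied by a runtime factor when Scalable is true.
struct SizeInfo {
  uint64_t KnownMin;
  bool Scalable;
};

// Hypothetical stack id distinguishing fixed and scalable objects.
enum class StackKind { Fixed, ScalableVector };

struct StackObject {
  uint64_t Size; // known minimum size in bytes
  StackKind Kind;
};

// Pass the known minimum size explicitly instead of relying on an implicit
// conversion; the stack kind records whether the object is scalable.
StackObject createStackTemporary(SizeInfo S) {
  return {S.KnownMin, S.Scalable ? StackKind::ScalableVector : StackKind::Fixed};
}
```

Nothing is lost by storing only the minimum size, since a consumer can check
the kind before treating the size as exact.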
I've also fixed up a case in TargetLowering::SimplifyDemandedBits when
extracting a vector element from a scalable vector. The result is a scalar,
hence it wasn't caught at the start of the function. If the vector is
scalable we just bail out for now.
Differential Revision: https://reviews.llvm.org/D86431
This patch updates MemCpyOpt to preserve MemorySSA. It uses the
MemoryDef at the insertion point of the builder and inserts the new def
after that def.
In some cases, we just modify a memory instruction. In that case, get
the defining access, then remove the memory access and add a new one.
If the defining access is in a different block, insert a new def at the
beginning of the current block, otherwise after the defining access.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D86651
Historically, the operations in the MLIR's LLVM dialect only checked that the
operands are of LLVM dialect type without more detailed constraints. This was
due to LLVM dialect types wrapping LLVM IR types and having clunky verification
methods. With the new first-class modeling, it is possible to define type
constraints similarly to other dialects and use them to enforce some
correctness rules in verifiers instead of having LLVM assert during translation
to LLVM IR. This hardening discovered several issues where MLIR was producing
LLVM dialect operations that cannot exist in LLVM IR.
Depends On D85900
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D85901
This reverts commit f369d51896. The bug this
fixes was already fixed by 1c5a0cb1c3 with the
same approach and this commit is now just giving the variable a second fallback
value.
The implementation is not fully standards compliant in the sense that
errno is not set on error, and floating point exceptions are not raised.
Subnormal range and normal range are tested separately in the tests.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D86666
When allowed, use 32-bit indices rather than 64-bit indices in the
SIMD computation of masks. This runs up to 2x and 4x faster on
a number of AVX2 and AVX512 microbenchmarks.
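As a scalar model of the mask computation (an illustration only, not the
lowering itself), the lane mask is the comparison of each lane index against a
bound, and the index type can be narrowed when every index and the bound fit
in 32 bits. The function name buildMask is hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Lane i is active when i < bound. IndexT selects the width used for the
// comparison; uint32_t is only valid when lanes and bound fit in 32 bits.
template <typename IndexT>
std::vector<bool> buildMask(std::size_t lanes, IndexT bound) {
  std::vector<bool> mask(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    mask[i] = static_cast<IndexT>(i) < bound;
  return mask;
}
```

Both widths produce the same mask in the allowed range; the narrower index
vector simply packs twice as many lanes into one SIMD register.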
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D87116
The new feature in GitHub called 'GitHub Codespaces' generates a
pythonenv3.8 directory in the root level of the llvm-project git
checkout. So I am adding that directory to the .gitignore.
See the following for more info:
https://github.com/features/codespaces
Differential Revision: https://reviews.llvm.org/D86846
Simplify:
defined(__ARM_DWARF_EH__) || !defined(__arm__)
to:
!defined(_LIBUNWIND_ARM_EHABI)
A later patch benefits from the simplicity. This change will result in
the two DWARF macros being defined when __USING_SJLJ_EXCEPTIONS__ is
defined, but:
* That's already the case with the __APPLE__ and _WIN32 clauses.
* That's also already the case with other architectures.
* With __USING_SJLJ_EXCEPTIONS__, most of the unwinder is #ifdef'ed
away.
Generally, when __USING_SJLJ_EXCEPTIONS__ is defined, most of the
libunwind code is removed by the preprocessor. For example, none of the hpp
files are included, and almost all of the .c and .cpp files are defined
away, except in Unwind-sjlj.c. Unwind_AppleExtras.cpp is an exception
because it includes two hpp files, which it doesn't use. Remove the
unneeded includes for consistency with the general rule.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D86767
This adds the size to forward declared class DITypes, if the size is known.
Fixes an issue where we determine whether to emit fragments based on the
type size, so fragments would sometimes be incorrectly emitted if there
was no size.
Bug: https://bugs.llvm.org/show_bug.cgi?id=47338
Differential Revision: https://reviews.llvm.org/D87062
- When an operand is changed into an immediate value or the like, ensure its
target flags are cleared or set properly.
Differential Revision: https://reviews.llvm.org/D87109
Previously we had two overloads where the only real difference beyond
parameter order was whether a reference parameter is const, where one
overload treated the reference parameter as an in-parameter and the
other treated it as an out-parameter!
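The hazard looks roughly like this. The names Counter and combine are
hypothetical and do not come from the patched code:

```cpp
struct Counter {
  int Value = 0;
};

// Treats C as an in-parameter: only reads it.
int combine(int Delta, const Counter &C) { return C.Value + Delta; }

// Treats C as an out-parameter: overwrites it.
int combine(Counter &C, int Delta) {
  C.Value = Delta;
  return C.Value;
}
```

Swapping the argument order at a call site silently switches between reading
and clobbering the object, with no diagnostic from the compiler.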
Asan does not use metadata with primary allocators.
It should match AP64::kMetadataSize, which is 0.
Depends on D86917.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D86919
There are no known bugs related to this; still, it may fix some latent ones.
Main concerns with preexisting code:
1. Inconsistent atomic/non-atomic access to the same field.
2. Assumption that bitfield chunk_state is always the first byte without
even taking into account endianness.
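One way to avoid both concerns, shown as a sketch with hypothetical names
rather than the actual ASan chunk header, is to make the state its own atomic
field and to go through the atomic interface on every access:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical header layout: the state is a separate atomic byte rather than
// a bitfield assumed to occupy the first byte of a larger word.
struct ChunkHeader {
  std::atomic<uint8_t> chunk_state{0};
  uint32_t user_size = 0;
};

// Every transition uses the same atomic field, so there is no mix of atomic
// and plain accesses and no dependence on endianness.
bool tryTransition(ChunkHeader &H, uint8_t From, uint8_t To) {
  return H.chunk_state.compare_exchange_strong(From, To,
                                               std::memory_order_acquire);
}

uint8_t loadState(const ChunkHeader &H) {
  return H.chunk_state.load(std::memory_order_acquire);
}
```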
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D86917
Previously, this code discarded the result of CheckPlaceholderExpr for
non-matrix subexpressions. Not only is this wasteful, but it was creating a
-Warc-repeated-use-of-weak false-positive on the attached testcase, since the
discarded expression was still registered as a use of the weak property.
rdar://66162246
Differential revision: https://reviews.llvm.org/D87102
This patch scales the energy computed by the Entropic schedule based on the
execution time of each input. The input execution time is compared with the
average execution time of inputs in the corpus, and, based on the amount by
which they differ, the energy is scaled from 0.1x (for inputs executing slowly) to
3x (for inputs executing fast). Note that the exact scaling criteria and formula
are borrowed from AFL.
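A sketch of such a scaling step, assuming AFL-style thresholds. Only the 0.1x
and 3x endpoints come from the description above; the intermediate cut-offs
and the function name scaleEnergy are assumptions for illustration:

```cpp
#include <cstdint>

// Scale an input's energy by how its execution time compares with the corpus
// average. Slow inputs are penalized (down to 0.1x) and fast inputs are
// rewarded (up to 3x). Intermediate thresholds are assumed, not authoritative.
double scaleEnergy(double Energy, uint64_t ExecUs, uint64_t AvgUs) {
  uint32_t PerfScore = 100; // percent; 100 means "leave unchanged"
  if (ExecUs > AvgUs * 10)
    PerfScore = 10;
  else if (ExecUs > AvgUs * 4)
    PerfScore = 25;
  else if (ExecUs > AvgUs * 2)
    PerfScore = 50;
  else if (ExecUs * 3 > AvgUs * 4)
    PerfScore = 75;
  else if (ExecUs * 4 < AvgUs)
    PerfScore = 300;
  else if (ExecUs * 3 < AvgUs)
    PerfScore = 200;
  else if (ExecUs * 2 < AvgUs)
    PerfScore = 150;
  return Energy * PerfScore / 100.0;
}
```

An input that runs at roughly the average speed keeps its energy, so the
Entropic schedule is unchanged for typical inputs.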
On FuzzBench, this gives a sizeable throughput increase, which in turn leads to
more coverage on several benchmarks. For details, see the following report.
https://storage.googleapis.com/fuzzer-test-suite-public/exectime-report/index.html
Differential Revision: https://reviews.llvm.org/D86092