Summary:
Doing so allows us to increase test coverage by removing unnecessary
language restrictions.
Reviewers: hlopko, eduucaldas
Reviewed By: hlopko, eduucaldas
Subscribers: gribozavr2, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81040
Summary:
I changed `markStmtChild` to ignore implicit expressions the same way
`markExprChild` already does. The test that I modified crashes
without this change.
Reviewers: hlopko, eduucaldas
Reviewed By: hlopko, eduucaldas
Subscribers: gribozavr2, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81019
Updates the docs to include `MacroDefinition` documentation. The docs are still missing `ObjCIVar`; however, I don't know what that looks like in code. If someone can show the code block needed for the example, I'll add that in too.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D80877
These two nodes were added by 69caef2b78 in 2005 and are no longer used by
the PowerPC backend. ISD::FMA is the preferred way to create VMADDFP if we
really want that node. For VNMSUBFP, we will also add a more generic FNMSUB
node in D76585 if we really want it.
Reviewed By: qiucf
Differential Revision: https://reviews.llvm.org/D80429
Use getVectorElementCount() instead of getVectorNumElements().
The code changed in this patch is covered by an existing test:
CodeGen/AArch64/sve-intrinsics-contiguous-prefetches.ll
Differential Revision: https://reviews.llvm.org/D80615
New functions `lockFile`, `tryLockFile` and `unlockFile` implement
simple file locking. They lock or unlock the entire file. This should be
enough to support simultaneous writes to log files in parallel builds.
Differential Revision: https://reviews.llvm.org/D78896
Summary:
- Implemented C876, C877
- Fixed IsConstantExpr to check C879
- Fixed bugs in a few test cases: data01.f90, block-data01.f90,
pre-fir-tree02.f90
- Modified implementation of C8106 to identify all automatic objects
and modified equivalence01.f90 to reflect the changes
Differential Revision: https://reviews.llvm.org/D78424
Explicitly set the exec mask for SGPR spills and reloads.
This fixes a bug where SGPR spills to memory could be incorrect
if the exec mask was 0 (or differed between spill and reload).
Additionally pack scalar subregisters (up to 16/32 per VGPR),
so that the majority of scalar types can be spilt or reloaded
with a simple memory access. This should amortize some of the
additional overhead of manipulating the exec mask.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D80282
Port the code to recognize a zip1/zip2 shuffle mask from AArch64ISelLowering
and put it into the post-legalizer combiner.
Add G_ZIP1 and G_ZIP2 to AArch64InstrGISel.td and hook them up as equivalent
nodes to AArch64zip1 and AArch64zip2. This allows us to select them.
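As an illustration of what recognizing a zip1/zip2 shuffle mask involves, here is a self-contained sketch. It assumes two input vectors of NumElts elements each, with mask indices 0..NumElts-1 selecting from the first input and NumElts..2*NumElts-1 from the second, and negative entries meaning "undefined". The function name and the use of std::vector are assumptions for the example; this is not the code ported by the patch.

```cpp
#include <vector>

// Returns true if Mask interleaves the low halves (zip1) or the high
// halves (zip2) of two NumElts-wide inputs. On success, WhichResult is
// 0 for zip1 and 1 for zip2. Negative mask entries match anything.
bool isZipMask(const std::vector<int> &Mask, unsigned NumElts,
               unsigned &WhichResult) {
  if (NumElts == 0 || NumElts % 2 != 0 || Mask.size() != NumElts)
    return false;
  // The first element decides which variant to test for.
  WhichResult = (Mask[0] == 0) ? 0 : 1;
  unsigned Idx = WhichResult * NumElts / 2;
  for (unsigned I = 0; I != NumElts; I += 2) {
    // Even lanes come from the first input, odd lanes from the second.
    if ((Mask[I] >= 0 && static_cast<unsigned>(Mask[I]) != Idx) ||
        (Mask[I + 1] >= 0 &&
         static_cast<unsigned>(Mask[I + 1]) != Idx + NumElts))
      return false;
    ++Idx;
  }
  return true;
}
```

For four-element vectors, the mask {0, 4, 1, 5} is accepted as zip1 and {2, 6, 3, 7} as zip2, while the identity mask {0, 1, 2, 3} is rejected.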
Minor code size improvements for SPECINT2000 at -O3 on 197.parser, 252.eon, and
186.crafty.
Differential Revision: https://reviews.llvm.org/D80969
Summary:
This patch simplifies FindMostPopularDest without changing the
functionality.
Given a list of jump threading destinations, the function finds the
most popular destination. To ensure determinism when there are
multiple destinations with the highest popularity, the function picks
the first one in the successor list with the highest popularity.
Without this patch:
- The function populates DestPopularity -- a histogram mapping
destinations to their respective occurrence counts.
- Then we iterate over DestPopularity, looking for the highest
popularity while building a vector of destinations with the highest
popularity.
- Finally, we iterate the successor list, looking for the destination
with the highest popularity.
With this patch:
- We implement DestPopularity with MapVector instead of DenseMap. We
populate the map with popularity 0 for all successors in the order
they appear in the successor list.
- We build the histogram in the same way as before.
- We simply use std::max_element on DestPopularity to find the most
popular destination. The use of MapVector ensures determinism.
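The approach described above can be sketched in standalone C++. This is not the patch itself: a vector of (destination, count) pairs stands in for MapVector, destinations are plain integers, and the function name and signature are hypothetical. Destinations that are not in the successor list are ignored here, which is an assumption of the sketch.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical stand-in for MapVector<Dest, unsigned>: entries stay in
// insertion order, which is what makes the result deterministic.
using Histogram = std::vector<std::pair<int, unsigned>>;

static Histogram::iterator findDest(Histogram &H, int Dest) {
  return std::find_if(
      H.begin(), H.end(),
      [Dest](const std::pair<int, unsigned> &P) { return P.first == Dest; });
}

// Returns the most popular destination, or -1 if there are no successors.
// Ties go to the destination that appears first in Successors.
int findMostPopularDest(const std::vector<int> &Successors,
                        const std::vector<int> &ThreadDests) {
  Histogram DestPopularity;
  // Seed every successor with popularity 0, in successor-list order.
  for (int Succ : Successors)
    if (findDest(DestPopularity, Succ) == DestPopularity.end())
      DestPopularity.emplace_back(Succ, 0u);

  // Build the histogram.
  for (int Dest : ThreadDests) {
    auto It = findDest(DestPopularity, Dest);
    if (It != DestPopularity.end())
      ++It->second;
  }

  if (DestPopularity.empty())
    return -1;

  // std::max_element returns the first of several equal maxima, so the
  // earliest successor wins a tie.
  auto Best = std::max_element(
      DestPopularity.begin(), DestPopularity.end(),
      [](const std::pair<int, unsigned> &L,
         const std::pair<int, unsigned> &R) { return L.second < R.second; });
  return Best->first;
}
```

Seeding the histogram in successor order is what lets a single max_element call replace the separate tie-breaking pass over the successor list.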
Reviewers: wmi, efriedma
Reviewed By: wmi
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81030
When sampleFDO is enabled, people may expect that they can use
-fno-profile-sample-use to opt out of using the sample profile for a certain
file, either for debugging or for performance tuning.
However, when thinlto is enabled, if a function in file A compiled with
-fno-profile-sample-use is imported to another file B compiled with
-fprofile-sample-use, the inlined copy of the function in file B may still
get its profile annotated.
The inconsistency may even introduce a profile-unused warning: if the
target is not compiled with an explicit debug information flag, the function
in file A won't have its debug information enabled (debug information is
enabled implicitly only when -fprofile-sample-use is used). After it is
imported into file B, which is compiled with -fprofile-sample-use, profile
annotation for the outline copy of the function will fail because the
function has no debug information, and that will trigger the profile-unused
warning.
We add a new attribute, use-sample-profile, to control whether a function
uses its sample profile, for both its outline and inline copies.
That makes the behavior of -fno-profile-sample-use consistent.
Differential Revision: https://reviews.llvm.org/D79959
The Darwin builder is passing some of the make arguments through the
environment instead of the command line. Update the dsym builder to do
the same as the other variants.
Don't use the environment to pass values to the builder that are present
in the dotest configuration module. A subsequent patch will pass the
remaining values through the configuration instead of the environment.
This lets us remove !stack-safe metadata and
better control when to perform StackSafety
analysis.
Reviewers: eugenis
Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80771
Summary:
An upgrade of LLVM for CrOS [0] containing [1] triggered a bunch of
errors related to writing to reserved registers for a Linux kernel's
arm64 compat vdso (which is an aarch32 image).
After a discussion on LKML [2], it was determined that
-f{no-}omit-frame-pointer was not being specified. Comparing GCC and
Clang [3], it becomes apparent that GCC defaults to omitting the frame
pointer implicitly when optimizations are enabled, and Clang does not;
i.e., with GCC, setting -O1 (or above) implies -fomit-frame-pointer. Clang was
defaulting to -fno-omit-frame-pointer implicitly unless -fomit-frame-pointer
was set explicitly.
This becomes a problem because the Linux kernel's arm64 compat vdso
contains code that uses r7. r7 is sometimes used for the frame pointer
(for example, when targeting thumb (-mthumb)). See useR7AsFramePointer()
in llvm/llvm-project/llvm/lib/Target/ARM/ARMSubtarget.h. This is mostly
for legacy/compatibility reasons, and the 2019 Q4 revision of the ARM
AAPCS looks to standardize r11 as the frame pointer for aarch32, though
this is not yet implemented in LLVM.
Users who rely on the implicit default when
optimizations are enabled should explicitly choose -fomit-frame-pointer
(new behavior) or -fno-omit-frame-pointer (old behavior).
[0] https://bugs.chromium.org/p/chromium/issues/detail?id=1084372
[1] https://reviews.llvm.org/D76848
[2] https://lore.kernel.org/lkml/20200526173117.155339-1-ndesaulniers@google.com/
[3] https://godbolt.org/z/0oY39t
Reviewers: kristof.beyls, psmith, danalbert, srhines, MaskRay, ostannard, efriedma
Reviewed By: psmith, danalbert, srhines, MaskRay, efriedma
Subscribers: efriedma, olista01, MaskRay, vhscampos, cfe-commits, llvm-commits, manojgupta, llozano, glider, hctim, eugenis, pcc, peter.smith, srhines
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80828
This patch enables affine loop fusion for loops with affine vector loads
and stores. For that, we only had to use affine memory op interfaces in
LoopFusionUtils.cpp and Utils.cpp so that vector loads and stores are
also taken into account.
Reviewed By: andydavis1, ftynse
Differential Revision: https://reviews.llvm.org/D80971
Previously, the SpecificAllocator was a static local in the `make<T>`
function template. Using static locals is nice because they are only
constructed and registered if they are accessed. However, if there are
multiple calls to make<> with different constructor parameters, we would
get multiple static local variable instances. This is undesirable and
leads to extra memory allocations. I noticed there were two sources of
DefinedRegular allocations while checking heap profiles.
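The effect described above can be reproduced with a small standalone sketch. All names here (CountingAllocator, makeLocal, getAllocator, makeShared, SymA, SymB) are hypothetical, and the second variant is only one possible way to get a single allocator per type; it is not taken from the patch.

```cpp
#include <deque>
#include <utility>

// Hypothetical allocator that counts how many instances exist per type.
template <typename T> struct CountingAllocator {
  static int &instances() {
    static int N = 0;
    return N;
  }
  CountingAllocator() { ++instances(); }
  template <typename... Args> T *create(Args &&...As) {
    // std::deque keeps earlier elements in place when growing at the back.
    Storage.emplace_back(std::forward<Args>(As)...);
    return &Storage.back();
  }
  std::deque<T> Storage;
};

struct SymA {
  int X, Y;
  explicit SymA(int X, int Y = 0) : X(X), Y(Y) {}
};
struct SymB {
  int X, Y;
  explicit SymB(int X, int Y = 0) : X(X), Y(Y) {}
};

// Problematic shape: the static local belongs to each instantiation
// makeLocal<T, Args...>, so every distinct argument list gets its own
// allocator for the same T.
template <typename T, typename... Args> T *makeLocal(Args &&...As) {
  static CountingAllocator<T> Alloc;
  return Alloc.create(std::forward<Args>(As)...);
}

// One possible alternative: key the static local on T alone.
template <typename T> CountingAllocator<T> &getAllocator() {
  static CountingAllocator<T> Alloc;
  return Alloc;
}
template <typename T, typename... Args> T *makeShared(Args &&...As) {
  return getAllocator<T>().create(std::forward<Args>(As)...);
}
```

Calling makeLocal<SymA>(1) and makeLocal<SymA>(1, 2) constructs two allocators for SymA, whereas the same two calls through makeShared<SymB> construct only one for SymB.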
- Fix one place where we had a X86vzload64 but should have had
  X86vzload32.
- Make sure all patterns that have scalar_to_vector+loadi64 also
  have scalar_to_vector+f64 to match 32-bit codegen.
- Add some bitcasts that were missing from patterns.
- Make sure that if we have a scalar_to_vector+load pattern
  we also have a vzload pattern.
We probably need some better canonicalization to avoid having
so many patterns.
parameters with default arguments.
Directly follow the wording by relaxing the AST invariant that all
parameters after one with a default argument also have default
arguments, and removing the diagnostic on missing default arguments
on a pack-expanded parameter following a parameter with a default
argument.
Testing also revealed that we need to special-case explicit
specializations of templates with a pack following a parameter with a
default argument, as such explicit specializations are otherwise
impossible to write. The standard wording doesn't address this case; an
issue has been filed.
This exposed a bug where we would briefly consider a parameter to have
no default argument while we parse a delay-parsed default argument for
that parameter, which is also fixed.
Partially incorporates a patch by Raul Tambre.