This patch fixes a Windows -EHa crash induced by previous commit 797ad70152.
The crash was caused by "LifetimeMarker" scope (with option -O2) that should not be considered as SEH Scope.
This change also turns off -fasync-exceptions by default under -EHa option for now.
Differential Revision: https://reviews.llvm.org/D103664#2799944
Nesting mode is a new experimental feature in the OpenMP
runtime. It allows a user to set up nesting for an application in a
way that corresponds to the hardware topology levels on the machine an
application is being run on. For example, if a machine has 2 sockets,
each with 12 cores, then use of nesting mode could set up an outer
level of nesting that uses 2 threads per parallel region, and an inner
level of nesting that uses 12 threads per parallel region.
Nesting mode is controlled with the KMP_NESTING_MODE environment
variable as follows:
1) KMP_NESTING_MODE = 0: Nesting mode is off (default); max-active-levels-var
is set to 1 (the default -- nesting is off, nested parallel regions
are serialized).
2) KMP_NESTING_MODE = 1: Nesting mode is on, and a number of threads
will be assigned for each level discovered in the machine topology;
max-active-levels-var is set to the number of levels discovered.
3) KMP_NESTING_MODE = n, n>1: [Note: this option is experimental and may change
or be removed in the future.] Nesting mode is on, and a number of
threads will be assigned for each topology level discovered on the
machine, up to k<=n levels (since there may be fewer than n levels
discovered in the topology), and beyond the kth level, nested parallel
regions will be serialized; NOTE: max-active-levels-var is 1 (the default --
nesting is off, and nested parallel regions are serialized until the
user changes max-active-levels-var.
If the user sets OMP_NUM_THREADS or OMP_MAX_ACTIVE_LEVELS, they will
override KMP_NESTING_MODE settings for the associated environment
variables. The detected topology may be limited by an affinity mask
setting on the initial thread, or if the user sets KMP_HW_SUBSET. See
also: KMP_HOT_TEAMS_MAX_LEVEL for controlling use of hot teams for
nested parallel regions. Note that this feature only sets numbers of
threads used at nesting levels. The user should make use of
OMP_PLACES and OMP_PROC_BIND or KMP_AFFINITY for affinitizing those
threads, if desired.
Differential Revision: https://reviews.llvm.org/D102188
This resolves an issue tripping a `DCHECK`, as I was checking for the
capacity and not the size. We don't need to 0-init the Vector as it's
done already, and make sure we only 0-out the string on clear if it's
not empty.
Differential Revision: https://reviews.llvm.org/D103716
`__profd_*` variables are referenced by code only when value profiling is
enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just
waste space on ELF/Mach-O. We change the comdat symbol from `__profd_*` to
`__profc_*` because an internal symbol does not provide deduplication features
on COFF. The choice doesn't matter on ELF.
(In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there is now no `__profd_*` symbols.)
On Windows this enables further optimization. We are no longer affected by the
link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can
cause duplicate definition error.
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html
We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585).
This avoids many `/INCLUDE:` directives in `.drectve`.
Here is rnk's measurement for Chrome:
```
This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%:
#BEFORE
$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
1047758867
$ du -cksh base_unittests.exe
82M base_unittests.exe
82M total
# AFTER
$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
937886499
$ du -cksh base_unittests.exe
78M base_unittests.exe
78M total
```
The change is NFC for Mach-O.
Reviewed By: davidxl, rnk
Differential Revision: https://reviews.llvm.org/D103372
The following was found by a customer and is accepted by the other primary
C++ compilers, but fails to compile in Clang:
namespace sss {
double foo(int, double);
template <class T>
T foo(T); // note: target of using declaration
} // namespace sss
namespace oad {
void foo();
}
namespace oad {
using ::sss::foo;
}
namespace sss {
using oad::foo; // note: using declaration
}
namespace sss {
double foo(int, double) { return 0; }
template <class T>
T foo(T t) { // error: declaration conflicts with target of using
return t;
}
} // namespace sss
I believe the issue is that MergeFunctionDecl() was calling
checkUsingShadowRedecl() but only considering a FunctionDecl as a
possible shadow and not FunctionTemplateDecl. The changes in this patch
largely mirror how variable declarations were being handled by also
catching FunctionTemplateDecl.
When SimplifyIndVars infers IR nowrap flags from SCEV, this may
happen in two ways: Either nowrap flags were already present in
SCEV and just get transferred to IR. Or zero/sign extension of
addrecs infers additional nowrap flags, and those get transferred
to IR. In the latter case, calling forgetValue() ensures that the
newly inferred nowrap flags get propagated to any other SCEV
expressions based on the addrec. However, the invalidation can
also have a major compile-time effect in some cases. For
https://bugs.llvm.org/show_bug.cgi?id=50384 with n=512 compile-
time drops from 7.1s to 0.8s without this invalidation. At the
same time, removing the invalidation doesn't affect any codegen
in test-suite.
Differential Revision: https://reviews.llvm.org/D103424
This patch was split from https://reviews.llvm.org/D102246
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This is for llvm-profdata part of change. It sets the bit masks for the
profile reader in llvm-profdata. Also add an internal option
"-fs-discriminator-pass" for show and merge command to process the profile
offline.
This patch also moved setDiscriminatorMaskedBitFrom() to
SampleProfileReader::create() to simplify the interface.
Differential Revision: https://reviews.llvm.org/D103550
To ensure that errors are emitted by CheckConformance and
its callers in all situations, it's necessary for the returned result
of that function to distinguish between three possible
outcomes: the arrays are known to conform at compilation time,
the arrays are known to not conform (and a message has been
produced), and an indeterminate result in which is not possible
to determine conformance. So convert CheckConformance's
result into an optional<bool>, and convert its confusing
Boolean flag arguments into a bit-set of named flags too.
Differential Revision: https://reviews.llvm.org/D103654
prepareTaggedChunk uses Tag 0 for header.
Android already PR_MTE_TAG_MASK to 0xfffe,
but with the patch we will not need to deppend
on the system configuration.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D103134
Extend debug info handling by adding DWARF address space mapping for
SPIR, with corresponding test case.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D103097
Otherwise it is causing one of our build jobs to fail,
it is using "external" as directory, and NOT is
failing because "external" is found in ModuleID.
Differential Revision: https://reviews.llvm.org/D103658
If we ended up with two phi instructions in a block, and we needed to fix up
the banks for the first one, we'd end up inserting our COPY before the second
phi.
E.g.
```
%x = G_PHI ...
%fixup = COPY ...
%y = G_PHI ...
```
This is invalid MIR, and breaks assumptions made by the register allocator later
down the line. With the verifier enabled, it also emits a verification error.
This teaches fixupPHIOpBanks to walk past any phi instructions in the block
when emitting the fixup copies.
Here's an example of the crashing code (same as added testcase):
https://godbolt.org/z/h5j1x3o6e
Differential Revision: https://reviews.llvm.org/D103582
This patch sets the isCommutable attribute for several opcodes that have
the "reg = OPCODE reg, reg" format.
Differential Revision: https://reviews.llvm.org/D103653
This patch changes the `isKnownHeapToStack` and `isAssumedHeapToStack`
member functions to return if a function call is going to be altered by
HeapToStack.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D103574
This is a fairly mechanical change, it just moves each algorithm into its own header. This is a NFC.
Note: during this change, I burned down all the includes, so this follows "include only and exactly what you use."
Differential Revision: https://reviews.llvm.org/D103583
This patch adds an option to `lookupAAFor` that allows it to return a
nullptr if the state of the looked up attribute is invalid. This is so
future passes can use this to query other attributes with the guarantee
that they are valid.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D103556
All that really matters is that the VLMAX of the preceding
instructions is the same as the VLMAX required by the mask
operation.
Also update the vmsge(u) handling to use the SEW/LMUL we use for
other mask register operations. We were matching it to the compare
before. Some cases will be improve if we fix masked compares to
use tail agnostic policy. I think they ignore the tail policy
anyway.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D103299
In --check mode we do not run code completion because it is too slow,
especially on larger files. With the introducation of --check-lines we
can narrow down the scope and thus we can afford to do code completion.
We vlog() the top completion result, but that's not really the point.
The most value will come from being able to reproduce crashes that occur
during code completion and require preamble build or index (and thus are
more difficult to reproduce with -code-complete-at).
Differential Revision: https://reviews.llvm.org/D103538
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.
Differential Revision: https://reviews.llvm.org/D103646
setcc (csel 0, 1, cond, X), 1, ne ==> csel 0, 1, !cond, X
Where X is a condition code setting instruction.
Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D103256
* Rename PadTensorOpVectorizationPattern to GenericPadTensorOpVectorizationPattern.
* Make GenericPadTensorOpVectorizationPattern a private pattern, to be instantiated via populatePadTensorOpVectorizationPatterns.
* Factor out parts of PadTensorOpVectorizationPattern into helper functions.
This commit prepares PadTensorOpVectorizationPattern for a series of subsequent commits that add more specialized PadTensorOp vectorization patterns.
Differential Revision: https://reviews.llvm.org/D103681
Patch allows using of constexpr vars evaluatable to constant calue to be
used in declare mapper construct.
Differential Revision: https://reviews.llvm.org/D103642
Convert data operands from the acc.data operation using the same conversion pattern than D102170.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103332
With this patch, the following invocation of the frontend driver will
return an error:
```
flang-new -fc1 input-file.f90 -o
```
Similar logic applies to other options that require arguments.
Similar checks are already available in the compiler driver, flang-new
(that's implemented in clangDriver).
Differential Revision: https://reviews.llvm.org/D103554
As discussed on cfe-dev [1], use the using_if_exists Clang attribute when
the compiler supports it. This makes it easier to port libc++ on top of
new platforms that don't fully support the C Standard library.
Previously, libc++ would fail to build when trying to import a missing
declaration in a <cXXXX> header. With the attribute, the declaration will
simply not be imported into namespace std, and hence it won't be available
for libc++ to use. In many cases, the declarations were *not* actually
required for libc++ to work (they were only surfaced for users to use
them as std::XXXX), so not importing them into namespace std is acceptable.
The same thing could be achieved by conscious usage of `#ifdef` along
with platform detection, however that quickly creates a maintenance
problem as libc++ is ported to new platforms. Furthermore, this problem
is exacerbated when mixed with vendor internal-only platforms, which can
lead to difficulties maintaining a downstream fork of the library.
For the time being, we only use the using_if_exists attribute when it
is supported. At some point in the future, we will start removing #ifdef
paths that are unnecessary when the attribute is supported, and folks
who need those #ifdef paths will be required to use a compiler that
supports the attribute.
[1]: http://lists.llvm.org/pipermail/cfe-dev/2020-June/066038.html
Differential Revision: https://reviews.llvm.org/D90257