The test case is the IR of:
```
void func(float * restrict a, float *b, int N) {
N = 199;
#pragma omp parallel for
for (int i = 1; i < N; i++)
a[i] = b[i] + 1.0;
}
```
There's a lot of duplicated calls to find various compiler-rt libraries
from build of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implemented caching
for results avoid repeated Clang invocations.
This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.
Differential Revision: https://reviews.llvm.org/D88458
There are a couple internal builds that require the use of this flag.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D112594
This patch adds breakpoints to each target's statistics so we can track how long it takes to resolve each breakpoint. It also includes the structured data for each breakpoint so the exact breakpoint details are logged to allow for reproduction of slow resolving breakpoints. Each target gets a new "breakpoints" array that contains breakpoint details. Each breakpoint has "details" which is the JSON representation of a serialized breakpoint resolver and filter, "id" which is the breakpoint ID, and "resolveTime" which is the time in seconds it took to resolve the breakpoint. A snippet of the new data is shown here:
"targets": [
{
"breakpoints": [
{
"details": {...},
"id": 1,
"resolveTime": 0.00039291599999999999
},
{
"details": {...},
"id": 2,
"resolveTime": 0.00022679199999999999
}
],
"totalBreakpointResolveTime": 0.00061970799999999996
}
]
This provides full details on exactly how breakpoints were set and how long it took to resolve them.
Differential Revision: https://reviews.llvm.org/D112587
In ARM mode, passing -mtp=cp15 forces the use of an inline MRC system register read to move the thread pointer value into a register.
Currently, in Thumb2 mode, -mtp=cp15 is ignored, and a call to the __aeabi_read_tp helper is emitted instead.
This is inconsistent, and breaks the Linux/ARM build for Thumb2 targets, as the Linux kernel does not provide an implementation of __aeabi_read_tp,.
Reviewed By: nickdesaulniers, peter.smith
Differential Revision: https://reviews.llvm.org/D112600
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112227
Create a valid triple in the Darwin builder. Currently it was
incorrectly treating the os and version as two separate components in
the triple.
Differential revision: https://reviews.llvm.org/D112676
When we strip and accumulate constant offsets we need to pick the right
address space such that the offset APInt has the right bit width.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D112544
We do not generate _serialized_parallel calls in device mode, no
need for an external API.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D112145
Exiting a data environment will reset all values, it is wrong to adjust
them afterwards.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D112144
We will later use the fact that a barrier is aligned to reason about
thread divergence. For now we introduce the assumption and some more
documentation.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D112153
SPSExecutorAddr will now be serializable to/from ExecutorAddr, rather than
uint64_t. This improves type safety when working with serialized addresses.
Also updates the SupportFunctionCall to use an ExecutorAddrRange (rather than
a separate ExecutorAddr addr and uint64_t size field), and updates the
tpctypes::*Write data structures to use ExecutorAddr rather than
JITTargetAddress.
The OpenMP thread ID is not the hardware thread ID if we have nesting.
We need to ask the runtime properly to ensure correct results.
Note that the loop interface is going to change soon so we do not adjust
it now but simply ignore the extra argument.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D111950
The team size could/should be an ICV but since we know it is either 1 or
a value we can leave it in the team state for now. However, we still
need to determine if the current level is nested before we use it.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D111949
The first thread state in the new GPU runtime doesn't have a previous
one and we should not dereference the nullptr placeholder.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D111946
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.
Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.
This reverts commit f3190dedee.
This reverts commit 749581d21f.
This reverts commit f3df87d57e.
This reverts commit ab1dbcecd6.
We were writing a pointer to a selector string into the contents of a
string instead of overwriting the pointer to the string, leading to
corruption. This was causing non-deterministic failures of the
'trivial-objc-methods' test case.
Differential Revision: https://reviews.llvm.org/D112671
- `[[format(archetype, fmt-idx, ellipsis)]]` specifies that a function accepts a
format string and arguments according to `archetype`. This is how Clang
type-checks `printf` arguments based on the format string.
- Clang has a `-Wformat-nonliteral` warning that is triggered when a function
with the `format` attribute is called with a format string that is not
inspectable because it isn't constant. This warning is suppressed if the
caller has the `format` attribute itself and the format argument to the callee
is the caller's own format parameter.
- When using the `format` attribute on a block, Clang wouldn't recognize its
format parameter when calling another function with the format attribute. This
would cause unsuppressed -Wformat-nonliteral warnings for no supported reason.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D112569
Radar-Id: rdar://84603673
- When an unconditional branch is expanded into an indirect branch, if
there is no scavenged register, an SGPR pair needs spilling to enable
the destination PC calculation. In addition, before jumping into the
destination, that clobbered SGPR pair need restoring.
- As SGPR cannot be spilled to or restored from memory directly, the
spilling/restoring of that SGPR pair reuses the regular SGPR spilling
support but without spilling it into memory. As that spilling and
restoring points are fully controlled, we only need to spill that SGPR
into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take additional restore block,
where the restoring code is filled. After that, the relaxation will
place that restore block directly before the destination block and
insert an unconditional branch in any fall-through block into the
destination block.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D106449
This is going to be necessary to implement some range adaptors.
As a fly-by fix, rename _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI
and remove a redundant inline keyword.
Differential Revision: https://reviews.llvm.org/D112650
Instead of going through libc++'s run.py, we can simply run the executable
directly since we don't need to setup a working directory or control the
environment.
Differential Revision: https://reviews.llvm.org/D112649
The type `MoveOnlyForwardRange` violates the precondition stated in
`view.interface.general`. Specifically, the type passed to
`view_interface` shall model the `view` concept. In turn, this requires the
type to satisfy `movable` concept (and others), but this type
`MoveOnlyForwardRange` does not satisfy the `movable` concept.
Add a move assignment operator so that `MoveOnlyForwardRange` satisfies the
`movable` concept. While we're here, ensure the neighboring types that inherit
from `view_interface` also satisfy the `view` concept to avoid similar issues.
Fixes https://bugs.llvm.org/show_bug.cgi?id=50720
Reviewed By: Quuxplusone, Mordante, #libc
Differential Revision: https://reviews.llvm.org/D112631
The dump of all diagnostics of all tests under `clang/test/{CXX,SemaCXX,SemaTemplate}` was analyzed , and all the cases where there were obviously bad canonical types being printed, like `type-parameter-*-*` and `<overloaded function type>` were identified. Also a small amount of cases of missing sugar were analyzed.
This patch then spells those explicitly in the test expectations, as preparatory work for future fixes for these problems.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D110210
A resolution to the ambiguity issues created by P0522, which is a DR solving
CWG 150, did not come as expected, so we are just going to accept the change,
and watch how users digest it.
For now we deprecate the flag with a warning, and make it on by default.
We don't remove the flag completely in order to give users a chance to
work around any problems by disabling it.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D109496
Enables the arm64 MachO platform, adds basic tests, and implements the
missing TLV relocations and runtime wrapper function. The TLV
relocations are just handled as GOT accesses.
rdar://84671534
Differential Revision: https://reviews.llvm.org/D112656
ICF runs before relocation processing, but undefined symbol errors
are only emitted during relocation processing.
So just ignore Undefineds during ICF (instead of crashing) -- lld
will emit an error once ICF is done.
Fixes PR52330.
Differential Revision: https://reviews.llvm.org/D112643
This patch implements __builtin_elementwise_abs as specified in
D111529.
Reviewed By: aaron.ballman, scanon
Differential Revision: https://reviews.llvm.org/D111986
Otherwise tools like codesign_allocate will choke. We were already
handling this correctly for the other DYLD_INFO sections.
Doing this correctly is a bit subtle: we don't know if export_size will
be zero until we have run `ExportSection::finalizeContents()`. However,
we must still add the ExportSection to the `__LINKEDIT` segment in order
that it gets sorted during `sortSectionsAndSegments()`.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D112589
GEP decomposition currently checks whether the multiplication of
the linear expression offset and GEP scale overflows. However, if
everything else works correctly, this overflow check is both
unnecessary and dangerously misleading. While it will avoid an
overflow in Scale * Offset in particular, other parts of the
calculation (including those on dynamic values) may still overflow.
The code working on the decomposed GEPs is responsible for ensuring
that it remains correct in the presence of overflow. D112611 fixes
the last issue of that kind that I'm aware of (in fact, the overflow
check was originally introduced to work around precisely that issue).
Differential Revision: https://reviews.llvm.org/D112618
This diff adds a data formatter for libstdcpp's set. Besides, it unifies the tests for set for libcxx and libstdcpp for maintainability.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D112537
A constant complaint we get is that the __typeid__ symbols in the CFI
jump tables causes confusing stack traces in applications. Emit the more
readable cfi_jt aliases regardless of function export (LTO vs Thin LTO).
Reviewed By: pcc, tejohnson
Differential Revision: https://reviews.llvm.org/D107934
There's precedent for that in `CreateOr()`/`CreateAnd()`.
The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.
Refs. https://reviews.llvm.org/D109368#3089809