Summary:
This patch adds more function attribute information to the runtime function definitions in OMPKinds.def. The goal is to provide sufficient information about OpenMP runtime functions to perform more optimizations on OpenMP code.
Reviewers: jdoerfert
Subscribers: aaron.ballman cfe-commits yaxunl guansong sstefan1 llvm-commits
Tags: #OpenMP #clang #LLVM
Differential Revision: https://reviews.llvm.org/D81031
as it's causing a few unused variable warnings via the macro instantiation:
sources/llvm-project/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def:649:17: error: unused variable 'InaccessibleOnlyAttrs' [-Werror,-Wunused-variable]
__OMP_ATTRS_SET(InaccessibleOnlyAttrs,
^
This reverts commit 09fe0c5ab9.
Summary:
This patch adds more function attribute information to the runtime function definitions in OMPKinds.def. The goal is to provide sufficient information about OpenMP runtime functions to perform more optimizations on OpenMP code.
Reviewers: jdoerfert
Subscribers: aaron.ballman cfe-commits yaxunl guansong sstefan1 llvm-commits
Tags: #OpenMP #clang #llvm
Differential Revision: https://reviews.llvm.org/D81031
Since D82572, we keep "reference" edges for callback call sites. While
not strictly necessary they can improve the traversal order. However, we
did not update them properly in case a pass removed the callback call
site which caused a verification error (PR46687). With this patch we
update these reference edges properly during the invocation of
`CallGraphSCCPass::RefreshCallGraph` in non-checking mode.
Reviewed By: sdmitriev
Differential Revision: https://reviews.llvm.org/D83718
In non-SPMD mode we create a state machine like code to identify the
parallel region the GPU worker threads should execute next. The
identification uses the parallel region function pointer as that allows
it to work even if the kernel (=target region) and the parallel region
are in separate TUs. However, taking the address of a function comes
with various downsides. With this patch we will identify the most common
situation and replace the function pointer use with a dummy global
symbol (for identification purposes only). That means, if the parallel
region is only called from a single target region (or kernel), we do not
use the function pointer of the parallel region to identify it but a new
global symbol.
Fixes PR46450.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D83271
We now identify GPU kernels, that is entry points into the GPU code.
These kernels (can) correspond to OpenMP target regions. With this patch
we identify and on request print them via remarks.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D83269
This reverts commit 1d542f0ca8.
`recollectUses()` is added to prevent looking at dead uses after
Attributor run.
This is the first and most basic ICV Tracking implementation. For this
first version, we only support deduplication within the same BB.
Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku,
baziotis, lebedev.ri
Differential Revision: https://reviews.llvm.org/D81788
There appears to be some kind of memory corruption/use-after-free/etc
going on here. In particular, in `OpenMPOpt::deleteParallelRegions()`,
in `DeleteCallCB()`, `CI` is garbage.
WIll post reproducer in the original review.
This reverts commit 6c4a5e9257.
This is the first and most basic ICV Tracking implementation. For this
first version, we only support deduplication within the same BB.
Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku,
baziotis
Differential Revision: https://reviews.llvm.org/D81788
Summary:
This defines some basic information about ICVs in `OMPKinds.def`.
We also emit remarks with initial values for each function (which are default for now)
as a way to test this.
Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6
Subscribers: yaxunl, hiraditya, guansong, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82193
Summary: Couple of tests to showcase what will be done and what to expect with ICV tracking.
Reviewers: jdoerfert, JonChesterfield
Subscribers: yaxunl, guansong, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81114
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits
Tags: #openmp, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80222
Before we kept the first applicable `ident_t*` during deduplication of
runtime calls. The problem is that "first" is dependent on the iteration
order of a DenseMap. Since the proper solution, which is to combine the
information from all `ident_t*`, should be deterministic on its own, we
will not try to make the iteration order deterministic. Instead, we will
create a fresh `ident_t*` if there is not a unique existing `ident_t*`
to pick.
Summary:
AbstractCallSite::getCallbackUses() does not check that callback callee index from
the callback metadata does not exceed the total number of call arguments. This patch
add such validation check.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78112
Validation of the found runtime library functions declarations types
(return and argument types) with the expected types.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D76058
If we deduplicate OpenMP runtime calls we have multiple `ident_t*` that
represent information like source location. So far, we simply kept the
one used by the replacement call. However, as exposed by PR44893, that
can cause problems if we have stack allocated `ident_t` objects. While
we need to revisit the use of these as well, it is clear that we
eventually want to merge source location information in some way. With
this patch we add the infrastructure to do so but without doing the
actual merge. Instead we pick a global `ident_t` from the replaced
calls, if possible, or create a new one with an unknown location
instead.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74925
Parallel regions known to be read-only, e.g., after we removed all dead
write accesses, and terminating (`willreturn`) can be removed.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D69954
The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.
The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.
This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D69930