Summary:
According to OpenMP, During execution of an iteration of a worksharing-loop or a loop nest within a worksharing-loop, simd, or worksharing-loop SIMD region, a thread must not execute more than one ordered region corresponding to an ordered construct without a depend clause.
Need to report an error in this case.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81951
Reland https://reviews.llvm.org/D76696
All known crashes have been fixed, another attemption.
We have rolled out this to all internal users for a while, didn't see
big issues, we consider it is stable enough.
Reviewed By: sammccall
Subscribers: rsmith, hubert.reinterpretcast, ebevhan, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78350
Summary:
Added codegen for use_device_addr clause. The components of the list
items are mapped as a kind of RETURN components and then the returned
base address is used instead of the real address of the base declaration
used in the use_device_addr expressions.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80730
Summary:
If the variables must be globalized in OpenMP mode (local automatic
variable, GPU compilation mode, the variable may escape its declaration
context by the reference or by the pointer), it should not be considered
as the NRVO candidate. Otherwise, incorrect the return value of the
function might not be updated.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80936
Summary:
If the array subscript expression is type depent, its analysis must be
delayed before its instantiation.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, caomhin, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78637
Summary:
Nvidia PTX does not allow `.` to appear in identifiers, so OpenMP variant mangling now uses `$` to separate segments of the mangled name for variants of functions declared via `declare variant`.
Reviewers: jdoerfert, Hahnfeld
Reviewed By: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits
Tags: #openmp, #clang
Differential Revision: https://reviews.llvm.org/D80439
Summary:
If the data member is mapped as an array section, need to emit the
pointer to the last element of this array section and use this pointer
as the highest element in partial struct data.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81037
Summary:
Added initial codegen for 'affinity' clauses on task directives.
Emits next code:
```
kmp_task_affinity_info_t affs[<num_elems>];
void *td = __kmpc_task_alloc(..);
affs[<i>].base = &data_i;
affs[<i>].size = sizeof(data_i);
__kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs);
```
The result returned by the call of `__kmpc_omp_reg_task_with_affinity`
function is ignored currently sincethe runtime currently ignores args
and returns 0 uncoditionally.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80240
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits
Tags: #openmp, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80222
Summary:
Do not ask size of type if it is dependent. ASTContext doesn't seem expecting
this.
Reviewers: jdoerfert, ABataev, bader
Reviewed By: ABataev
Subscribers: yaxunl, guansong, ebevhan, Anastasia, sstefan1, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80829
Summary:
Diagnostic is emitted if some declaration of unsupported type
declaration is used inside device code.
Memcpy operations for structs containing member with unsupported type
are allowed. Fixed crash on attempt to emit diagnostic outside of the
functions.
The approach is generalized between SYCL and OpenMP.
CUDA/OMP deferred diagnostic interface is going to be used for SYCL device.
Reviewers: rsmith, rjmccall, ABataev, erichkeane, bader, jdoerfert, aaron.ballman
Reviewed By: jdoerfert
Subscribers: guansong, sstefan1, yaxunl, mgorny, bader, ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74387
We currently diagnose static data members directly contained in unnamed classes,
but we should also diagnose when they're in a class that is nested (directly or
indirectly) in an unnamed class. Do this by iterating up the list of parent
DeclContexts and checking if any is an unnamed class.
Similarly also check for function or method DeclContexts (which includes things
like blocks and openmp captured statements) as then the class is considered to
be a local class, which means static data members aren't allowed.
Differential Revision: https://reviews.llvm.org/D80295
Fixes PR45753
When a program that contains a loop to which both `omp parallel for`
pragma and `clang loop` pragma are associated is compiled with the
-fopenmp option, `clang loop` pragma did not take effect. The example
below should not be vectorized by the `clang loop` pragma but it was
actually vectorized. The cause is that `llvm.loop.vectorize.width`
was not output to the IR when -fopenmp is specified.
The fix attaches attributes if they exist for the loop.
[example.c]
```
int a[100], b[100];
void foo() {
#pragma omp parallel for
#pragma clang loop vectorize(disable)
for (int i = 0; i < 100; i++)
a[i] += b[i] * i;
}
```
[compile]
```
$ clang -O2 -fopenmp example.c -c -Rpass=vect
example.c:3:11: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
#pragma omp parallel for
^
```
[IR with -fopenmp]
```
$ clang -O2 exmaple.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fopenmp | grep 'vectorize\.width'
```
[IR with -fno-openmp]
```
$ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fno-openmp | grep 'vectorize\.width'
!7 = !{!"llvm.loop.vectorize.width", i32 1}
```
Differential Revision: https://reviews.llvm.org/D79921
Summary:
No need to generate inlined OpenMP region for variables captured in
lambdas or block decls, only for implicitly captured variables in the
OpenMP region.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79966
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
Summary:
With recovery expr, it is possible that we have a value-dependent expr
within non-dependent context.
Reviewers: sammccall, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80200
Summary:
Predefined allocators should not be mapped at all (they are just enumeric
constants). FOr user-defined allocators need to map the traits only as
firstprivates, the allocator itself is private.
At the beginning of the target region the user-defined allocatores must
be created and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```
Reviewers: jdoerfert, aaron.ballman
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79257
Summary:
omp.h header file defines omp_null_allocator as a predefined allocator,
need to consider it also as a predefined allocator.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79186
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.
The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.
Reviewers: ABataev, jdoerfert, andwar
Reviewed By: jdoerfert
Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78969
Summary:
The linear parameter token in the mangling function must be multiplied
by the pointee size in bytes when the parameter is a pointer.
Reviewers: ABataev, andwar, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78965
Summary:
Added basic support for 'task' modifier in the reduction clauses in
non-simd parallel and worksharing constructs.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78738
Summary:
Patch forces codegen to use the new runtime functions for task reductions where
the issue with passing the address of the original variables to the UDR
initializers is fixed. Also, this patch is required for upcoming
support of task modifier inreduction clause.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78733
Summary:
This patch contains 2 separate changes:
1) the initializer of a variable should play no part in decl "invalid" bit;
2) preserve the invalid initializer via recovery exprs;
With 1), we will regress the diagnostics (one big regression is that we loose
the "selected 'begin' function with iterator type" diagnostic in for-range stmt;
but with 2) together, we don't have regressions (the new diagnostics seems to be
improved).
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78116
Summary:
The global variable should be captured in the region only if it was
privitized in the region or in any of the outer regions. Otherwise, it
should not be captured.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77731
Summary:
According to the standard, variable-category is the optional part of the
defaultmap clause while the compiler always requires it. Turned it into
optional part.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77751
By default, all traits in the OpenMP context selector have to match for
it to be acceptable. Though, we sometimes want a single property out of
multiple to match (=any) or no match at all (=none). We offer these
choices as extensions via
`implementation={extension(match_{all,any,none})}`
to the user. The choice will affect the entire context selector not only
the traits following the match property.
The first user will be D75788. There we can replace
```
#pragma omp begin declare variant match(device={arch(nvptx64)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
// TODO: Hack until we support an extension to the match clause that allows "or".
#undef __CLANG_CUDA_CMATH_H__
#undef __CUDA__
#pragma omp end declare variant
#pragma omp begin declare variant match(device={arch(nvptx)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
with the much simpler
```
#pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D77414
Implemented codegen for the iterator expression in the depend clauses.
Iterator construct is emitted the following way:
iterator(cnt1, cnt2, ...), in : <dep>
<TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...;
kmp_depend_t deps[<TotalNumDeps>];
deps_counter = 0;
for (cnt1) {
for (cnt2) {
...
deps[deps_counter].base_addr = &<dep>;
deps[deps_counter].size = sizeof(<dep>);
deps[deps_counter].flags = in;
deps_counter += 1;
...
}
}
For depobj construct the codegen is very similar, but the memory is
allocated dynamically and added extra first item reserved for internal use.
Currently deferred diagnostic emitter checks variable decl in DeclRefExpr, which
causes infinite recursion for cases like long a = (long)&a;.
Deferred diagnostic emitter does not need check variable decls in DeclRefExpr
since reference of a variable does not cause emission of functions directly or
indirectly. Therefore there is no need to check variable decls in DeclRefExpr.
Differential Revision: https://reviews.llvm.org/D76937
Need to allow arrayshaping expression in a list of expressions, so use
ParseAssignmentExpression() when try to parse the base of the shaping
operation.
clauses.
Implemented codegen for array shaping operation in depend clauses. The
begin of the expression is the pointer itself, while the size of the
dependence data is the mukltiplacation of all dimensions in the array
shaping expression.
Summary:
Added basic representation and parsing/sema handling of array-shaping
operations. Array shaping expression is an expression of form ([s0]..[sn])base,
where s0, ..., sn must be a positive integer, base - a pointer. This
expression is a kind of cast operation that converts pointer expression
into an array-like kind of expression.
Reviewers: rjmccall, rsmith, jdoerfert
Subscribers: guansong, arphaman, cfe-commits, caomhin, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74144
This is the second part loosely extracted from D71179 and cleaned up.
This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.
A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].
The logic is as follows:
If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
1) Create a function declaration for the definition we were about to generate.
2) Create a function definition but with a mangled name (according to
`<SELECTOR>`).
3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
one used already for `omp declare variant`, using and the mangled
function definition as specialization for the context defined by
`<SELECTOR>`.
When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman
Subscribers: bollu, guansong, openmp-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75779
This is the first part extracted from D71179 and cleaned up.
This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].
A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D74941
This reverts commit 0788acbccb.
This reverts commit c2d7a1f79cedfc9fcb518596aa839da4de0adb69: Revert "[clangd] Add test for FindTarget+RecoveryExpr (which already works). NFC"
It causes a crash on invalid code:
class X {
decltype(unresolved()) foo;
};
constexpr int s = sizeof(X);
region.
According to OpenMP 5.0, exactly one scan directive must appear in the loop body of an enclosing worksharing-loop, worksharing-loop SIMD, or simd construct on which a reduction clause with the inscan modifier is present.
If the ancestor device modifier is used and the value of the device
clause is evaluated to 1, the ancestor device shall be used for the
execution.
Since the reverse offloading is not supported yet, the target construct
execution is always initiated from the host, not from the device. So, if
the ancestor modifier is specified, just execute target region on the
host.
Avoid copying of the orignal variable if it is going to be marked as
firstprivate in task regions. For taskloops, still need to copy the
non-trvially copyable variables to correctly construct them upon task
creation.
induction variable abends.
Used incorrect loop bound when trying to calculate the index in the vec
array for doacross construct in the loops with the reverse order.
Added codegen for update clause in depobj. Reads the number of the
elements from the first element and updates flags for each element in
the loop.
```
omp_depend_t x;
kmp_depend_info *base = (kmp_depend_info *)x;
intptr_t num = x[-1].base_addr;
kmp_depend_info *end = x + num;
kmp_depend_info *el = base;
do {
el.flags = new_flag;
el = &el[1];
} while (el != end);
```
in depobj object.
The first element in the list of the dependencies is used for internal
purposes to store the number of the elements in the provided list.
The first element now is skipped and depobj object poits exactly to the
list of dependencies.
Added codegen for 'depend' clause in depobj directive. The depend clause
is emitted as kmp_depend_info <deps>[<number_of_items_in_clause> + 1]. The
first element in this array is reserved for storing the number of
elements in this array: <deps>[0].base_addr =
<number_of_items_in_clause>;
This extra element is required to implement 'update' and 'destroy'
clauses. It is required to know the size of array to destroy it
correctly and to update depency kind.
Chen
Summary:
Base declaration in pointer arithmetic expression is determined by
binary search with type information. Take "int *a, *b; *(a+*b)" as an
example, we determine the base by checking the type of LHS and RHS. In
this case the type of LHS is "int *", the type of RHS is "int",
therefore, we know that we need to visit LHS in order to find base
declaration.
Reviewers: ABataev, jdoerfert
Reviewed By: ABataev
Subscribers: guansong, cfe-commits, sandoval, dreachem
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75077
If we deduplicate OpenMP runtime calls we have multiple `ident_t*` that
represent information like source location. So far, we simply kept the
one used by the replacement call. However, as exposed by PR44893, that
can cause problems if we have stack allocated `ident_t` objects. While
we need to revisit the use of these as well, it is clear that we
eventually want to merge source location information in some way. With
this patch we add the infrastructure to do so but without doing the
actual merge. Instead we pick a global `ident_t` from the replaced
calls, if possible, or create a new one with an unknown location
instead.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74925
Compute and propagate conversion kind to diagnostics helper in C++
to provide more specific diagnostics about incorrect implicit
conversions in assignments, initializations, params, etc...
Duplicated some diagnostics as errors because C++ is more strict.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74116
This patch introduces a new helper class `OMPBuilderCBHelpers`,
which will contain all reusable C/C++ language specific function-
alities required by the `OMPIRBuilder`.
Initially, this helper class contains the body and finalization
codegen functionalities implemented using callbacks which were
moved here for reusability among the different directives
implemented in the `OMPIRBuilder`, along with RAIIs for preserving
state prior to emitting outlined and/or inlined OpenMP regions.
In the future this helper class will also contain all the different
call backs required by OpenMP clauses/variable privatization.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D74562
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.
There are some changes in OpenMP and Instrumentation tests, where
bitcasts now get constant folded. I've also highlighted one
InstCombine test which now finishes in two rather than three
iterations, thanks to new instructions being inserted into the
worklist.
Differential Revision: https://reviews.llvm.org/D74787
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.
There are some changes in OpenMP tests, where bitcasts now get
constant folded. I've also highlighted one InstCombine test which
now finishes in two rather than three iterations, thanks to new
instructions being inserted into the worklist.
Differential Revision: https://reviews.llvm.org/D74787
This patch removes the explicit call graph for CUDA/HIP/OpenMP deferred
diagnostics generated during parsing since it is error prone due to
incomplete information about function declarations during parsing. In stead,
this patch does a post-parsing AST traverse and emits deferred diagnostics
based on the use graph implicitly generated during the traverse.
Differential Revision: https://reviews.llvm.org/D70172
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.
All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From
these definitions we generate the enum classes `TraitSet`,
`TraitSelector`, and `TraitProperty` as well as conversion and helper
functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`.
The above enum classes are used in the parser, sema, and the AST
attribute. The latter is not a collection of multiple primitive variant
arguments that contain encodings via numbers and strings but instead a
tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`).
The changes to the parser make it more forgiving when wrong syntax is
read and they also resulted in more specialized diagnostics. The tests
are updated and the core issues are detected as before. Here and
elsewhere this patch tries to be generic, thus we do not distinguish
what selector set, selector, or property is parsed except if they do
behave exceptionally, as for example `user={condition(EXPR)}` does.
The sema logic changed in two ways: First, the OMPDeclareVariantAttr
representation changed, as mentioned above, and the sema was adjusted to
work with the new `OpenMPTraitInfo`. Second, the matching and scoring
logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat
representation of the `match` clause that is not tied to clang.
`OpenMPTraitInfo` provides a method to generate this flat structure (see
`struct VariantMatchInfo`) by computing integer score values and boolean
user conditions from the `clang::Expr` we keep for them.
The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the `VariantMatchInfo`, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.
---
Test changes:
The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have
been auto generated to match the new warnings and notes of the parser.
The "subset" checks were reversed causing the wrong version to be
picked. The tests have been adjusted to correct this.
We do not print scores if the user did not provide one.
We print spaces to make lists in the `match` clause more legible.
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71830
The code generation is exactly the same as it was.
But not that the special handling of untied tasks is still handled by
emitUntiedSwitch in clang.
Differential Revision: https://reviews.llvm.org/D69828
According to OpenMP 5.0, cancel and cancellation point constructs are
supported in taskloop directive. Added support for cancellation in
taskloop, master taskloop and parallel master taskloop.
Finalization can introduce new blocks we need to outline as well so it
makes sense to identify the blocks that need to be outlined after
finalization happened. There was also a minor unit test adjustment to
account for the fact that we have a single outlined exit block now.
directive.
According to OpenMP 5.0, The atomic_default_mem_order clause specifies the default memory ordering behavior for atomic constructs that must be provided by an implementation. If the default memory ordering is specified as seq_cst, all atomic constructs on which memory-order-clause is not specified behave as if the seq_cst clause appears. If the default memory ordering is specified as relaxed, all atomic constructs on which memory-order-clause is not specified behave as if the relaxed clause appears.
If the default memory ordering is specified as acq_rel, atomic constructs on which memory-order-clause is not specified behave as if the release clause appears if the atomic write or atomic update operation is specified, as if the acquire clause appears if the atomic read operation is specified, and as if the acq_rel clause appears if the atomic captured update operation is specified.
Added restrictions for atomic directive.
1. If atomic-clause is read then memory-order-clause must not be acq_rel or release.
2. If atomic-clause is write then memory-order-clause must not be
acq_rel or acquire.
3. If atomic-clause is update or not present then memory-order-clause
must not be acq_rel or acquire.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
Summary:
Due to a recent (but retroactive) C++ rule change, only sufficiently
C-compatible classes are permitted to be given a typedef name for
linkage purposes. Add an enabled-by-default warning for these cases, and
rephrase our existing error for the case where we encounter the typedef
name for linkage after we've already computed and used a wrong linkage
in terms of the new rule.
Reviewers: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74103
Add support for Flush in the OMPIRBuilder. This patch also adds changes
to clang to use the OMPIRBuilder when '-fopenmp-enable-irbuilder'
commandline option is used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D70712
Summary:
Clang -fpic defaults to -fno-semantic-interposition (GCC -fpic defaults
to -fsemantic-interposition).
Users need to specify -fsemantic-interposition to get semantic
interposition behavior.
Semantic interposition is currently a best-effort feature. There may
still be some cases where it is not handled well.
Reviewers: peter.smith, rnk, serge-sans-paille, sfertile, jfb, jdoerfert
Subscribers: dschuff, jyknight, dylanmckay, nemanjai, jvesely, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, arphaman, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73865
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
regions.
If the lastprivate conditional is passed as shared in inner region, we
shall check if it was ever changed and use this updated value after exit
from the inner region as an update value.
regions with reductions, lastprivates or linears clauses.
If the lastprivate conditional variable is updated in inner parallel
region with reduction, lastprivate or linear clause, the value must be
considred as a candidate for lastprivate conditional. Also, tracking in
inner parallel regions is not required.
If local allocator was declared and used in the allocate clause, it was
not captured in inner region. It leads to a compiler crash, need to
capture the allocator declarator.
Target regions have implicit outer region which may erroneously capture
some globals when it should not. It may lead to a compiler crash at the
compile time.
If current kind of the translation unit is TU_Prefix and it is not
complete, cannot decide what to do with virtual members/table at that
time, need to delay it to later stages.
There are no special virtual function handlers (like __cxa_pure_virtual)
defined for NVPTX target, so just emit such functions as null pointers
to prevent issues with linking and unresolved references.
Just marking a symbol as weak_odr/linkonce_odr isn't enough for
actually tolerating multiple copies of it at linking on windows,
it has to be made a proper comdat; make it comdat for all platforms
for consistency.
This should hopefully fix
https://bugzilla.mozilla.org/show_bug.cgi?id=1566288.
Differential Revision: https://reviews.llvm.org/D71572
If standalone OpenMP declaration pragma, like declare mapper or declare
reduction, is declared in the class context, it may reference a member
(data or function) in its internal expressions/statements. So, the
parsing of such pragmas must be dalayed just like the parsing of the
member initializers/definitions before the completion of the class
declaration.
declare simd.
According to the standard, a list-item that appears in a linear clause without the ref modifier must be of integral or pointer type, or must be a reference to an integral or pointer type. Added check that this restriction is applied only to non-ref items.
The OpenMP specification disallows having zero-length array
sections in the depend clause (OpenMP 5.0 2.17.11).
Differential Revision: https://reviews.llvm.org/D71969
Added codegen support for lastprivate conditional. According to the
standard, if when the conditional modifier appears on the clause, if an
assignment to a list item is encountered in the construct then the
original list item is assigned the value that is assigned to the new
list item in the sequentially last iteration or lexically last section
in which such an assignment is encountered.
We look for the assignment operations and check if the left side
references lastprivate conditional variable. Then the next code is
emitted:
if (last_iv_a <= iv) {
last_iv_a = iv;
last_a = lp_a;
}
At the end the implicit barrier is generated to wait for the end of all
threads and then in the check for the last iteration the private copy is
assigned the last value.
if (last_iter) {
lp_a = last_a; // <--- new code
a = lp_a; // <--- store of private value to the original variable.
}
Summary: `getListOfPossibleValues()` formatted incorrectly when there is only one value, emitting something like `expected 'conditional' or in OpenMP clause 'lastprivate'`.
Reviewers: jdoerfert, ABataev
Reviewed By: jdoerfert
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71884
The new check line is compatible with the clang code generation check
line as it allows a 64 and 32 bit value.
I hope this makes the llvm-clang-win-x-armv7l buildbot happy.
This allows to use the OpenMPIRBuilder for parallel regions. Code was
extracted from D61953 and adapted to work with the new version (D70109).
All but one feature should be supported. An update of this patch will
provide test coverage and privatization other than shared.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D70290
Summary:
Basic codegen for the declarations marked as nontemporal. Also, if the
base declaration in the member expression is marked as nontemporal,
lvalue for member decl access inherits nonteporal flag from the base
lvalue.
Reviewers: rjmccall, hfinkel, jdoerfert
Subscribers: guansong, arphaman, caomhin, kkwli0, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71708
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
When parsing the code with OpenMP and the function's body must be
skipped, need to skip also OpenMP annotation tokens. Otherwise the
counters for braces/parens are unbalanced and parsing fails.
in declare variant.
If the types of the fnction are not equal, but match, at the codegen
thei may have different types. This may lead to compiler crash.
Summary:
Make sure that auxiliary target specific macros are defined in OpenMP
mode.
Reviewers: ABataev, jdoerfert
Subscribers: guansong, ebevhan, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71413
This is a follow up patch to use the OpenMP-IR-Builder, as discussed on
the mailing list ([1] and later) and at the US Dev Meeting'19.
[1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69922
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
directive.
Fixed capturing of the if condition if no modifer was specified in this
condition. Previously could capture it only in outer region and it could
lead to a compiler crash.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
variant directive.
If the function is used only in declare variant directive as a variant
function, it should not be marked as used to prevent emission of the
target-specific functions. Build the reference in the unevaluated
context.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause is false, the non-vectorized version of the
loop must be executed.
Summary:
D30644 added OpenMP offloading to AArch64 targets, then D32035 changed the
frontend to throw an error when offloading is requested for an unsupported
target architecture. However the latter did not include AArch64 in the list
of supported architectures, causing the following unit tests to fail:
libomptarget :: api/omp_get_num_devices.c
libomptarget :: mapping/pr38704.c
libomptarget :: offloading/offloading_success.c
libomptarget :: offloading/offloading_success.cpp
Reviewers: pawosm01, gtbercea, jdoerfert, ABataev
Subscribers: kristof.beyls, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70804
A trivially copyable type provides a trivial copy constructor and a trivial
copy assignment operator. This is enough for the runtime to memcpy the data
to the device. Additionally there must be no virtual functions or virtual
base classes and the destructor is guaranteed to be trivial, ie performs
no action.
The runtime does not require trivial default constructors because on alloc
the memory is undefined. Thus, weaken the warning to be only issued if the
mapped type is not trivially copyable.
Differential Revision: https://reviews.llvm.org/D71134
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
According to OpenMP 5.0, if clause can be used in parallel for simd directive. If condition in the if clause if false, the non-vectorized version of the
loop must be executed.
Need to perform the instantiation of the combiner/initializer even if
the resulting type is not dependent, if the construct is defined in
templates in some cases.
Summary:
Currently, we ignore all locality attributes/info when building for
the device and thus all symblos are externally visible and can be
preemted at the runtime. It may lead to incorrect results. We need to
follow the same logic, compiler uses for static/pie builds. But in some
cases changing of dso locality may lead to problems with codegen, so
instead mark external symbols as hidden instead in the device code.
Reviewers: jdoerfert
Subscribers: guansong, caomhin, kkwli0, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70549
directives.
If the default datasharing is set to none, the datasharing attributes
for variables in the condition of the if clause for the inner taskloop
directive must be verified.
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
According to OpenMP 5.0, if clause can be used in simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
Summary:
For the extended defaultmap, most of the work is inside sema.
The only difference for codegen is to set different initial
maptype for different implicit-behavior.
Reviewers: jdoerfert, ABataev
Reviewed By: ABataev
Subscribers: dreachem, sandoval, cfe-commits
Tags: #clang, #openmp
Differential Revision: https://reviews.llvm.org/D69204
Add assignment operator in the test to check that even if the operator
was declare explicitly, the constructor is called in the user-defined
reduction initializer anyway.
simd-based directives.
According to OpenMP 5.0 standard, ordered simd, atomic and simd
directives are allowed as nested directives in the simd-based
directives.
If the context selector score was not specified, its value must be set
to 0. Simplify the processing of unspecified scores + save memory in
attribute representation.