Summary:
Predefined allocators should not be mapped at all (they are just enumeric
constants). FOr user-defined allocators need to map the traits only as
firstprivates, the allocator itself is private.
At the beginning of the target region the user-defined allocatores must
be created and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```
Reviewers: jdoerfert, aaron.ballman
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79257
Summary:
omp.h header file defines omp_null_allocator as a predefined allocator,
need to consider it also as a predefined allocator.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79186
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.
The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.
Reviewers: ABataev, jdoerfert, andwar
Reviewed By: jdoerfert
Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78969
Summary:
The linear parameter token in the mangling function must be multiplied
by the pointee size in bytes when the parameter is a pointer.
Reviewers: ABataev, andwar, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78965
Summary:
Added basic support for 'task' modifier in the reduction clauses in
non-simd parallel and worksharing constructs.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78738
Summary:
Patch forces codegen to use the new runtime functions for task reductions where
the issue with passing the address of the original variables to the UDR
initializers is fixed. Also, this patch is required for upcoming
support of task modifier inreduction clause.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78733
Summary:
This patch contains 2 separate changes:
1) the initializer of a variable should play no part in decl "invalid" bit;
2) preserve the invalid initializer via recovery exprs;
With 1), we will regress the diagnostics (one big regression is that we loose
the "selected 'begin' function with iterator type" diagnostic in for-range stmt;
but with 2) together, we don't have regressions (the new diagnostics seems to be
improved).
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78116
Summary:
The global variable should be captured in the region only if it was
privitized in the region or in any of the outer regions. Otherwise, it
should not be captured.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77731
Summary:
According to the standard, variable-category is the optional part of the
defaultmap clause while the compiler always requires it. Turned it into
optional part.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77751
By default, all traits in the OpenMP context selector have to match for
it to be acceptable. Though, we sometimes want a single property out of
multiple to match (=any) or no match at all (=none). We offer these
choices as extensions via
`implementation={extension(match_{all,any,none})}`
to the user. The choice will affect the entire context selector not only
the traits following the match property.
The first user will be D75788. There we can replace
```
#pragma omp begin declare variant match(device={arch(nvptx64)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
// TODO: Hack until we support an extension to the match clause that allows "or".
#undef __CLANG_CUDA_CMATH_H__
#undef __CUDA__
#pragma omp end declare variant
#pragma omp begin declare variant match(device={arch(nvptx)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
with the much simpler
```
#pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D77414
Implemented codegen for the iterator expression in the depend clauses.
Iterator construct is emitted the following way:
iterator(cnt1, cnt2, ...), in : <dep>
<TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...;
kmp_depend_t deps[<TotalNumDeps>];
deps_counter = 0;
for (cnt1) {
for (cnt2) {
...
deps[deps_counter].base_addr = &<dep>;
deps[deps_counter].size = sizeof(<dep>);
deps[deps_counter].flags = in;
deps_counter += 1;
...
}
}
For depobj construct the codegen is very similar, but the memory is
allocated dynamically and added extra first item reserved for internal use.
Currently deferred diagnostic emitter checks variable decl in DeclRefExpr, which
causes infinite recursion for cases like long a = (long)&a;.
Deferred diagnostic emitter does not need check variable decls in DeclRefExpr
since reference of a variable does not cause emission of functions directly or
indirectly. Therefore there is no need to check variable decls in DeclRefExpr.
Differential Revision: https://reviews.llvm.org/D76937
Need to allow arrayshaping expression in a list of expressions, so use
ParseAssignmentExpression() when try to parse the base of the shaping
operation.
clauses.
Implemented codegen for array shaping operation in depend clauses. The
begin of the expression is the pointer itself, while the size of the
dependence data is the mukltiplacation of all dimensions in the array
shaping expression.
Summary:
Added basic representation and parsing/sema handling of array-shaping
operations. Array shaping expression is an expression of form ([s0]..[sn])base,
where s0, ..., sn must be a positive integer, base - a pointer. This
expression is a kind of cast operation that converts pointer expression
into an array-like kind of expression.
Reviewers: rjmccall, rsmith, jdoerfert
Subscribers: guansong, arphaman, cfe-commits, caomhin, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74144
This is the second part loosely extracted from D71179 and cleaned up.
This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.
A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].
The logic is as follows:
If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
1) Create a function declaration for the definition we were about to generate.
2) Create a function definition but with a mangled name (according to
`<SELECTOR>`).
3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
one used already for `omp declare variant`, using and the mangled
function definition as specialization for the context defined by
`<SELECTOR>`.
When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman
Subscribers: bollu, guansong, openmp-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75779
This is the first part extracted from D71179 and cleaned up.
This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].
A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D74941
This reverts commit 0788acbccb.
This reverts commit c2d7a1f79cedfc9fcb518596aa839da4de0adb69: Revert "[clangd] Add test for FindTarget+RecoveryExpr (which already works). NFC"
It causes a crash on invalid code:
class X {
decltype(unresolved()) foo;
};
constexpr int s = sizeof(X);
region.
According to OpenMP 5.0, exactly one scan directive must appear in the loop body of an enclosing worksharing-loop, worksharing-loop SIMD, or simd construct on which a reduction clause with the inscan modifier is present.
If the ancestor device modifier is used and the value of the device
clause is evaluated to 1, the ancestor device shall be used for the
execution.
Since the reverse offloading is not supported yet, the target construct
execution is always initiated from the host, not from the device. So, if
the ancestor modifier is specified, just execute target region on the
host.