Summary:
Added support for dynamic memory allocation for globalized variables in
case if execution of target regions in parallel is required.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82324
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html
Complemantary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignemnt
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as a "assumption use".
As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.
Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71739
Summary:
When -fopenmp option is specified then version 5.0 will be set as
default.
Reviewers: gregrodgers, jdoerfert, ABataev
Reviewed By: ABataev
Subscribers: pdhaliwal, yaxunl, guansong, sstefan1, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81098
Otherwise, it's painful to insert new code. There are many existing
examples in the same test file where the line numbers are not
hard-coded.
I intend to do the same for several other OpenMP tests, but I want to
be sure there are no objections before I spend time on it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D82224
Summary:
Compiler may erroneously treat current context in OpenMP pragmas as the
context where new type declaration/definition is allowed. But the
declartation/definition of the new types in OpenMP pragmas should not be
allowed.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82019
Summary:
According to OpenMP 5.0, nonmonotonic modifier can be used with all
schedule kinds, not only dynamic and guided as in OpenMP 4.5.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82026
Summary:
The OpenMP loops are normalized and transformed into the loops from 0 to
max number of iterations. In some cases, original scheme may lead to
overflow during calculation of number of iterations. If it is unknown,
if we can end up with overflow or not (the bounds are not constant and
we cannot define if there is an overflow), cast original type to the
unsigned.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, openmp-commits, cfe-commits, caomhin
Tags: #clang, #openmp
Differential Revision: https://reviews.llvm.org/D81881
Summary:
SYCL and OpenMP prohibits thread local storage in device code,
so this commit ensures that error is emitted for device code and not
emitted for host code when host target supports it.
Reviewers: jdoerfert, erichkeane, bader
Reviewed By: jdoerfert, erichkeane
Subscribers: guansong, riccibruno, ABataev, yaxunl, ebevhan, Anastasia, sstefan1, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81641
Summary:
According to OpenMP, During execution of an iteration of a worksharing-loop or a loop nest within a worksharing-loop, simd, or worksharing-loop SIMD region, a thread must not execute more than one ordered region corresponding to an ordered construct without a depend clause.
Need to report an error in this case.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81951
Reland https://reviews.llvm.org/D76696
All known crashes have been fixed, another attemption.
We have rolled out this to all internal users for a while, didn't see
big issues, we consider it is stable enough.
Reviewed By: sammccall
Subscribers: rsmith, hubert.reinterpretcast, ebevhan, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78350
Summary:
Added codegen for use_device_addr clause. The components of the list
items are mapped as a kind of RETURN components and then the returned
base address is used instead of the real address of the base declaration
used in the use_device_addr expressions.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80730
Summary:
If the variables must be globalized in OpenMP mode (local automatic
variable, GPU compilation mode, the variable may escape its declaration
context by the reference or by the pointer), it should not be considered
as the NRVO candidate. Otherwise, incorrect the return value of the
function might not be updated.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80936
Summary:
If the array subscript expression is type depent, its analysis must be
delayed before its instantiation.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, caomhin, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78637
Summary:
Nvidia PTX does not allow `.` to appear in identifiers, so OpenMP variant mangling now uses `$` to separate segments of the mangled name for variants of functions declared via `declare variant`.
Reviewers: jdoerfert, Hahnfeld
Reviewed By: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits
Tags: #openmp, #clang
Differential Revision: https://reviews.llvm.org/D80439
Summary:
If the data member is mapped as an array section, need to emit the
pointer to the last element of this array section and use this pointer
as the highest element in partial struct data.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81037
Summary:
Added initial codegen for 'affinity' clauses on task directives.
Emits next code:
```
kmp_task_affinity_info_t affs[<num_elems>];
void *td = __kmpc_task_alloc(..);
affs[<i>].base = &data_i;
affs[<i>].size = sizeof(data_i);
__kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs);
```
The result returned by the call of `__kmpc_omp_reg_task_with_affinity`
function is ignored currently sincethe runtime currently ignores args
and returns 0 uncoditionally.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80240
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits
Tags: #openmp, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80222
Summary:
Do not ask size of type if it is dependent. ASTContext doesn't seem expecting
this.
Reviewers: jdoerfert, ABataev, bader
Reviewed By: ABataev
Subscribers: yaxunl, guansong, ebevhan, Anastasia, sstefan1, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80829
Summary:
Diagnostic is emitted if some declaration of unsupported type
declaration is used inside device code.
Memcpy operations for structs containing member with unsupported type
are allowed. Fixed crash on attempt to emit diagnostic outside of the
functions.
The approach is generalized between SYCL and OpenMP.
CUDA/OMP deferred diagnostic interface is going to be used for SYCL device.
Reviewers: rsmith, rjmccall, ABataev, erichkeane, bader, jdoerfert, aaron.ballman
Reviewed By: jdoerfert
Subscribers: guansong, sstefan1, yaxunl, mgorny, bader, ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74387
We currently diagnose static data members directly contained in unnamed classes,
but we should also diagnose when they're in a class that is nested (directly or
indirectly) in an unnamed class. Do this by iterating up the list of parent
DeclContexts and checking if any is an unnamed class.
Similarly also check for function or method DeclContexts (which includes things
like blocks and openmp captured statements) as then the class is considered to
be a local class, which means static data members aren't allowed.
Differential Revision: https://reviews.llvm.org/D80295
Fixes PR45753
When a program that contains a loop to which both `omp parallel for`
pragma and `clang loop` pragma are associated is compiled with the
-fopenmp option, `clang loop` pragma did not take effect. The example
below should not be vectorized by the `clang loop` pragma but it was
actually vectorized. The cause is that `llvm.loop.vectorize.width`
was not output to the IR when -fopenmp is specified.
The fix attaches attributes if they exist for the loop.
[example.c]
```
int a[100], b[100];
void foo() {
#pragma omp parallel for
#pragma clang loop vectorize(disable)
for (int i = 0; i < 100; i++)
a[i] += b[i] * i;
}
```
[compile]
```
$ clang -O2 -fopenmp example.c -c -Rpass=vect
example.c:3:11: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
#pragma omp parallel for
^
```
[IR with -fopenmp]
```
$ clang -O2 exmaple.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fopenmp | grep 'vectorize\.width'
```
[IR with -fno-openmp]
```
$ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fno-openmp | grep 'vectorize\.width'
!7 = !{!"llvm.loop.vectorize.width", i32 1}
```
Differential Revision: https://reviews.llvm.org/D79921
Summary:
No need to generate inlined OpenMP region for variables captured in
lambdas or block decls, only for implicitly captured variables in the
OpenMP region.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79966
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
Summary:
With recovery expr, it is possible that we have a value-dependent expr
within non-dependent context.
Reviewers: sammccall, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80200
Summary:
Predefined allocators should not be mapped at all (they are just enumeric
constants). FOr user-defined allocators need to map the traits only as
firstprivates, the allocator itself is private.
At the beginning of the target region the user-defined allocatores must
be created and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```
Reviewers: jdoerfert, aaron.ballman
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79257
Summary:
omp.h header file defines omp_null_allocator as a predefined allocator,
need to consider it also as a predefined allocator.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79186
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.
The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.
Reviewers: ABataev, jdoerfert, andwar
Reviewed By: jdoerfert
Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78969
Summary:
The linear parameter token in the mangling function must be multiplied
by the pointee size in bytes when the parameter is a pointer.
Reviewers: ABataev, andwar, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78965
Summary:
Added basic support for 'task' modifier in the reduction clauses in
non-simd parallel and worksharing constructs.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78738
Summary:
Patch forces codegen to use the new runtime functions for task reductions where
the issue with passing the address of the original variables to the UDR
initializers is fixed. Also, this patch is required for upcoming
support of task modifier inreduction clause.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78733
Summary:
This patch contains 2 separate changes:
1) the initializer of a variable should play no part in decl "invalid" bit;
2) preserve the invalid initializer via recovery exprs;
With 1), we will regress the diagnostics (one big regression is that we loose
the "selected 'begin' function with iterator type" diagnostic in for-range stmt;
but with 2) together, we don't have regressions (the new diagnostics seems to be
improved).
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78116
Summary:
The global variable should be captured in the region only if it was
privitized in the region or in any of the outer regions. Otherwise, it
should not be captured.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77731