2021-05-06 06:13:14 +08:00
|
|
|
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" "reduction_size[.].+[.]" "pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _
|
|
|
|
// RUN: %clang_cc1 -verify -triple x86_64-apple-darwin10 -fopenmp -x c++ -emit-llvm %s -o - -DUNTIEDRT | FileCheck %s --check-prefix=CHECK1
|
2020-09-15 23:21:47 +08:00
|
|
|
// RUN: %clang_cc1 -fopenmp -x c++ -triple x86_64-apple-darwin10 -emit-pch -o %t %s -DUNTIEDRT
|
2021-05-06 06:13:14 +08:00
|
|
|
// RUN: %clang_cc1 -fopenmp -x c++ -triple x86_64-apple-darwin10 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK2
|
|
|
|
// RUN: %clang_cc1 -verify -triple x86_64-apple-darwin10 -fopenmp -fopenmp-enable-irbuilder -x c++ -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK3
|
2019-12-12 17:19:10 +08:00
|
|
|
// RUN: %clang_cc1 -fopenmp -fopenmp-enable-irbuilder -x c++ -triple x86_64-apple-darwin10 -emit-pch -o %t %s
|
2021-05-06 06:13:14 +08:00
|
|
|
// RUN: %clang_cc1 -fopenmp -fopenmp-enable-irbuilder -x c++ -triple x86_64-apple-darwin10 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK4
|
2017-12-30 02:07:07 +08:00
|
|
|
|
2021-05-19 10:52:53 +08:00
|
|
|
// RUN: %clang_cc1 -verify -triple x86_64-apple-darwin10 -fopenmp-simd -x c++ -emit-llvm %s -o - | FileCheck %s --implicit-check-not="{{__kmpc|__tgt}}"
|
2017-12-30 02:07:07 +08:00
|
|
|
// RUN: %clang_cc1 -fopenmp-simd -x c++ -triple x86_64-apple-darwin10 -emit-pch -o %t %s
|
2021-05-19 10:52:53 +08:00
|
|
|
// RUN: %clang_cc1 -fopenmp-simd -x c++ -triple x86_64-apple-darwin10 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s --implicit-check-not="{{__kmpc|__tgt}}"
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
// expected-no-diagnostics
|
|
|
|
#ifndef HEADER
|
|
|
|
#define HEADER
|
|
|
|
|
2020-09-15 23:21:47 +08:00
|
|
|
enum omp_allocator_handle_t {
|
|
|
|
omp_null_allocator = 0,
|
|
|
|
omp_default_mem_alloc = 1,
|
|
|
|
omp_large_cap_mem_alloc = 2,
|
|
|
|
omp_const_mem_alloc = 3,
|
|
|
|
omp_high_bw_mem_alloc = 4,
|
|
|
|
omp_low_lat_mem_alloc = 5,
|
|
|
|
omp_cgroup_mem_alloc = 6,
|
|
|
|
omp_pteam_mem_alloc = 7,
|
|
|
|
omp_thread_mem_alloc = 8,
|
|
|
|
KMP_ALLOCATOR_MAX_HANDLE = __UINTPTR_MAX__
|
|
|
|
};
|
|
|
|
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
struct S {
|
|
|
|
int a;
|
|
|
|
S() : a(0) {}
|
|
|
|
S(const S &s) : a(s.a) {}
|
|
|
|
~S() {}
|
|
|
|
};
|
|
|
|
int a;
|
|
|
|
int main() {
|
|
|
|
char b;
|
[OPENMP] Codegen for 'depend' clause (OpenMP 4.0).
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
2015-06-24 19:01:36 +08:00
|
|
|
S s[2];
|
2015-08-31 15:32:19 +08:00
|
|
|
int arr[10][a];
|
2016-05-10 18:36:51 +08:00
|
|
|
#pragma omp task shared(a, b, s) priority(b)
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
{
|
|
|
|
a = 15;
|
|
|
|
b = a;
|
[OPENMP] Codegen for 'depend' clause (OpenMP 4.0).
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
2015-06-24 19:01:36 +08:00
|
|
|
s[0].a = 10;
|
|
|
|
}
|
2015-08-31 15:32:19 +08:00
|
|
|
#pragma omp task shared(a, s) depend(in : a, b, s, arr[:])
|
[OPENMP] Codegen for 'depend' clause (OpenMP 4.0).
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
2015-06-24 19:01:36 +08:00
|
|
|
{
|
|
|
|
a = 15;
|
|
|
|
s[1].a = 10;
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
}
|
|
|
|
#pragma omp task untied
|
|
|
|
{
|
2018-10-25 23:35:27 +08:00
|
|
|
#pragma omp critical
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
a = 1;
|
|
|
|
}
|
2015-08-31 15:32:19 +08:00
|
|
|
#pragma omp task untied depend(out : s[0], arr[4:][b])
|
[OPENMP] Codegen for 'depend' clause (OpenMP 4.0).
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
2015-06-24 19:01:36 +08:00
|
|
|
{
|
|
|
|
a = 1;
|
|
|
|
}
|
2019-02-04 15:33:19 +08:00
|
|
|
#pragma omp task untied depend(mutexinoutset: s[0], arr[4:][b])
|
|
|
|
{
|
|
|
|
a = 1;
|
|
|
|
}
|
2015-08-31 15:32:19 +08:00
|
|
|
#pragma omp task final(true) depend(inout: a, s[1], arr[:a][3:])
|
[OPENMP] Codegen for 'depend' clause (OpenMP 4.0).
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
2015-06-24 19:01:36 +08:00
|
|
|
{
|
|
|
|
a = 2;
|
|
|
|
}
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
#pragma omp task final(true)
|
|
|
|
{
|
|
|
|
a = 2;
|
|
|
|
}
|
|
|
|
const bool flag = false;
|
|
|
|
#pragma omp task final(flag)
|
|
|
|
{
|
|
|
|
a = 3;
|
|
|
|
}
|
2015-09-11 18:29:41 +08:00
|
|
|
int c __attribute__((aligned(128)));
|
|
|
|
#pragma omp task final(b) shared(c)
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
{
|
|
|
|
a = 4;
|
2015-09-11 18:29:41 +08:00
|
|
|
c = 5;
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
}
|
2020-09-15 23:21:47 +08:00
|
|
|
#pragma omp task untied firstprivate(c) allocate(omp_pteam_mem_alloc:c)
|
2016-04-20 12:01:36 +08:00
|
|
|
{
|
2020-09-15 23:21:47 +08:00
|
|
|
S s1, s2;
|
|
|
|
#ifdef UNTIEDRT
|
|
|
|
#pragma omp allocate(s2) allocator(omp_pteam_mem_alloc)
|
|
|
|
#endif
|
|
|
|
s2.a = 0;
|
2016-04-20 12:01:36 +08:00
|
|
|
#pragma omp task
|
2020-09-15 23:21:47 +08:00
|
|
|
a = c = 4;
|
2016-04-20 12:01:36 +08:00
|
|
|
#pragma omp taskyield
|
|
|
|
s1 = S();
|
2020-09-15 23:21:47 +08:00
|
|
|
s2.a = 10;
|
2016-04-20 12:01:36 +08:00
|
|
|
#pragma omp taskwait
|
|
|
|
}
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
return a;
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2016-04-20 12:01:36 +08:00
|
|
|
|
|
|
|
|
2020-08-12 22:54:36 +08:00
|
|
|
|
|
|
|
|
2016-04-20 12:01:36 +08:00
|
|
|
|
2020-09-15 23:21:47 +08:00
|
|
|
|
2016-04-20 12:01:36 +08:00
|
|
|
|
2020-08-12 22:54:36 +08:00
|
|
|
|
|
|
|
// s1 = S();
|
2016-04-20 12:01:36 +08:00
|
|
|
|
2020-08-12 22:54:36 +08:00
|
|
|
|
|
|
|
|
|
|
|
|
2018-10-25 02:53:12 +08:00
|
|
|
|
|
|
|
struct S1 {
|
|
|
|
int a;
|
|
|
|
S1() { taskinit(); }
|
|
|
|
void taskinit() {
|
|
|
|
#pragma omp task
|
|
|
|
a = 0;
|
|
|
|
}
|
|
|
|
} s1;
|
|
|
|
|
|
|
|
|
2020-09-25 02:46:06 +08:00
|
|
|
#ifdef UNTIEDRT
|
|
|
|
// FIXME: There is a buffer overflow in IrBuilder mode.
|
|
|
|
template <typename T = void>
|
|
|
|
void foobar() {
|
|
|
|
float a;
|
|
|
|
#pragma omp parallel
|
|
|
|
#pragma omp single
|
|
|
|
{
|
|
|
|
double b;
|
|
|
|
#pragma omp task
|
|
|
|
a += b;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void xxxx() {
|
|
|
|
foobar();
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Copy `a` to the list of shared variables
|
|
|
|
|
|
|
|
// Allocate task.
|
|
|
|
// Copy shared vars.
|
|
|
|
|
|
|
|
// Copy firstprivate value of `b`.
|
|
|
|
#endif // UNTIEDRT
|
[OPENMP] Initial codegen for 'omp task' directive.
The task region is emmitted in several steps:
Emit a call to kmp_task_t *__kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry).
Here task_entry is a pointer to the function:
kmp_int32 .omp_task_entry.(kmp_int32 gtid, kmp_task_t *tt) {
TaskFunction(gtid, tt->part_id, tt->shareds);
return 0;
}
Copy a list of shared variables to field shareds of the resulting structure kmp_task_t returned by the previous call (if any).
Copy a pointer to destructions function to field destructions of the resulting structure kmp_task_t.
Emit a call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task), where new_task is a resulting structure from previous items.
Differential Revision: http://reviews.llvm.org/D7560
llvm-svn: 231762
2015-03-10 15:28:44 +08:00
|
|
|
#endif
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@main
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-SAME: () #[[ATTR0:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[B:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK1-NEXT: [[S:%.*]] = alloca [2 x %struct.S], align 4
|
|
|
|
// CHECK1-NEXT: [[SAVED_STACK:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__VLA_EXPR0:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON:%.*]], align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED1:%.*]] = alloca [[STRUCT_ANON_0:%.*]], align 8
|
|
|
|
// CHECK1-NEXT: [[DOTDEP_ARR_ADDR:%.*]] = alloca [4 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK1-NEXT: [[DEP_COUNTER_ADDR:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED3:%.*]] = alloca [[STRUCT_ANON_2:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED4:%.*]] = alloca [[STRUCT_ANON_4:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[DOTDEP_ARR_ADDR5:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK1-NEXT: [[DEP_COUNTER_ADDR11:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED12:%.*]] = alloca [[STRUCT_ANON_6:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[DOTDEP_ARR_ADDR13:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK1-NEXT: [[DEP_COUNTER_ADDR19:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED20:%.*]] = alloca [[STRUCT_ANON_8:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[DOTDEP_ARR_ADDR21:%.*]] = alloca [3 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK1-NEXT: [[DEP_COUNTER_ADDR27:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED28:%.*]] = alloca [[STRUCT_ANON_10:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[FLAG:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED29:%.*]] = alloca [[STRUCT_ANON_12:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 128
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED30:%.*]] = alloca [[STRUCT_ANON_14:%.*]], align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED31:%.*]] = alloca [[STRUCT_ANON_16:%.*]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1:[0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK1-NEXT: [[ARRAY_BEGIN:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[ARRAYCTOR_END:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAY_BEGIN]], i64 2
|
|
|
|
// CHECK1-NEXT: br label [[ARRAYCTOR_LOOP:%.*]]
|
|
|
|
// CHECK1: arrayctor.loop:
|
|
|
|
// CHECK1-NEXT: [[ARRAYCTOR_CUR:%.*]] = phi %struct.S* [ [[ARRAY_BEGIN]], [[ENTRY:%.*]] ], [ [[ARRAYCTOR_NEXT:%.*]], [[ARRAYCTOR_LOOP]] ]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[ARRAYCTOR_CUR]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[ARRAYCTOR_NEXT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYCTOR_CUR]], i64 1
|
|
|
|
// CHECK1-NEXT: [[ARRAYCTOR_DONE:%.*]] = icmp eq %struct.S* [[ARRAYCTOR_NEXT]], [[ARRAYCTOR_END]]
|
|
|
|
// CHECK1-NEXT: br i1 [[ARRAYCTOR_DONE]], label [[ARRAYCTOR_CONT:%.*]], label [[ARRAYCTOR_LOOP]]
|
|
|
|
// CHECK1: arrayctor.cont:
|
|
|
|
// CHECK1-NEXT: [[TMP1:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = call i8* @llvm.stacksave()
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP3]], i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = mul nuw i64 10, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[VLA:%.*]] = alloca i32, i64 [[TMP4]], align 16
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP2]], i64* [[__VLA_EXPR0]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i8* [[B]], i8** [[TMP5]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK1-NEXT: [[CONV:%.*]] = sext i8 [[TMP7]] to i32
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 33, i64 40, i64 16, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast i8* [[TMP8]] to %struct.kmp_task_t_with_privates*
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP9]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load i8*, i8** [[TMP11]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = bitcast %struct.anon* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK1-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP12]], i8* align 8 [[TMP13]], i64 16, i1 false)
|
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP10]], i32 0, i32 4
|
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = bitcast %union.kmp_cmplrdata_t* [[TMP14]] to i32*
|
|
|
|
// CHECK1-NEXT: store i32 [[CONV]], i32* [[TMP15]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP8]])
|
|
|
|
// CHECK1-NEXT: [[TMP17:%.*]] = getelementptr inbounds [[STRUCT_ANON_0]], %struct.anon.0* [[AGG_CAPTURED1]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP17]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP18:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP19:%.*]] = bitcast i8* [[TMP18]] to %struct.kmp_task_t_with_privates.1*
|
|
|
|
// CHECK1-NEXT: [[TMP20:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP21:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP20]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP22:%.*]] = load i8*, i8** [[TMP21]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP23:%.*]] = bitcast %struct.anon.0* [[AGG_CAPTURED1]] to i8*
|
|
|
|
// CHECK1-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP22]], i8* align 8 [[TMP23]], i64 8, i1 false)
|
|
|
|
// CHECK1-NEXT: [[TMP24:%.*]] = getelementptr inbounds [4 x %struct.kmp_depend_info], [4 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP25:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO:%.*]], %struct.kmp_depend_info* [[TMP24]], i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP26:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP25]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP26]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP27:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP25]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 4, i64* [[TMP27]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP28:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP25]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 1, i8* [[TMP28]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP29:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i64 1
|
|
|
|
// CHECK1-NEXT: [[TMP30:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP29]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP31:%.*]] = ptrtoint i8* [[B]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP31]], i64* [[TMP30]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP32:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP29]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 1, i64* [[TMP32]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP33:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP29]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 1, i8* [[TMP33]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP34:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i64 2
|
|
|
|
// CHECK1-NEXT: [[TMP35:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP34]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP36:%.*]] = ptrtoint [2 x %struct.S]* [[S]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP36]], i64* [[TMP35]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP37:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP34]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 8, i64* [[TMP37]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP38:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP34]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 1, i8* [[TMP38]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP39:%.*]] = mul nsw i64 0, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP39]]
|
|
|
|
// CHECK1-NEXT: [[TMP40:%.*]] = mul nsw i64 9, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP40]]
|
|
|
|
// CHECK1-NEXT: [[TMP41:%.*]] = getelementptr i32, i32* [[ARRAYIDX2]], i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP42:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP43:%.*]] = ptrtoint i32* [[TMP41]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP44:%.*]] = sub nuw i64 [[TMP43]], [[TMP42]]
|
|
|
|
// CHECK1-NEXT: [[TMP45:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i64 3
|
|
|
|
// CHECK1-NEXT: [[TMP46:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP45]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP47:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP47]], i64* [[TMP46]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP48:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP45]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP44]], i64* [[TMP48]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP49:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP45]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 1, i8* [[TMP49]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i64 4, i64* [[DEP_COUNTER_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP50:%.*]] = bitcast %struct.kmp_depend_info* [[TMP24]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP51:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP18]], i32 4, i8* [[TMP50]], i32 0, i8* null)
|
|
|
|
// CHECK1-NEXT: [[TMP52:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.3*)* @.omp_task_entry..4 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP53:%.*]] = bitcast i8* [[TMP52]] to %struct.kmp_task_t_with_privates.3*
|
|
|
|
// CHECK1-NEXT: [[TMP54:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP53]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP55:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP54]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[TMP55]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP56:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP52]])
|
|
|
|
// CHECK1-NEXT: [[TMP57:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.5*)* @.omp_task_entry..6 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP58:%.*]] = bitcast i8* [[TMP57]] to %struct.kmp_task_t_with_privates.5*
|
|
|
|
// CHECK1-NEXT: [[TMP59:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP58]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP60:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR5]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX6:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP61:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP62:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP61]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP63:%.*]] = ptrtoint %struct.S* [[ARRAYIDX6]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP63]], i64* [[TMP62]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP64:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP61]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 4, i64* [[TMP64]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP65:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP61]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 3, i8* [[TMP65]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP66:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP67:%.*]] = sext i8 [[TMP66]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP68:%.*]] = mul nsw i64 4, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP68]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX8:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX7]], i64 [[TMP67]]
|
|
|
|
// CHECK1-NEXT: [[TMP69:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP70:%.*]] = sext i8 [[TMP69]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP71:%.*]] = mul nsw i64 9, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX9:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP71]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX9]], i64 [[TMP70]]
|
|
|
|
// CHECK1-NEXT: [[TMP72:%.*]] = getelementptr i32, i32* [[ARRAYIDX10]], i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP73:%.*]] = ptrtoint i32* [[ARRAYIDX8]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP74:%.*]] = ptrtoint i32* [[TMP72]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP75:%.*]] = sub nuw i64 [[TMP74]], [[TMP73]]
|
|
|
|
// CHECK1-NEXT: [[TMP76:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i64 1
|
|
|
|
// CHECK1-NEXT: [[TMP77:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP76]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP78:%.*]] = ptrtoint i32* [[ARRAYIDX8]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP78]], i64* [[TMP77]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP79:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP76]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP75]], i64* [[TMP79]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP80:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP76]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 3, i8* [[TMP80]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR11]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP81:%.*]] = bitcast %struct.kmp_depend_info* [[TMP60]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP82:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP59]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[TMP82]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP83:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP57]], i32 2, i8* [[TMP81]], i32 0, i8* null)
|
|
|
|
// CHECK1-NEXT: [[TMP84:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.7*)* @.omp_task_entry..8 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP85:%.*]] = bitcast i8* [[TMP84]] to %struct.kmp_task_t_with_privates.7*
|
|
|
|
// CHECK1-NEXT: [[TMP86:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP85]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP87:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR13]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX14:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP88:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP89:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP88]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP90:%.*]] = ptrtoint %struct.S* [[ARRAYIDX14]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP90]], i64* [[TMP89]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP91:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP88]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 4, i64* [[TMP91]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP92:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP88]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 4, i8* [[TMP92]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP93:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP94:%.*]] = sext i8 [[TMP93]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP95:%.*]] = mul nsw i64 4, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP95]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX16:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX15]], i64 [[TMP94]]
|
|
|
|
// CHECK1-NEXT: [[TMP96:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP97:%.*]] = sext i8 [[TMP96]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP98:%.*]] = mul nsw i64 9, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX17:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP98]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX18:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX17]], i64 [[TMP97]]
|
|
|
|
// CHECK1-NEXT: [[TMP99:%.*]] = getelementptr i32, i32* [[ARRAYIDX18]], i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP100:%.*]] = ptrtoint i32* [[ARRAYIDX16]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP101:%.*]] = ptrtoint i32* [[TMP99]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP102:%.*]] = sub nuw i64 [[TMP101]], [[TMP100]]
|
|
|
|
// CHECK1-NEXT: [[TMP103:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i64 1
|
|
|
|
// CHECK1-NEXT: [[TMP104:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP103]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP105:%.*]] = ptrtoint i32* [[ARRAYIDX16]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP105]], i64* [[TMP104]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP106:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP103]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP102]], i64* [[TMP106]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP107:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP103]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 4, i8* [[TMP107]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR19]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP108:%.*]] = bitcast %struct.kmp_depend_info* [[TMP87]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP109:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP86]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[TMP109]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP110:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP84]], i32 2, i8* [[TMP108]], i32 0, i8* null)
|
|
|
|
// CHECK1-NEXT: [[TMP111:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.9*)* @.omp_task_entry..10 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP112:%.*]] = bitcast i8* [[TMP111]] to %struct.kmp_task_t_with_privates.9*
|
|
|
|
// CHECK1-NEXT: [[TMP113:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP112]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP114:%.*]] = getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR21]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP115:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i64 0
|
|
|
|
// CHECK1-NEXT: [[TMP116:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP115]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP116]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP117:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP115]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 4, i64* [[TMP117]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP118:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP115]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 3, i8* [[TMP118]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[ARRAYIDX22:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 1
|
|
|
|
// CHECK1-NEXT: [[TMP119:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i64 1
|
|
|
|
// CHECK1-NEXT: [[TMP120:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP119]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP121:%.*]] = ptrtoint %struct.S* [[ARRAYIDX22]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP121]], i64* [[TMP120]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP122:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP119]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 4, i64* [[TMP122]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP123:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP119]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 3, i8* [[TMP123]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP124:%.*]] = mul nsw i64 0, [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX23:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP124]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX24:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX23]], i64 3
|
|
|
|
// CHECK1-NEXT: [[TMP125:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[TMP126:%.*]] = sext i32 [[TMP125]] to i64
|
|
|
|
// CHECK1-NEXT: [[LEN_SUB_1:%.*]] = sub nsw i64 [[TMP126]], 1
|
|
|
|
// CHECK1-NEXT: [[TMP127:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[TMP128:%.*]] = sext i32 [[TMP127]] to i64
|
|
|
|
// CHECK1-NEXT: [[LB_ADD_LEN:%.*]] = add nsw i64 -1, [[TMP128]]
|
|
|
|
// CHECK1-NEXT: [[TMP129:%.*]] = mul nsw i64 [[LB_ADD_LEN]], [[TMP2]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX25:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP129]]
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX26:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX25]], i64 [[LEN_SUB_1]]
|
|
|
|
// CHECK1-NEXT: [[TMP130:%.*]] = getelementptr i32, i32* [[ARRAYIDX26]], i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP131:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP132:%.*]] = ptrtoint i32* [[TMP130]] to i64
|
|
|
|
// CHECK1-NEXT: [[TMP133:%.*]] = sub nuw i64 [[TMP132]], [[TMP131]]
|
|
|
|
// CHECK1-NEXT: [[TMP134:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i64 2
|
|
|
|
// CHECK1-NEXT: [[TMP135:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP134]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP136:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP136]], i64* [[TMP135]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP137:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP134]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: store i64 [[TMP133]], i64* [[TMP137]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP138:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP134]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK1-NEXT: store i8 3, i8* [[TMP138]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i64 3, i64* [[DEP_COUNTER_ADDR27]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP139:%.*]] = bitcast %struct.kmp_depend_info* [[TMP114]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP140:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP111]], i32 3, i8* [[TMP139]], i32 0, i8* null)
|
|
|
|
// CHECK1-NEXT: [[TMP141:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.11*)* @.omp_task_entry..12 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP142:%.*]] = bitcast i8* [[TMP141]] to %struct.kmp_task_t_with_privates.11*
|
|
|
|
// CHECK1-NEXT: [[TMP143:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP142]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP144:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP141]])
|
|
|
|
// CHECK1-NEXT: store i8 0, i8* [[FLAG]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP145:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 1, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.13*)* @.omp_task_entry..14 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP146:%.*]] = bitcast i8* [[TMP145]] to %struct.kmp_task_t_with_privates.13*
|
|
|
|
// CHECK1-NEXT: [[TMP147:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP146]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP148:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP145]])
|
|
|
|
// CHECK1-NEXT: [[TMP149:%.*]] = getelementptr inbounds [[STRUCT_ANON_14]], %struct.anon.14* [[AGG_CAPTURED30]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32* [[C]], i32** [[TMP149]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP150:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK1-NEXT: [[TOBOOL:%.*]] = icmp ne i8 [[TMP150]], 0
|
|
|
|
// CHECK1-NEXT: [[TMP151:%.*]] = select i1 [[TOBOOL]], i32 2, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP152:%.*]] = or i32 [[TMP151]], 1
|
|
|
|
// CHECK1-NEXT: [[TMP153:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 [[TMP152]], i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.15*)* @.omp_task_entry..16 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP154:%.*]] = bitcast i8* [[TMP153]] to %struct.kmp_task_t_with_privates.15*
|
|
|
|
// CHECK1-NEXT: [[TMP155:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP154]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP156:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP155]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP157:%.*]] = load i8*, i8** [[TMP156]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP158:%.*]] = bitcast %struct.anon.14* [[AGG_CAPTURED30]] to i8*
|
|
|
|
// CHECK1-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP157]], i8* align 8 [[TMP158]], i64 8, i1 false)
|
|
|
|
// CHECK1-NEXT: [[TMP159:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP153]])
|
|
|
|
// CHECK1-NEXT: [[TMP160:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.19*)* @.omp_task_entry..21 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP161:%.*]] = bitcast i8* [[TMP160]] to %struct.kmp_task_t_with_privates.19*
|
|
|
|
// CHECK1-NEXT: [[TMP162:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP161]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP163:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP161]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP164:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP163]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP165:%.*]] = load i32, i32* [[C]], align 128
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP165]], i32* [[TMP164]], align 128
|
|
|
|
// CHECK1-NEXT: [[TMP166:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP162]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[TMP166]], align 16
|
|
|
|
// CHECK1-NEXT: [[TMP167:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP160]])
|
|
|
|
// CHECK1-NEXT: [[TMP168:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP168]], i32* [[RETVAL]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP169:%.*]] = load i8*, i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK1-NEXT: call void @llvm.stackrestore(i8* [[TMP169]])
|
|
|
|
// CHECK1-NEXT: [[ARRAY_BEGIN32:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP170:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAY_BEGIN32]], i64 2
|
|
|
|
// CHECK1-NEXT: br label [[ARRAYDESTROY_BODY:%.*]]
|
|
|
|
// CHECK1: arraydestroy.body:
|
|
|
|
// CHECK1-NEXT: [[ARRAYDESTROY_ELEMENTPAST:%.*]] = phi %struct.S* [ [[TMP170]], [[ARRAYCTOR_CONT]] ], [ [[ARRAYDESTROY_ELEMENT:%.*]], [[ARRAYDESTROY_BODY]] ]
|
|
|
|
// CHECK1-NEXT: [[ARRAYDESTROY_ELEMENT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYDESTROY_ELEMENTPAST]], i64 -1
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[ARRAYDESTROY_ELEMENT]]) #[[ATTR4:[0-9]+]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[ARRAYDESTROY_DONE:%.*]] = icmp eq %struct.S* [[ARRAYDESTROY_ELEMENT]], [[ARRAY_BEGIN32]]
|
|
|
|
// CHECK1-NEXT: br i1 [[ARRAYDESTROY_DONE]], label [[ARRAYDESTROY_DONE33:%.*]], label [[ARRAYDESTROY_BODY]]
|
|
|
|
// CHECK1: arraydestroy.done33:
|
|
|
|
// CHECK1-NEXT: [[TMP171:%.*]] = load i32, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK1-NEXT: ret i32 [[TMP171]]
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN1SC1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SC2Ev(%struct.S* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates* noalias noundef [[TMP1:%.*]]) #[[ATTR3:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates* [[TMP1]], %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates*, %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META3:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META6:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META8:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META10:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !12
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK1-NEXT: store %struct.anon* [[TMP8]], %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon*, %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[CONV_I:%.*]] = trunc i32 [[TMP11]] to i8
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = getelementptr inbounds [[STRUCT_ANON:%.*]], %struct.anon* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load i8*, i8** [[TMP12]], align 8
|
|
|
|
// CHECK1-NEXT: store i8 [[CONV_I]], i8* [[TMP13]], align 1
|
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[TMP10]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP14]], align 8
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP15]], i64 0, i64 0
|
|
|
|
// CHECK1-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..2
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.1* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.0*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.1*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.1* [[TMP1]], %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.1*, %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.0*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.1* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META13:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META16:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META18:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META20:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !22
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK1-NEXT: store %struct.anon.0* [[TMP8]], %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.0*, %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_0:%.*]], %struct.anon.0* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP11]], align 8
|
|
|
|
// CHECK1-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP12]], i64 0, i64 1
|
|
|
|
// CHECK1-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..4
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.3* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.2*, align 8
|
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.3*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.3* [[TMP1]], %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.3*, %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.2*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.3* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META23:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META26:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META28:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META30:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK1-NEXT: store %struct.anon.2* [[TMP8]], %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.2*, %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK1-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK1-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK1-NEXT: ]
|
|
|
|
// CHECK1: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK1: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[TMP13]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP14]], i8* [[TMP15]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT:%.*]]
|
|
|
|
// CHECK1: .untied.jmp.1.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP17:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void @__kmpc_critical(%struct.ident_t* @[[GLOB1]], i32 [[TMP17]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* @a, align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void @__kmpc_end_critical(%struct.ident_t* @[[GLOB1]], i32 [[TMP17]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK1: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT]]
|
|
|
|
// CHECK1: .omp_outlined..3.exit:
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..6
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.5* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.4*, align 8
|
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.5*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.5* [[TMP1]], %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.5*, %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.4*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.5* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META33:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META36:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META38:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META40:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !42
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK1-NEXT: store %struct.anon.4* [[TMP8]], %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.4*, %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK1-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK1-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK1-NEXT: ]
|
|
|
|
// CHECK1: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK1: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[TMP13]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !42
|
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP14]], i8* [[TMP15]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT:%.*]]
|
|
|
|
// CHECK1: .untied.jmp.1.i:
|
|
|
|
// CHECK1-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK1: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT]]
|
|
|
|
// CHECK1: .omp_outlined..5.exit:
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.7* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.6*, align 8
|
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.7*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.7* [[TMP1]], %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.7*, %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.6*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.7* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META43:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META46:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META48:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META50:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !52
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK1-NEXT: store %struct.anon.6* [[TMP8]], %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.6*, %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK1-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK1-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK1-NEXT: ]
|
|
|
|
// CHECK1: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK1: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[TMP13]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !52
|
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP14]], i8* [[TMP15]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT:%.*]]
|
|
|
|
// CHECK1: .untied.jmp.1.i:
|
|
|
|
// CHECK1-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK1: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT]]
|
|
|
|
// CHECK1: .omp_outlined..7.exit:
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..10
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.9* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.9*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.9* [[TMP1]], %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.9*, %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.8*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.9* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META53:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META56:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META58:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META60:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !62
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK1-NEXT: store %struct.anon.8* [[TMP8]], %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.8*, %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..12
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.11* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.10*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.11*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.11* [[TMP1]], %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.11*, %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.10*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.11* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META63:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META66:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META68:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META70:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !72
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK1-NEXT: store %struct.anon.10* [[TMP8]], %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.10*, %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..14
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.13* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.12*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.13*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.13* [[TMP1]], %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.13*, %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.12*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.13* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META73:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META76:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META78:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META80:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !82
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK1-NEXT: store %struct.anon.12* [[TMP8]], %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.12*, %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 3, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..16
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.15* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.14*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.15*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.15* [[TMP1]], %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.15*, %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.14*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.15* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META83:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META86:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META88:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META90:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !92
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK1-NEXT: store %struct.anon.14* [[TMP8]], %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.14*, %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_14:%.*]], %struct.anon.14* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load i32*, i32** [[TMP11]], align 8
|
|
|
|
// CHECK1-NEXT: store i32 5, i32* [[TMP12]], align 128
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_privates_map.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct..kmp_privates.t* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]]) #[[ATTR7:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK1-NEXT: store %struct..kmp_privates.t* [[TMP0]], %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK1-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load %struct..kmp_privates.t*, %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP2]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP3]], i32** [[TMP4]], align 8
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..19
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.18* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.17*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.18*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.18* [[TMP1]], %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.18*, %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.17*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t* [[TMP9]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.18* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META93:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META96:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META98:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META100:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !102
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t*, i32**)* @.omp_task_privates_map. to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: store %struct.anon.17* [[TMP8]], %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load %struct.anon.17*, %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 4, i32* [[TMP16]], align 128
|
|
|
|
// CHECK1-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN1SD1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SD2Ev(%struct.S* noundef [[THIS1]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_privates_map..20
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct..kmp_privates.t.20* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]], %struct.S** noalias noundef [[TMP2:%.*]], %struct.S*** noalias noundef [[TMP3:%.*]]) #[[ATTR7]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t.20*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR2:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR3:%.*]] = alloca %struct.S***, align 8
|
|
|
|
// CHECK1-NEXT: store %struct..kmp_privates.t.20* [[TMP0]], %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK1-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S** [[TMP2]], %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S*** [[TMP3]], %struct.S**** [[DOTADDR3]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = load %struct..kmp_privates.t.20*, %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = load %struct.S***, %struct.S**** [[DOTADDR3]], align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S** [[TMP7]], %struct.S*** [[TMP8]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 3
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.S**, %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S* [[TMP9]], %struct.S** [[TMP10]], align 8
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..21
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.19* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.16*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTLOCAL_PTR_ADDR_I:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTLOCAL_PTR_ADDR1_I:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[REF_TMP_I:%.*]] = alloca [[STRUCT_S:%.*]], align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.19*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.19* [[TMP1]], %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.19*, %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.16*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t.20* [[TMP9]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.19* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META103:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META106:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META108:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META110:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t.20*, i32**, %struct.S**, %struct.S***)* @.omp_task_privates_map..20 to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: store %struct.anon.16* [[TMP8]], %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load %struct.anon.16*, %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**, %struct.S**, %struct.S***)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], %struct.S** [[DOTLOCAL_PTR_ADDR_I]], %struct.S*** [[DOTLOCAL_PTR_ADDR1_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP17:%.*]] = load %struct.S*, %struct.S** [[DOTLOCAL_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP18:%.*]] = load %struct.S**, %struct.S*** [[DOTLOCAL_PTR_ADDR1_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP19:%.*]] = load %struct.S*, %struct.S** [[TMP18]], align 8
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP20:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP21:%.*]] = load i32, i32* [[TMP20]], align 4
|
|
|
|
// CHECK1-NEXT: switch i32 [[TMP21]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK1-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 1, label [[DOTUNTIED_JMP_2_I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 2, label [[DOTUNTIED_JMP_3_I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 3, label [[DOTUNTIED_JMP_5_I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 4, label [[DOTUNTIED_JMP_7_I:%.*]]
|
|
|
|
// CHECK1-NEXT: i32 5, label [[DOTUNTIED_JMP_10_I:%.*]]
|
|
|
|
// CHECK1-NEXT: ]
|
|
|
|
// CHECK1: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK1: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP22:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 1, i32* [[TMP22]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP23:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP24:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP25:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP23]], i8* [[TMP24]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT:%.*]]
|
|
|
|
// CHECK1: .untied.jmp.2.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[TMP17]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP26:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[DOTS2__VOID_ADDR_I:%.*]] = call i8* @__kmpc_alloc(i32 [[TMP26]], i64 4, i8* inttoptr (i64 7 to i8*)) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[DOTS2__ADDR_I:%.*]] = bitcast i8* [[DOTS2__VOID_ADDR_I]] to %struct.S*
|
|
|
|
// CHECK1-NEXT: store %struct.S* [[DOTS2__ADDR_I]], %struct.S** [[TMP18]], align 8
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP27:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 2, i32* [[TMP27]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP28:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP29:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP30:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP28]], i8* [[TMP29]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK1: .untied.jmp.3.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[TMP19]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[A_I]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP31:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP32:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP31]], i32 1, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.18*)* @.omp_task_entry..19 to i32 (i32, i8*)*)) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP33:%.*]] = bitcast i8* [[TMP32]] to %struct.kmp_task_t_with_privates.18*
|
|
|
|
// CHECK1-NEXT: [[TMP34:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP33]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP35:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP33]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP36:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP35]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP37:%.*]] = load i32, i32* [[TMP16]], align 128
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP37]], i32* [[TMP36]], align 128
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP38:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP39:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP38]], i8* [[TMP32]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP40:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 3, i32* [[TMP40]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP41:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP42:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP43:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP41]], i8* [[TMP42]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK1: .untied.jmp.5.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP44:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP45:%.*]] = call i32 @__kmpc_omp_taskyield(%struct.ident_t* @[[GLOB1]], i32 [[TMP44]], i32 0) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP46:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 4, i32* [[TMP46]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP47:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP48:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP49:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP47]], i8* [[TMP48]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK1: .untied.jmp.7.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP50:%.*]] = bitcast %struct.S* [[TMP17]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP51:%.*]] = bitcast %struct.S* [[REF_TMP_I]] to i8*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[TMP50]], i8* align 4 [[TMP51]], i64 4, i1 false) #[[ATTR4]]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[A9_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32 10, i32* [[A9_I]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP52:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP53:%.*]] = call i32 @__kmpc_omp_taskwait(%struct.ident_t* @[[GLOB1]], i32 [[TMP52]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP54:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 5, i32* [[TMP54]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP55:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK1-NEXT: [[TMP56:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: [[TMP57:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP55]], i8* [[TMP56]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK1: .untied.jmp.10.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[TMP19]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP58:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP59:%.*]] = bitcast %struct.S* [[TMP19]] to i8*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void @__kmpc_free(i32 [[TMP58]], i8* [[TMP59]], i8* inttoptr (i64 7 to i8*)) #[[ATTR4]]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[TMP17]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK1: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK1: .omp_outlined..17.exit:
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN1SC2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[A:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[THIS1]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[A]], align 4
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN1SD2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@__cxx_global_var_init
|
|
|
|
// CHECK1-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
|
|
|
// CHECK1-NEXT: entry:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN2S1C1Ev(%struct.S1* noundef @s1)
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN2S1C1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN2S1C2Ev(%struct.S1* noundef [[THIS1]])
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN2S1C2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK1-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-NEXT: call void @_ZN2S18taskinitEv(%struct.S1* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_ZN2S18taskinitEv
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct.S1* noundef [[THIS:%.*]]) #[[ATTR8:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON_21:%.*]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]])
|
|
|
|
// CHECK1-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP1:%.*]] = getelementptr inbounds [[STRUCT_ANON_21]], %struct.anon.21* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store %struct.S1* [[THIS1]], %struct.S1** [[TMP1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.22*)* @.omp_task_entry..23 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = bitcast i8* [[TMP2]] to %struct.kmp_task_t_with_privates.22*
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = load i8*, i8** [[TMP5]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = bitcast %struct.anon.21* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK1-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP6]], i8* align 8 [[TMP7]], i64 8, i1 false)
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP2]])
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..23
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.22* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.21*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.22*, align 8
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.22* [[TMP1]], %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.22*, %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.21*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.22* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META113:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META116:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META118:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META120:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !122
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK1-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK1-NEXT: store %struct.anon.21* [[TMP8]], %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load %struct.anon.21*, %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_21:%.*]], %struct.anon.21* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load %struct.S1*, %struct.S1** [[TMP11]], align 8
|
|
|
|
// CHECK1-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S1:%.*]], %struct.S1* [[TMP12]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store i32 0, i32* [[A_I]], align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_Z4xxxxv
|
|
|
|
// CHECK1-SAME: () #[[ATTR8]] {
|
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: call void @_Z6foobarIvEvv()
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_Z6foobarIvEvv
|
|
|
|
// CHECK1-SAME: () #[[ATTR8]] {
|
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[A:%.*]] = alloca float, align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: call void (%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...) @__kmpc_fork_call(%struct.ident_t* @[[GLOB1]], i32 1, void (i32*, i32*, ...)* bitcast (void (i32*, i32*, float*)* @.omp_outlined..24 to void (i32*, i32*, ...)*), float* [[A]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_outlined..24
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32* noalias noundef [[DOTGLOBAL_TID_:%.*]], i32* noalias noundef [[DOTBOUND_TID_:%.*]], float* noundef nonnull align 4 dereferenceable(4) [[A:%.*]]) #[[ATTR9:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTBOUND_TID__ADDR:%.*]] = alloca i32*, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[A_ADDR:%.*]] = alloca float*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[B:%.*]] = alloca double, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON_23:%.*]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32* [[DOTGLOBAL_TID_]], i32** [[DOTGLOBAL_TID__ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: store i32* [[DOTBOUND_TID_]], i32** [[DOTBOUND_TID__ADDR]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: store float* [[A]], float** [[A_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP0:%.*]] = load float*, float** [[A_ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP1:%.*]] = load i32*, i32** [[DOTGLOBAL_TID__ADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[TMP1]], align 4
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = call i32 @__kmpc_single(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]])
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = icmp ne i32 [[TMP3]], 0
|
|
|
|
// CHECK1-NEXT: br i1 [[TMP4]], label [[OMP_IF_THEN:%.*]], label [[OMP_IF_END:%.*]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1: omp_if.then:
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_ANON_23]], %struct.anon.23* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: store float* [[TMP0]], float** [[TMP5]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]], i32 1, i64 48, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.24*)* @.omp_task_entry..27 to i32 (i32, i8*)*))
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = bitcast i8* [[TMP6]] to %struct.kmp_task_t_with_privates.24*
|
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24:%.*]], %struct.kmp_task_t_with_privates.24* [[TMP7]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP8]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = load i8*, i8** [[TMP9]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = bitcast %struct.anon.23* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK1-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP10]], i8* align 8 [[TMP11]], i64 8, i1 false)
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24]], %struct.kmp_task_t_with_privates.24* [[TMP7]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_25:%.*]], %struct..kmp_privates.t.25* [[TMP12]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load double, double* [[B]], align 8
|
|
|
|
// CHECK1-NEXT: store double [[TMP14]], double* [[TMP13]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]], i8* [[TMP6]])
|
|
|
|
// CHECK1-NEXT: call void @__kmpc_end_single(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: br label [[OMP_IF_END]]
|
|
|
|
// CHECK1: omp_if.end:
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: call void @__kmpc_barrier(%struct.ident_t* @[[GLOB2:[0-9]+]], i32 [[TMP2]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_privates_map..26
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (%struct..kmp_privates.t.25* noalias noundef [[TMP0:%.*]], double** noalias noundef [[TMP1:%.*]]) #[[ATTR7]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t.25*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca double**, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: store %struct..kmp_privates.t.25* [[TMP0]], %struct..kmp_privates.t.25** [[DOTADDR]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store double** [[TMP1]], double*** [[DOTADDR1]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load %struct..kmp_privates.t.25*, %struct..kmp_privates.t.25** [[DOTADDR]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_25:%.*]], %struct..kmp_privates.t.25* [[TMP2]], i32 0, i32 0
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = load double**, double*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: store double* [[TMP3]], double** [[TMP4]], align 8
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@.omp_task_entry..27
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK1-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.24* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK1-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.23*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca double*, align 8
|
|
|
|
// CHECK1-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.24*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: store %struct.kmp_task_t_with_privates.24* [[TMP1]], %struct.kmp_task_t_with_privates.24** [[DOTADDR1]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.24*, %struct.kmp_task_t_with_privates.24** [[DOTADDR1]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24:%.*]], %struct.kmp_task_t_with_privates.24* [[TMP3]], i32 0, i32 0
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK1-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK1-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.23*
|
|
|
|
// CHECK1-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24]], %struct.kmp_task_t_with_privates.24* [[TMP3]], i32 0, i32 1
|
|
|
|
// CHECK1-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t.25* [[TMP9]] to i8*
|
|
|
|
// CHECK1-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.24* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META123:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META126:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META128:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META130:![0-9]+]])
|
|
|
|
// CHECK1-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !132
|
|
|
|
// CHECK1-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !132
|
|
|
|
// CHECK1-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !132
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t.25*, double**)* @.omp_task_privates_map..26 to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !132
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !132
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: store %struct.anon.23* [[TMP8]], %struct.anon.23** [[__CONTEXT_ADDR_I]], align 8, !noalias !132
|
|
|
|
// CHECK1-NEXT: [[TMP12:%.*]] = load %struct.anon.23*, %struct.anon.23** [[__CONTEXT_ADDR_I]], align 8, !noalias !132
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !132
|
|
|
|
// CHECK1-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !132
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, double**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-NEXT: call void [[TMP15]](i8* [[TMP14]], double** [[DOTFIRSTPRIV_PTR_ADDR_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK1-NEXT: [[TMP16:%.*]] = load double*, double** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !132
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP17:%.*]] = load double, double* [[TMP16]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK1-NEXT: [[TMP18:%.*]] = getelementptr inbounds [[STRUCT_ANON_23:%.*]], %struct.anon.23* [[TMP12]], i32 0, i32 0
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: [[TMP19:%.*]] = load float*, float** [[TMP18]], align 8
|
|
|
|
// CHECK1-NEXT: [[TMP20:%.*]] = load float, float* [[TMP19]], align 4
|
|
|
|
// CHECK1-NEXT: [[CONV_I:%.*]] = fpext float [[TMP20]] to double
|
|
|
|
// CHECK1-NEXT: [[ADD_I:%.*]] = fadd double [[CONV_I]], [[TMP17]]
|
|
|
|
// CHECK1-NEXT: [[CONV1_I:%.*]] = fptrunc double [[ADD_I]] to float
|
|
|
|
// CHECK1-NEXT: store float [[CONV1_I]], float* [[TMP19]], align 4
|
|
|
|
// CHECK1-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK1-LABEL: define {{[^@]+}}@_GLOBAL__sub_I_task_codegen.cpp
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK1-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK1-NEXT: entry:
|
|
|
|
// CHECK1-NEXT: call void @__cxx_global_var_init()
|
|
|
|
// CHECK1-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@main
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-SAME: () #[[ATTR0:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[B:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK2-NEXT: [[S:%.*]] = alloca [2 x %struct.S], align 4
|
|
|
|
// CHECK2-NEXT: [[SAVED_STACK:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__VLA_EXPR0:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON:%.*]], align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED1:%.*]] = alloca [[STRUCT_ANON_0:%.*]], align 8
|
|
|
|
// CHECK2-NEXT: [[DOTDEP_ARR_ADDR:%.*]] = alloca [4 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK2-NEXT: [[DEP_COUNTER_ADDR:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED3:%.*]] = alloca [[STRUCT_ANON_2:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED4:%.*]] = alloca [[STRUCT_ANON_4:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[DOTDEP_ARR_ADDR5:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK2-NEXT: [[DEP_COUNTER_ADDR11:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED12:%.*]] = alloca [[STRUCT_ANON_6:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[DOTDEP_ARR_ADDR13:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK2-NEXT: [[DEP_COUNTER_ADDR19:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED20:%.*]] = alloca [[STRUCT_ANON_8:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[DOTDEP_ARR_ADDR21:%.*]] = alloca [3 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK2-NEXT: [[DEP_COUNTER_ADDR27:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED28:%.*]] = alloca [[STRUCT_ANON_10:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[FLAG:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED29:%.*]] = alloca [[STRUCT_ANON_12:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 128
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED30:%.*]] = alloca [[STRUCT_ANON_14:%.*]], align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED31:%.*]] = alloca [[STRUCT_ANON_16:%.*]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1:[0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK2-NEXT: [[ARRAY_BEGIN:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[ARRAYCTOR_END:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAY_BEGIN]], i64 2
|
|
|
|
// CHECK2-NEXT: br label [[ARRAYCTOR_LOOP:%.*]]
|
|
|
|
// CHECK2: arrayctor.loop:
|
|
|
|
// CHECK2-NEXT: [[ARRAYCTOR_CUR:%.*]] = phi %struct.S* [ [[ARRAY_BEGIN]], [[ENTRY:%.*]] ], [ [[ARRAYCTOR_NEXT:%.*]], [[ARRAYCTOR_LOOP]] ]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[ARRAYCTOR_CUR]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[ARRAYCTOR_NEXT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYCTOR_CUR]], i64 1
|
|
|
|
// CHECK2-NEXT: [[ARRAYCTOR_DONE:%.*]] = icmp eq %struct.S* [[ARRAYCTOR_NEXT]], [[ARRAYCTOR_END]]
|
|
|
|
// CHECK2-NEXT: br i1 [[ARRAYCTOR_DONE]], label [[ARRAYCTOR_CONT:%.*]], label [[ARRAYCTOR_LOOP]]
|
|
|
|
// CHECK2: arrayctor.cont:
|
|
|
|
// CHECK2-NEXT: [[TMP1:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = call i8* @llvm.stacksave()
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP3]], i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = mul nuw i64 10, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[VLA:%.*]] = alloca i32, i64 [[TMP4]], align 16
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP2]], i64* [[__VLA_EXPR0]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i8* [[B]], i8** [[TMP5]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK2-NEXT: [[CONV:%.*]] = sext i8 [[TMP7]] to i32
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 33, i64 40, i64 16, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast i8* [[TMP8]] to %struct.kmp_task_t_with_privates*
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP9]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load i8*, i8** [[TMP11]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = bitcast %struct.anon* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK2-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP12]], i8* align 8 [[TMP13]], i64 16, i1 false)
|
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP10]], i32 0, i32 4
|
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = bitcast %union.kmp_cmplrdata_t* [[TMP14]] to i32*
|
|
|
|
// CHECK2-NEXT: store i32 [[CONV]], i32* [[TMP15]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP8]])
|
|
|
|
// CHECK2-NEXT: [[TMP17:%.*]] = getelementptr inbounds [[STRUCT_ANON_0]], %struct.anon.0* [[AGG_CAPTURED1]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP17]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP18:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP19:%.*]] = bitcast i8* [[TMP18]] to %struct.kmp_task_t_with_privates.1*
|
|
|
|
// CHECK2-NEXT: [[TMP20:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP21:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP20]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP22:%.*]] = load i8*, i8** [[TMP21]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP23:%.*]] = bitcast %struct.anon.0* [[AGG_CAPTURED1]] to i8*
|
|
|
|
// CHECK2-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP22]], i8* align 8 [[TMP23]], i64 8, i1 false)
|
|
|
|
// CHECK2-NEXT: [[TMP24:%.*]] = getelementptr inbounds [4 x %struct.kmp_depend_info], [4 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP25:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO:%.*]], %struct.kmp_depend_info* [[TMP24]], i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP26:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP25]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP26]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP27:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP25]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 4, i64* [[TMP27]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP28:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP25]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 1, i8* [[TMP28]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP29:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i64 1
|
|
|
|
// CHECK2-NEXT: [[TMP30:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP29]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP31:%.*]] = ptrtoint i8* [[B]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP31]], i64* [[TMP30]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP32:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP29]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 1, i64* [[TMP32]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP33:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP29]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 1, i8* [[TMP33]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP34:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i64 2
|
|
|
|
// CHECK2-NEXT: [[TMP35:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP34]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP36:%.*]] = ptrtoint [2 x %struct.S]* [[S]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP36]], i64* [[TMP35]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP37:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP34]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 8, i64* [[TMP37]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP38:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP34]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 1, i8* [[TMP38]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP39:%.*]] = mul nsw i64 0, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP39]]
|
|
|
|
// CHECK2-NEXT: [[TMP40:%.*]] = mul nsw i64 9, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP40]]
|
|
|
|
// CHECK2-NEXT: [[TMP41:%.*]] = getelementptr i32, i32* [[ARRAYIDX2]], i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP42:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP43:%.*]] = ptrtoint i32* [[TMP41]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP44:%.*]] = sub nuw i64 [[TMP43]], [[TMP42]]
|
|
|
|
// CHECK2-NEXT: [[TMP45:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i64 3
|
|
|
|
// CHECK2-NEXT: [[TMP46:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP45]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP47:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP47]], i64* [[TMP46]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP48:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP45]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP44]], i64* [[TMP48]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP49:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP45]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 1, i8* [[TMP49]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i64 4, i64* [[DEP_COUNTER_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP50:%.*]] = bitcast %struct.kmp_depend_info* [[TMP24]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP51:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP18]], i32 4, i8* [[TMP50]], i32 0, i8* null)
|
|
|
|
// CHECK2-NEXT: [[TMP52:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.3*)* @.omp_task_entry..4 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP53:%.*]] = bitcast i8* [[TMP52]] to %struct.kmp_task_t_with_privates.3*
|
|
|
|
// CHECK2-NEXT: [[TMP54:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP53]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP55:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP54]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[TMP55]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP56:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP52]])
|
|
|
|
// CHECK2-NEXT: [[TMP57:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.5*)* @.omp_task_entry..6 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP58:%.*]] = bitcast i8* [[TMP57]] to %struct.kmp_task_t_with_privates.5*
|
|
|
|
// CHECK2-NEXT: [[TMP59:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP58]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP60:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR5]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX6:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP61:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP62:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP61]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP63:%.*]] = ptrtoint %struct.S* [[ARRAYIDX6]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP63]], i64* [[TMP62]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP64:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP61]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 4, i64* [[TMP64]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP65:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP61]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 3, i8* [[TMP65]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP66:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP67:%.*]] = sext i8 [[TMP66]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP68:%.*]] = mul nsw i64 4, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP68]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX8:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX7]], i64 [[TMP67]]
|
|
|
|
// CHECK2-NEXT: [[TMP69:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP70:%.*]] = sext i8 [[TMP69]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP71:%.*]] = mul nsw i64 9, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX9:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP71]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX9]], i64 [[TMP70]]
|
|
|
|
// CHECK2-NEXT: [[TMP72:%.*]] = getelementptr i32, i32* [[ARRAYIDX10]], i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP73:%.*]] = ptrtoint i32* [[ARRAYIDX8]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP74:%.*]] = ptrtoint i32* [[TMP72]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP75:%.*]] = sub nuw i64 [[TMP74]], [[TMP73]]
|
|
|
|
// CHECK2-NEXT: [[TMP76:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i64 1
|
|
|
|
// CHECK2-NEXT: [[TMP77:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP76]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP78:%.*]] = ptrtoint i32* [[ARRAYIDX8]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP78]], i64* [[TMP77]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP79:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP76]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP75]], i64* [[TMP79]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP80:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP76]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 3, i8* [[TMP80]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR11]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP81:%.*]] = bitcast %struct.kmp_depend_info* [[TMP60]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP82:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP59]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[TMP82]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP83:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP57]], i32 2, i8* [[TMP81]], i32 0, i8* null)
|
|
|
|
// CHECK2-NEXT: [[TMP84:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.7*)* @.omp_task_entry..8 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP85:%.*]] = bitcast i8* [[TMP84]] to %struct.kmp_task_t_with_privates.7*
|
|
|
|
// CHECK2-NEXT: [[TMP86:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP85]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP87:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR13]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX14:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP88:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP89:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP88]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP90:%.*]] = ptrtoint %struct.S* [[ARRAYIDX14]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP90]], i64* [[TMP89]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP91:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP88]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 4, i64* [[TMP91]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP92:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP88]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 4, i8* [[TMP92]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP93:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP94:%.*]] = sext i8 [[TMP93]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP95:%.*]] = mul nsw i64 4, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP95]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX16:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX15]], i64 [[TMP94]]
|
|
|
|
// CHECK2-NEXT: [[TMP96:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP97:%.*]] = sext i8 [[TMP96]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP98:%.*]] = mul nsw i64 9, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX17:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP98]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX18:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX17]], i64 [[TMP97]]
|
|
|
|
// CHECK2-NEXT: [[TMP99:%.*]] = getelementptr i32, i32* [[ARRAYIDX18]], i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP100:%.*]] = ptrtoint i32* [[ARRAYIDX16]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP101:%.*]] = ptrtoint i32* [[TMP99]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP102:%.*]] = sub nuw i64 [[TMP101]], [[TMP100]]
|
|
|
|
// CHECK2-NEXT: [[TMP103:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i64 1
|
|
|
|
// CHECK2-NEXT: [[TMP104:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP103]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP105:%.*]] = ptrtoint i32* [[ARRAYIDX16]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP105]], i64* [[TMP104]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP106:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP103]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP102]], i64* [[TMP106]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP107:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP103]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 4, i8* [[TMP107]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR19]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP108:%.*]] = bitcast %struct.kmp_depend_info* [[TMP87]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP109:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP86]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[TMP109]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP110:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP84]], i32 2, i8* [[TMP108]], i32 0, i8* null)
|
|
|
|
// CHECK2-NEXT: [[TMP111:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.9*)* @.omp_task_entry..10 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP112:%.*]] = bitcast i8* [[TMP111]] to %struct.kmp_task_t_with_privates.9*
|
|
|
|
// CHECK2-NEXT: [[TMP113:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP112]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP114:%.*]] = getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR21]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP115:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i64 0
|
|
|
|
// CHECK2-NEXT: [[TMP116:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP115]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP116]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP117:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP115]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 4, i64* [[TMP117]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP118:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP115]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 3, i8* [[TMP118]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[ARRAYIDX22:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 1
|
|
|
|
// CHECK2-NEXT: [[TMP119:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i64 1
|
|
|
|
// CHECK2-NEXT: [[TMP120:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP119]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP121:%.*]] = ptrtoint %struct.S* [[ARRAYIDX22]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP121]], i64* [[TMP120]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP122:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP119]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 4, i64* [[TMP122]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP123:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP119]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 3, i8* [[TMP123]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP124:%.*]] = mul nsw i64 0, [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX23:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP124]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX24:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX23]], i64 3
|
|
|
|
// CHECK2-NEXT: [[TMP125:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[TMP126:%.*]] = sext i32 [[TMP125]] to i64
|
|
|
|
// CHECK2-NEXT: [[LEN_SUB_1:%.*]] = sub nsw i64 [[TMP126]], 1
|
|
|
|
// CHECK2-NEXT: [[TMP127:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[TMP128:%.*]] = sext i32 [[TMP127]] to i64
|
|
|
|
// CHECK2-NEXT: [[LB_ADD_LEN:%.*]] = add nsw i64 -1, [[TMP128]]
|
|
|
|
// CHECK2-NEXT: [[TMP129:%.*]] = mul nsw i64 [[LB_ADD_LEN]], [[TMP2]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX25:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP129]]
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX26:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX25]], i64 [[LEN_SUB_1]]
|
|
|
|
// CHECK2-NEXT: [[TMP130:%.*]] = getelementptr i32, i32* [[ARRAYIDX26]], i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP131:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP132:%.*]] = ptrtoint i32* [[TMP130]] to i64
|
|
|
|
// CHECK2-NEXT: [[TMP133:%.*]] = sub nuw i64 [[TMP132]], [[TMP131]]
|
|
|
|
// CHECK2-NEXT: [[TMP134:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i64 2
|
|
|
|
// CHECK2-NEXT: [[TMP135:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP134]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP136:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP136]], i64* [[TMP135]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP137:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP134]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: store i64 [[TMP133]], i64* [[TMP137]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP138:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP134]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK2-NEXT: store i8 3, i8* [[TMP138]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i64 3, i64* [[DEP_COUNTER_ADDR27]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP139:%.*]] = bitcast %struct.kmp_depend_info* [[TMP114]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP140:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP111]], i32 3, i8* [[TMP139]], i32 0, i8* null)
|
|
|
|
// CHECK2-NEXT: [[TMP141:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.11*)* @.omp_task_entry..12 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP142:%.*]] = bitcast i8* [[TMP141]] to %struct.kmp_task_t_with_privates.11*
|
|
|
|
// CHECK2-NEXT: [[TMP143:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP142]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP144:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP141]])
|
|
|
|
// CHECK2-NEXT: store i8 0, i8* [[FLAG]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP145:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 1, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.13*)* @.omp_task_entry..14 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP146:%.*]] = bitcast i8* [[TMP145]] to %struct.kmp_task_t_with_privates.13*
|
|
|
|
// CHECK2-NEXT: [[TMP147:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP146]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP148:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP145]])
|
|
|
|
// CHECK2-NEXT: [[TMP149:%.*]] = getelementptr inbounds [[STRUCT_ANON_14]], %struct.anon.14* [[AGG_CAPTURED30]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32* [[C]], i32** [[TMP149]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP150:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK2-NEXT: [[TOBOOL:%.*]] = icmp ne i8 [[TMP150]], 0
|
|
|
|
// CHECK2-NEXT: [[TMP151:%.*]] = select i1 [[TOBOOL]], i32 2, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP152:%.*]] = or i32 [[TMP151]], 1
|
|
|
|
// CHECK2-NEXT: [[TMP153:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 [[TMP152]], i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.15*)* @.omp_task_entry..16 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP154:%.*]] = bitcast i8* [[TMP153]] to %struct.kmp_task_t_with_privates.15*
|
|
|
|
// CHECK2-NEXT: [[TMP155:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP154]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP156:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP155]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP157:%.*]] = load i8*, i8** [[TMP156]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP158:%.*]] = bitcast %struct.anon.14* [[AGG_CAPTURED30]] to i8*
|
|
|
|
// CHECK2-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP157]], i8* align 8 [[TMP158]], i64 8, i1 false)
|
|
|
|
// CHECK2-NEXT: [[TMP159:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP153]])
|
|
|
|
// CHECK2-NEXT: [[TMP160:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 0, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.19*)* @.omp_task_entry..21 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP161:%.*]] = bitcast i8* [[TMP160]] to %struct.kmp_task_t_with_privates.19*
|
|
|
|
// CHECK2-NEXT: [[TMP162:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP161]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP163:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP161]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP164:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP163]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP165:%.*]] = load i32, i32* [[C]], align 128
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP165]], i32* [[TMP164]], align 128
|
|
|
|
// CHECK2-NEXT: [[TMP166:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP162]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[TMP166]], align 16
|
|
|
|
// CHECK2-NEXT: [[TMP167:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP160]])
|
|
|
|
// CHECK2-NEXT: [[TMP168:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP168]], i32* [[RETVAL]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP169:%.*]] = load i8*, i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK2-NEXT: call void @llvm.stackrestore(i8* [[TMP169]])
|
|
|
|
// CHECK2-NEXT: [[ARRAY_BEGIN32:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP170:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAY_BEGIN32]], i64 2
|
|
|
|
// CHECK2-NEXT: br label [[ARRAYDESTROY_BODY:%.*]]
|
|
|
|
// CHECK2: arraydestroy.body:
|
|
|
|
// CHECK2-NEXT: [[ARRAYDESTROY_ELEMENTPAST:%.*]] = phi %struct.S* [ [[TMP170]], [[ARRAYCTOR_CONT]] ], [ [[ARRAYDESTROY_ELEMENT:%.*]], [[ARRAYDESTROY_BODY]] ]
|
|
|
|
// CHECK2-NEXT: [[ARRAYDESTROY_ELEMENT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYDESTROY_ELEMENTPAST]], i64 -1
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[ARRAYDESTROY_ELEMENT]]) #[[ATTR4:[0-9]+]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[ARRAYDESTROY_DONE:%.*]] = icmp eq %struct.S* [[ARRAYDESTROY_ELEMENT]], [[ARRAY_BEGIN32]]
|
|
|
|
// CHECK2-NEXT: br i1 [[ARRAYDESTROY_DONE]], label [[ARRAYDESTROY_DONE33:%.*]], label [[ARRAYDESTROY_BODY]]
|
|
|
|
// CHECK2: arraydestroy.done33:
|
|
|
|
// CHECK2-NEXT: [[TMP171:%.*]] = load i32, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK2-NEXT: ret i32 [[TMP171]]
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN1SC1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SC2Ev(%struct.S* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates* noalias noundef [[TMP1:%.*]]) #[[ATTR3:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates* [[TMP1]], %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates*, %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META3:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META6:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META8:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META10:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !12
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK2-NEXT: store %struct.anon* [[TMP8]], %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon*, %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[CONV_I:%.*]] = trunc i32 [[TMP11]] to i8
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = getelementptr inbounds [[STRUCT_ANON:%.*]], %struct.anon* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load i8*, i8** [[TMP12]], align 8
|
|
|
|
// CHECK2-NEXT: store i8 [[CONV_I]], i8* [[TMP13]], align 1
|
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[TMP10]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP14]], align 8
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP15]], i64 0, i64 0
|
|
|
|
// CHECK2-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..2
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.1* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.0*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.1*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.1* [[TMP1]], %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.1*, %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.0*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.1* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META13:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META16:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META18:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META20:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !22
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK2-NEXT: store %struct.anon.0* [[TMP8]], %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.0*, %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_0:%.*]], %struct.anon.0* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP11]], align 8
|
|
|
|
// CHECK2-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP12]], i64 0, i64 1
|
|
|
|
// CHECK2-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..4
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.3* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.2*, align 8
|
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.3*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.3* [[TMP1]], %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.3*, %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.2*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.3* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META23:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META26:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META28:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META30:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK2-NEXT: store %struct.anon.2* [[TMP8]], %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.2*, %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK2-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK2-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK2-NEXT: ]
|
|
|
|
// CHECK2: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK2: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[TMP13]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP14]], i8* [[TMP15]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT:%.*]]
|
|
|
|
// CHECK2: .untied.jmp.1.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP17:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void @__kmpc_critical(%struct.ident_t* @[[GLOB1]], i32 [[TMP17]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* @a, align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void @__kmpc_end_critical(%struct.ident_t* @[[GLOB1]], i32 [[TMP17]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK2: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT]]
|
|
|
|
// CHECK2: .omp_outlined..3.exit:
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..6
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.5* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.4*, align 8
|
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.5*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.5* [[TMP1]], %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.5*, %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.4*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.5* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META33:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META36:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META38:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META40:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !42
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK2-NEXT: store %struct.anon.4* [[TMP8]], %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.4*, %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK2-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK2-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK2-NEXT: ]
|
|
|
|
// CHECK2: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK2: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[TMP13]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !42
|
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP14]], i8* [[TMP15]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT:%.*]]
|
|
|
|
// CHECK2: .untied.jmp.1.i:
|
|
|
|
// CHECK2-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK2: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT]]
|
|
|
|
// CHECK2: .omp_outlined..5.exit:
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.7* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.6*, align 8
|
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.7*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.7* [[TMP1]], %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.7*, %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.6*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.7* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META43:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META46:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META48:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META50:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !52
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK2-NEXT: store %struct.anon.6* [[TMP8]], %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.6*, %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK2-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK2-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK2-NEXT: ]
|
|
|
|
// CHECK2: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK2: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[TMP13]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !52
|
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP14]], i8* [[TMP15]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT:%.*]]
|
|
|
|
// CHECK2: .untied.jmp.1.i:
|
|
|
|
// CHECK2-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK2: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT]]
|
|
|
|
// CHECK2: .omp_outlined..7.exit:
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..10
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.9* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.9*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.9* [[TMP1]], %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.9*, %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.8*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.9* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META53:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META56:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META58:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META60:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !62
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK2-NEXT: store %struct.anon.8* [[TMP8]], %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.8*, %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..12
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.11* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.10*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.11*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.11* [[TMP1]], %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.11*, %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.10*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.11* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META63:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META66:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META68:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META70:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !72
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK2-NEXT: store %struct.anon.10* [[TMP8]], %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.10*, %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..14
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.13* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.12*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.13*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.13* [[TMP1]], %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.13*, %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.12*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.13* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META73:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META76:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META78:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META80:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !82
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK2-NEXT: store %struct.anon.12* [[TMP8]], %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.12*, %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 3, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..16
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.15* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.14*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.15*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.15* [[TMP1]], %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.15*, %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.14*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.15* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META83:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META86:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META88:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META90:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !92
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK2-NEXT: store %struct.anon.14* [[TMP8]], %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.14*, %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_14:%.*]], %struct.anon.14* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load i32*, i32** [[TMP11]], align 8
|
|
|
|
// CHECK2-NEXT: store i32 5, i32* [[TMP12]], align 128
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_privates_map.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct..kmp_privates.t* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]]) #[[ATTR7:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK2-NEXT: store %struct..kmp_privates.t* [[TMP0]], %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK2-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load %struct..kmp_privates.t*, %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP2]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP3]], i32** [[TMP4]], align 8
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..19
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.18* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.17*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.18*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.18* [[TMP1]], %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.18*, %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.17*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t* [[TMP9]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.18* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META93:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META96:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META98:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META100:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !102
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t*, i32**)* @.omp_task_privates_map. to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: store %struct.anon.17* [[TMP8]], %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load %struct.anon.17*, %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 4, i32* [[TMP16]], align 128
|
|
|
|
// CHECK2-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN1SD1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SD2Ev(%struct.S* noundef [[THIS1]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_privates_map..20
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct..kmp_privates.t.20* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]], %struct.S** noalias noundef [[TMP2:%.*]], %struct.S*** noalias noundef [[TMP3:%.*]]) #[[ATTR7]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t.20*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR2:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR3:%.*]] = alloca %struct.S***, align 8
|
|
|
|
// CHECK2-NEXT: store %struct..kmp_privates.t.20* [[TMP0]], %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK2-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S** [[TMP2]], %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S*** [[TMP3]], %struct.S**** [[DOTADDR3]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = load %struct..kmp_privates.t.20*, %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = load %struct.S***, %struct.S**** [[DOTADDR3]], align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S** [[TMP7]], %struct.S*** [[TMP8]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 3
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.S**, %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S* [[TMP9]], %struct.S** [[TMP10]], align 8
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..21
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.19* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.16*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTLOCAL_PTR_ADDR_I:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTLOCAL_PTR_ADDR1_I:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[REF_TMP_I:%.*]] = alloca [[STRUCT_S:%.*]], align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.19*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.19* [[TMP1]], %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.19*, %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.16*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t.20* [[TMP9]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.19* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META103:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META106:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META108:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META110:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t.20*, i32**, %struct.S**, %struct.S***)* @.omp_task_privates_map..20 to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: store %struct.anon.16* [[TMP8]], %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load %struct.anon.16*, %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**, %struct.S**, %struct.S***)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], %struct.S** [[DOTLOCAL_PTR_ADDR_I]], %struct.S*** [[DOTLOCAL_PTR_ADDR1_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP17:%.*]] = load %struct.S*, %struct.S** [[DOTLOCAL_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP18:%.*]] = load %struct.S**, %struct.S*** [[DOTLOCAL_PTR_ADDR1_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP19:%.*]] = load %struct.S*, %struct.S** [[TMP18]], align 8
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP20:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP21:%.*]] = load i32, i32* [[TMP20]], align 4
|
|
|
|
// CHECK2-NEXT: switch i32 [[TMP21]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK2-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 1, label [[DOTUNTIED_JMP_2_I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 2, label [[DOTUNTIED_JMP_3_I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 3, label [[DOTUNTIED_JMP_5_I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 4, label [[DOTUNTIED_JMP_7_I:%.*]]
|
|
|
|
// CHECK2-NEXT: i32 5, label [[DOTUNTIED_JMP_10_I:%.*]]
|
|
|
|
// CHECK2-NEXT: ]
|
|
|
|
// CHECK2: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK2: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP22:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 1, i32* [[TMP22]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP23:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP24:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP25:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP23]], i8* [[TMP24]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT:%.*]]
|
|
|
|
// CHECK2: .untied.jmp.2.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[TMP17]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP26:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[DOTS2__VOID_ADDR_I:%.*]] = call i8* @__kmpc_alloc(i32 [[TMP26]], i64 4, i8* inttoptr (i64 7 to i8*)) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[DOTS2__ADDR_I:%.*]] = bitcast i8* [[DOTS2__VOID_ADDR_I]] to %struct.S*
|
|
|
|
// CHECK2-NEXT: store %struct.S* [[DOTS2__ADDR_I]], %struct.S** [[TMP18]], align 8
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP27:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 2, i32* [[TMP27]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP28:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP29:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP30:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP28]], i8* [[TMP29]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK2: .untied.jmp.3.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[TMP19]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[A_I]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP31:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP32:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP31]], i32 1, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.18*)* @.omp_task_entry..19 to i32 (i32, i8*)*)) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP33:%.*]] = bitcast i8* [[TMP32]] to %struct.kmp_task_t_with_privates.18*
|
|
|
|
// CHECK2-NEXT: [[TMP34:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP33]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP35:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP33]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP36:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP35]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP37:%.*]] = load i32, i32* [[TMP16]], align 128
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP37]], i32* [[TMP36]], align 128
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP38:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP39:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP38]], i8* [[TMP32]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP40:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 3, i32* [[TMP40]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP41:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP42:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP43:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP41]], i8* [[TMP42]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK2: .untied.jmp.5.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP44:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP45:%.*]] = call i32 @__kmpc_omp_taskyield(%struct.ident_t* @[[GLOB1]], i32 [[TMP44]], i32 0) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP46:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 4, i32* [[TMP46]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP47:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP48:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP49:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP47]], i8* [[TMP48]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK2: .untied.jmp.7.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP50:%.*]] = bitcast %struct.S* [[TMP17]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP51:%.*]] = bitcast %struct.S* [[REF_TMP_I]] to i8*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[TMP50]], i8* align 4 [[TMP51]], i64 4, i1 false) #[[ATTR4]]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[A9_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32 10, i32* [[A9_I]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP52:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP53:%.*]] = call i32 @__kmpc_omp_taskwait(%struct.ident_t* @[[GLOB1]], i32 [[TMP52]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP54:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 5, i32* [[TMP54]], align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP55:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK2-NEXT: [[TMP56:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: [[TMP57:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP55]], i8* [[TMP56]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK2: .untied.jmp.10.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[TMP19]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP58:%.*]] = load i32, i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP59:%.*]] = bitcast %struct.S* [[TMP19]] to i8*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void @__kmpc_free(i32 [[TMP58]], i8* [[TMP59]], i8* inttoptr (i64 7 to i8*)) #[[ATTR4]]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[TMP17]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK2: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK2: .omp_outlined..17.exit:
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN1SC2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[A:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[THIS1]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[A]], align 4
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN1SD2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@__cxx_global_var_init
|
|
|
|
// CHECK2-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
|
|
|
// CHECK2-NEXT: entry:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN2S1C1Ev(%struct.S1* noundef @s1)
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN2S1C1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN2S1C2Ev(%struct.S1* noundef [[THIS1]])
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN2S1C2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK2-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-NEXT: call void @_ZN2S18taskinitEv(%struct.S1* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_ZN2S18taskinitEv
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct.S1* noundef [[THIS:%.*]]) #[[ATTR8:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON_21:%.*]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]])
|
|
|
|
// CHECK2-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP1:%.*]] = getelementptr inbounds [[STRUCT_ANON_21]], %struct.anon.21* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store %struct.S1* [[THIS1]], %struct.S1** [[TMP1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.22*)* @.omp_task_entry..23 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = bitcast i8* [[TMP2]] to %struct.kmp_task_t_with_privates.22*
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = load i8*, i8** [[TMP5]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = bitcast %struct.anon.21* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK2-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP6]], i8* align 8 [[TMP7]], i64 8, i1 false)
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]], i8* [[TMP2]])
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..23
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.22* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.21*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.22*, align 8
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.22* [[TMP1]], %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.22*, %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.21*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.22* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META113:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META116:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META118:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META120:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !122
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK2-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK2-NEXT: store %struct.anon.21* [[TMP8]], %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load %struct.anon.21*, %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_21:%.*]], %struct.anon.21* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load %struct.S1*, %struct.S1** [[TMP11]], align 8
|
|
|
|
// CHECK2-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S1:%.*]], %struct.S1* [[TMP12]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store i32 0, i32* [[A_I]], align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_Z4xxxxv
|
|
|
|
// CHECK2-SAME: () #[[ATTR8]] {
|
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: call void @_Z6foobarIvEvv()
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_Z6foobarIvEvv
|
|
|
|
// CHECK2-SAME: () #[[ATTR8]] {
|
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[A:%.*]] = alloca float, align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: call void (%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...) @__kmpc_fork_call(%struct.ident_t* @[[GLOB1]], i32 1, void (i32*, i32*, ...)* bitcast (void (i32*, i32*, float*)* @.omp_outlined..24 to void (i32*, i32*, ...)*), float* [[A]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_outlined..24
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32* noalias noundef [[DOTGLOBAL_TID_:%.*]], i32* noalias noundef [[DOTBOUND_TID_:%.*]], float* noundef nonnull align 4 dereferenceable(4) [[A:%.*]]) #[[ATTR9:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTBOUND_TID__ADDR:%.*]] = alloca i32*, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[A_ADDR:%.*]] = alloca float*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[B:%.*]] = alloca double, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON_23:%.*]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32* [[DOTGLOBAL_TID_]], i32** [[DOTGLOBAL_TID__ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: store i32* [[DOTBOUND_TID_]], i32** [[DOTBOUND_TID__ADDR]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: store float* [[A]], float** [[A_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP0:%.*]] = load float*, float** [[A_ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP1:%.*]] = load i32*, i32** [[DOTGLOBAL_TID__ADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[TMP1]], align 4
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = call i32 @__kmpc_single(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]])
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = icmp ne i32 [[TMP3]], 0
|
|
|
|
// CHECK2-NEXT: br i1 [[TMP4]], label [[OMP_IF_THEN:%.*]], label [[OMP_IF_END:%.*]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2: omp_if.then:
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_ANON_23]], %struct.anon.23* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: store float* [[TMP0]], float** [[TMP5]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]], i32 1, i64 48, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.24*)* @.omp_task_entry..27 to i32 (i32, i8*)*))
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = bitcast i8* [[TMP6]] to %struct.kmp_task_t_with_privates.24*
|
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24:%.*]], %struct.kmp_task_t_with_privates.24* [[TMP7]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP8]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = load i8*, i8** [[TMP9]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = bitcast %struct.anon.23* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK2-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP10]], i8* align 8 [[TMP11]], i64 8, i1 false)
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24]], %struct.kmp_task_t_with_privates.24* [[TMP7]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_25:%.*]], %struct..kmp_privates.t.25* [[TMP12]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load double, double* [[B]], align 8
|
|
|
|
// CHECK2-NEXT: store double [[TMP14]], double* [[TMP13]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]], i8* [[TMP6]])
|
|
|
|
// CHECK2-NEXT: call void @__kmpc_end_single(%struct.ident_t* @[[GLOB1]], i32 [[TMP2]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: br label [[OMP_IF_END]]
|
|
|
|
// CHECK2: omp_if.end:
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: call void @__kmpc_barrier(%struct.ident_t* @[[GLOB2:[0-9]+]], i32 [[TMP2]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_privates_map..26
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (%struct..kmp_privates.t.25* noalias noundef [[TMP0:%.*]], double** noalias noundef [[TMP1:%.*]]) #[[ATTR7]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t.25*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca double**, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: store %struct..kmp_privates.t.25* [[TMP0]], %struct..kmp_privates.t.25** [[DOTADDR]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store double** [[TMP1]], double*** [[DOTADDR1]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load %struct..kmp_privates.t.25*, %struct..kmp_privates.t.25** [[DOTADDR]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_25:%.*]], %struct..kmp_privates.t.25* [[TMP2]], i32 0, i32 0
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = load double**, double*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: store double* [[TMP3]], double** [[TMP4]], align 8
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@.omp_task_entry..27
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK2-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.24* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK2-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.23*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca double*, align 8
|
|
|
|
// CHECK2-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.24*, align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: store %struct.kmp_task_t_with_privates.24* [[TMP1]], %struct.kmp_task_t_with_privates.24** [[DOTADDR1]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.24*, %struct.kmp_task_t_with_privates.24** [[DOTADDR1]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24:%.*]], %struct.kmp_task_t_with_privates.24* [[TMP3]], i32 0, i32 0
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK2-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK2-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.23*
|
|
|
|
// CHECK2-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_24]], %struct.kmp_task_t_with_privates.24* [[TMP3]], i32 0, i32 1
|
|
|
|
// CHECK2-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t.25* [[TMP9]] to i8*
|
|
|
|
// CHECK2-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.24* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META123:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META126:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META128:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META130:![0-9]+]])
|
|
|
|
// CHECK2-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !132
|
|
|
|
// CHECK2-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !132
|
|
|
|
// CHECK2-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !132
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t.25*, double**)* @.omp_task_privates_map..26 to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !132
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !132
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: store %struct.anon.23* [[TMP8]], %struct.anon.23** [[__CONTEXT_ADDR_I]], align 8, !noalias !132
|
|
|
|
// CHECK2-NEXT: [[TMP12:%.*]] = load %struct.anon.23*, %struct.anon.23** [[__CONTEXT_ADDR_I]], align 8, !noalias !132
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !132
|
|
|
|
// CHECK2-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !132
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, double**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-NEXT: call void [[TMP15]](i8* [[TMP14]], double** [[DOTFIRSTPRIV_PTR_ADDR_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK2-NEXT: [[TMP16:%.*]] = load double*, double** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !132
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP17:%.*]] = load double, double* [[TMP16]], align 8
|
2021-09-22 04:20:39 +08:00
|
|
|
// CHECK2-NEXT: [[TMP18:%.*]] = getelementptr inbounds [[STRUCT_ANON_23:%.*]], %struct.anon.23* [[TMP12]], i32 0, i32 0
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: [[TMP19:%.*]] = load float*, float** [[TMP18]], align 8
|
|
|
|
// CHECK2-NEXT: [[TMP20:%.*]] = load float, float* [[TMP19]], align 4
|
|
|
|
// CHECK2-NEXT: [[CONV_I:%.*]] = fpext float [[TMP20]] to double
|
|
|
|
// CHECK2-NEXT: [[ADD_I:%.*]] = fadd double [[CONV_I]], [[TMP17]]
|
|
|
|
// CHECK2-NEXT: [[CONV1_I:%.*]] = fptrunc double [[ADD_I]] to float
|
|
|
|
// CHECK2-NEXT: store float [[CONV1_I]], float* [[TMP19]], align 4
|
|
|
|
// CHECK2-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK2-LABEL: define {{[^@]+}}@_GLOBAL__sub_I_task_codegen.cpp
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK2-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK2-NEXT: entry:
|
|
|
|
// CHECK2-NEXT: call void @__cxx_global_var_init()
|
|
|
|
// CHECK2-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@main
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-SAME: () #[[ATTR0:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[B:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK3-NEXT: [[S:%.*]] = alloca [2 x %struct.S], align 4
|
|
|
|
// CHECK3-NEXT: [[SAVED_STACK:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__VLA_EXPR0:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON:%.*]], align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED2:%.*]] = alloca [[STRUCT_ANON_0:%.*]], align 8
|
|
|
|
// CHECK3-NEXT: [[DOTDEP_ARR_ADDR:%.*]] = alloca [4 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK3-NEXT: [[DEP_COUNTER_ADDR:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED6:%.*]] = alloca [[STRUCT_ANON_2:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED9:%.*]] = alloca [[STRUCT_ANON_4:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: [[DOTDEP_ARR_ADDR11:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK3-NEXT: [[DEP_COUNTER_ADDR17:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED19:%.*]] = alloca [[STRUCT_ANON_6:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: [[DOTDEP_ARR_ADDR21:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK3-NEXT: [[DEP_COUNTER_ADDR27:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED29:%.*]] = alloca [[STRUCT_ANON_8:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: [[DOTDEP_ARR_ADDR31:%.*]] = alloca [3 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK3-NEXT: [[DEP_COUNTER_ADDR37:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED39:%.*]] = alloca [[STRUCT_ANON_10:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: [[FLAG:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED42:%.*]] = alloca [[STRUCT_ANON_12:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: [[C:%.*]] = alloca i32, align 128
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED45:%.*]] = alloca [[STRUCT_ANON_14:%.*]], align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED48:%.*]] = alloca [[STRUCT_ANON_16:%.*]], align 1
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK3-NEXT: [[ARRAY_BEGIN:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[ARRAYCTOR_END:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAY_BEGIN]], i64 2
|
|
|
|
// CHECK3-NEXT: br label [[ARRAYCTOR_LOOP:%.*]]
|
|
|
|
// CHECK3: arrayctor.loop:
|
|
|
|
// CHECK3-NEXT: [[ARRAYCTOR_CUR:%.*]] = phi %struct.S* [ [[ARRAY_BEGIN]], [[ENTRY:%.*]] ], [ [[ARRAYCTOR_NEXT:%.*]], [[ARRAYCTOR_LOOP]] ]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[ARRAYCTOR_CUR]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[ARRAYCTOR_NEXT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYCTOR_CUR]], i64 1
|
|
|
|
// CHECK3-NEXT: [[ARRAYCTOR_DONE:%.*]] = icmp eq %struct.S* [[ARRAYCTOR_NEXT]], [[ARRAYCTOR_END]]
|
|
|
|
// CHECK3-NEXT: br i1 [[ARRAYCTOR_DONE]], label [[ARRAYCTOR_CONT:%.*]], label [[ARRAYCTOR_LOOP]]
|
|
|
|
// CHECK3: arrayctor.cont:
|
|
|
|
// CHECK3-NEXT: [[TMP0:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = call i8* @llvm.stacksave()
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP2]], i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = mul nuw i64 10, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[VLA:%.*]] = alloca i32, i64 [[TMP3]], align 16
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP1]], i64* [[__VLA_EXPR0]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i8* [[B]], i8** [[TMP4]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP5]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK3-NEXT: [[CONV:%.*]] = sext i8 [[TMP6]] to i32
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB3:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1:[0-9]+]], i32 [[OMP_GLOBAL_THREAD_NUM]], i32 33, i64 40, i64 16, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.kmp_task_t_with_privates*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP8]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP9]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = load i8*, i8** [[TMP10]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = bitcast %struct.anon* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK3-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP11]], i8* align 8 [[TMP12]], i64 16, i1 false)
|
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP9]], i32 0, i32 4
|
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = bitcast %union.kmp_cmplrdata_t* [[TMP13]] to i32*
|
|
|
|
// CHECK3-NEXT: store i32 [[CONV]], i32* [[TMP14]], align 8
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM1:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB3]])
|
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM1]], i8* [[TMP7]])
|
|
|
|
// CHECK3-NEXT: [[TMP16:%.*]] = getelementptr inbounds [[STRUCT_ANON_0]], %struct.anon.0* [[AGG_CAPTURED2]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP16]], align 8
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM3:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB5:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP17:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM3]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP18:%.*]] = bitcast i8* [[TMP17]] to %struct.kmp_task_t_with_privates.1*
|
|
|
|
// CHECK3-NEXT: [[TMP19:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP18]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP20:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP21:%.*]] = load i8*, i8** [[TMP20]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP22:%.*]] = bitcast %struct.anon.0* [[AGG_CAPTURED2]] to i8*
|
|
|
|
// CHECK3-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP21]], i8* align 8 [[TMP22]], i64 8, i1 false)
|
|
|
|
// CHECK3-NEXT: [[TMP23:%.*]] = getelementptr inbounds [4 x %struct.kmp_depend_info], [4 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP24:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO:%.*]], %struct.kmp_depend_info* [[TMP23]], i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP25:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP25]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP26:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 4, i64* [[TMP26]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP27:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 1, i8* [[TMP27]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP28:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP23]], i64 1
|
|
|
|
// CHECK3-NEXT: [[TMP29:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP28]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP30:%.*]] = ptrtoint i8* [[B]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP30]], i64* [[TMP29]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP31:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP28]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 1, i64* [[TMP31]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP32:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP28]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 1, i8* [[TMP32]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP33:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP23]], i64 2
|
|
|
|
// CHECK3-NEXT: [[TMP34:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP33]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP35:%.*]] = ptrtoint [2 x %struct.S]* [[S]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP35]], i64* [[TMP34]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP36:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP33]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 8, i64* [[TMP36]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP37:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP33]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 1, i8* [[TMP37]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP38:%.*]] = mul nsw i64 0, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP38]]
|
|
|
|
// CHECK3-NEXT: [[TMP39:%.*]] = mul nsw i64 9, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP39]]
|
|
|
|
// CHECK3-NEXT: [[TMP40:%.*]] = getelementptr i32, i32* [[ARRAYIDX4]], i32 1
|
|
|
|
// CHECK3-NEXT: [[TMP41:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP42:%.*]] = ptrtoint i32* [[TMP40]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP43:%.*]] = sub nuw i64 [[TMP42]], [[TMP41]]
|
|
|
|
// CHECK3-NEXT: [[TMP44:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP23]], i64 3
|
|
|
|
// CHECK3-NEXT: [[TMP45:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP44]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP46:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP46]], i64* [[TMP45]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP47:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP44]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP43]], i64* [[TMP47]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP48:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP44]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 1, i8* [[TMP48]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i64 4, i64* [[DEP_COUNTER_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP49:%.*]] = bitcast %struct.kmp_depend_info* [[TMP23]] to i8*
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM5:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB5]])
|
|
|
|
// CHECK3-NEXT: [[TMP50:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM5]], i8* [[TMP17]], i32 4, i8* [[TMP49]], i32 0, i8* null)
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM7:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB7:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP51:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM7]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.3*)* @.omp_task_entry..4 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP52:%.*]] = bitcast i8* [[TMP51]] to %struct.kmp_task_t_with_privates.3*
|
|
|
|
// CHECK3-NEXT: [[TMP53:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP52]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM8:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB7]])
|
|
|
|
// CHECK3-NEXT: [[TMP54:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP53]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[TMP54]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP55:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM8]], i8* [[TMP51]])
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM10:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB9:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP56:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM10]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.5*)* @.omp_task_entry..6 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP57:%.*]] = bitcast i8* [[TMP56]] to %struct.kmp_task_t_with_privates.5*
|
|
|
|
// CHECK3-NEXT: [[TMP58:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP57]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP59:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR11]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX12:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP60:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP59]], i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP61:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP62:%.*]] = ptrtoint %struct.S* [[ARRAYIDX12]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP62]], i64* [[TMP61]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP63:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 4, i64* [[TMP63]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP64:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 3, i8* [[TMP64]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP65:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK3-NEXT: [[TMP66:%.*]] = sext i8 [[TMP65]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP67:%.*]] = mul nsw i64 4, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX13:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP67]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX14:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX13]], i64 [[TMP66]]
|
|
|
|
// CHECK3-NEXT: [[TMP68:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK3-NEXT: [[TMP69:%.*]] = sext i8 [[TMP68]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP70:%.*]] = mul nsw i64 9, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP70]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX16:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX15]], i64 [[TMP69]]
|
|
|
|
// CHECK3-NEXT: [[TMP71:%.*]] = getelementptr i32, i32* [[ARRAYIDX16]], i32 1
|
|
|
|
// CHECK3-NEXT: [[TMP72:%.*]] = ptrtoint i32* [[ARRAYIDX14]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP73:%.*]] = ptrtoint i32* [[TMP71]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP74:%.*]] = sub nuw i64 [[TMP73]], [[TMP72]]
|
|
|
|
// CHECK3-NEXT: [[TMP75:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP59]], i64 1
|
|
|
|
// CHECK3-NEXT: [[TMP76:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP75]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP77:%.*]] = ptrtoint i32* [[ARRAYIDX14]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP77]], i64* [[TMP76]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP78:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP75]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP74]], i64* [[TMP78]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP79:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP75]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 3, i8* [[TMP79]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR17]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP80:%.*]] = bitcast %struct.kmp_depend_info* [[TMP59]] to i8*
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM18:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB9]])
|
|
|
|
// CHECK3-NEXT: [[TMP81:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP58]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[TMP81]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP82:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM18]], i8* [[TMP56]], i32 2, i8* [[TMP80]], i32 0, i8* null)
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM20:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB11:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP83:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM20]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.7*)* @.omp_task_entry..8 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP84:%.*]] = bitcast i8* [[TMP83]] to %struct.kmp_task_t_with_privates.7*
|
|
|
|
// CHECK3-NEXT: [[TMP85:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP84]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP86:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR21]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX22:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP87:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP86]], i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP88:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP89:%.*]] = ptrtoint %struct.S* [[ARRAYIDX22]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP89]], i64* [[TMP88]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP90:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 4, i64* [[TMP90]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP91:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 4, i8* [[TMP91]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP92:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK3-NEXT: [[TMP93:%.*]] = sext i8 [[TMP92]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP94:%.*]] = mul nsw i64 4, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX23:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP94]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX24:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX23]], i64 [[TMP93]]
|
|
|
|
// CHECK3-NEXT: [[TMP95:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK3-NEXT: [[TMP96:%.*]] = sext i8 [[TMP95]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP97:%.*]] = mul nsw i64 9, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX25:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP97]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX26:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX25]], i64 [[TMP96]]
|
|
|
|
// CHECK3-NEXT: [[TMP98:%.*]] = getelementptr i32, i32* [[ARRAYIDX26]], i32 1
|
|
|
|
// CHECK3-NEXT: [[TMP99:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP100:%.*]] = ptrtoint i32* [[TMP98]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP101:%.*]] = sub nuw i64 [[TMP100]], [[TMP99]]
|
|
|
|
// CHECK3-NEXT: [[TMP102:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP86]], i64 1
|
|
|
|
// CHECK3-NEXT: [[TMP103:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP102]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP104:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP104]], i64* [[TMP103]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP105:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP102]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP101]], i64* [[TMP105]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP106:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP102]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 4, i8* [[TMP106]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR27]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP107:%.*]] = bitcast %struct.kmp_depend_info* [[TMP86]] to i8*
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM28:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB11]])
|
|
|
|
// CHECK3-NEXT: [[TMP108:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP85]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[TMP108]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP109:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM28]], i8* [[TMP83]], i32 2, i8* [[TMP107]], i32 0, i8* null)
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM30:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB13:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP110:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM30]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.9*)* @.omp_task_entry..10 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP111:%.*]] = bitcast i8* [[TMP110]] to %struct.kmp_task_t_with_privates.9*
|
|
|
|
// CHECK3-NEXT: [[TMP112:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP111]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP113:%.*]] = getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR31]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP114:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP113]], i64 0
|
|
|
|
// CHECK3-NEXT: [[TMP115:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP115]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP116:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 4, i64* [[TMP116]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP117:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 3, i8* [[TMP117]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[ARRAYIDX32:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 1
|
|
|
|
// CHECK3-NEXT: [[TMP118:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP113]], i64 1
|
|
|
|
// CHECK3-NEXT: [[TMP119:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP118]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP120:%.*]] = ptrtoint %struct.S* [[ARRAYIDX32]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP120]], i64* [[TMP119]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP121:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP118]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 4, i64* [[TMP121]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP122:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP118]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 3, i8* [[TMP122]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP123:%.*]] = mul nsw i64 0, [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX33:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP123]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX34:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX33]], i64 3
|
|
|
|
// CHECK3-NEXT: [[TMP124:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[TMP125:%.*]] = sext i32 [[TMP124]] to i64
|
|
|
|
// CHECK3-NEXT: [[LEN_SUB_1:%.*]] = sub nsw i64 [[TMP125]], 1
|
|
|
|
// CHECK3-NEXT: [[TMP126:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[TMP127:%.*]] = sext i32 [[TMP126]] to i64
|
|
|
|
// CHECK3-NEXT: [[LB_ADD_LEN:%.*]] = add nsw i64 -1, [[TMP127]]
|
|
|
|
// CHECK3-NEXT: [[TMP128:%.*]] = mul nsw i64 [[LB_ADD_LEN]], [[TMP1]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX35:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP128]]
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX36:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX35]], i64 [[LEN_SUB_1]]
|
|
|
|
// CHECK3-NEXT: [[TMP129:%.*]] = getelementptr i32, i32* [[ARRAYIDX36]], i32 1
|
|
|
|
// CHECK3-NEXT: [[TMP130:%.*]] = ptrtoint i32* [[ARRAYIDX34]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP131:%.*]] = ptrtoint i32* [[TMP129]] to i64
|
|
|
|
// CHECK3-NEXT: [[TMP132:%.*]] = sub nuw i64 [[TMP131]], [[TMP130]]
|
|
|
|
// CHECK3-NEXT: [[TMP133:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP113]], i64 2
|
|
|
|
// CHECK3-NEXT: [[TMP134:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP133]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP135:%.*]] = ptrtoint i32* [[ARRAYIDX34]] to i64
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP135]], i64* [[TMP134]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP136:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP133]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: store i64 [[TMP132]], i64* [[TMP136]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP137:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP133]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK3-NEXT: store i8 3, i8* [[TMP137]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i64 3, i64* [[DEP_COUNTER_ADDR37]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP138:%.*]] = bitcast %struct.kmp_depend_info* [[TMP113]] to i8*
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM38:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB13]])
|
|
|
|
// CHECK3-NEXT: [[TMP139:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM38]], i8* [[TMP110]], i32 3, i8* [[TMP138]], i32 0, i8* null)
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM40:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB15:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP140:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM40]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.11*)* @.omp_task_entry..12 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP141:%.*]] = bitcast i8* [[TMP140]] to %struct.kmp_task_t_with_privates.11*
|
|
|
|
// CHECK3-NEXT: [[TMP142:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP141]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM41:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB15]])
|
|
|
|
// CHECK3-NEXT: [[TMP143:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM41]], i8* [[TMP140]])
|
|
|
|
// CHECK3-NEXT: store i8 0, i8* [[FLAG]], align 1
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM43:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB17:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP144:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM43]], i32 1, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.13*)* @.omp_task_entry..14 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP145:%.*]] = bitcast i8* [[TMP144]] to %struct.kmp_task_t_with_privates.13*
|
|
|
|
// CHECK3-NEXT: [[TMP146:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP145]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM44:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB17]])
|
|
|
|
// CHECK3-NEXT: [[TMP147:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM44]], i8* [[TMP144]])
|
|
|
|
// CHECK3-NEXT: [[TMP148:%.*]] = getelementptr inbounds [[STRUCT_ANON_14]], %struct.anon.14* [[AGG_CAPTURED45]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i32* [[C]], i32** [[TMP148]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP149:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK3-NEXT: [[TOBOOL:%.*]] = icmp ne i8 [[TMP149]], 0
|
|
|
|
// CHECK3-NEXT: [[TMP150:%.*]] = select i1 [[TOBOOL]], i32 2, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP151:%.*]] = or i32 [[TMP150]], 1
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM46:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB19:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP152:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM46]], i32 [[TMP151]], i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.15*)* @.omp_task_entry..16 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP153:%.*]] = bitcast i8* [[TMP152]] to %struct.kmp_task_t_with_privates.15*
|
|
|
|
// CHECK3-NEXT: [[TMP154:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP153]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP155:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP154]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP156:%.*]] = load i8*, i8** [[TMP155]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP157:%.*]] = bitcast %struct.anon.14* [[AGG_CAPTURED45]] to i8*
|
|
|
|
// CHECK3-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP156]], i8* align 8 [[TMP157]], i64 8, i1 false)
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM47:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB19]])
|
|
|
|
// CHECK3-NEXT: [[TMP158:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM47]], i8* [[TMP152]])
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM49:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP159:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM49]], i32 0, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.19*)* @.omp_task_entry..21 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP160:%.*]] = bitcast i8* [[TMP159]] to %struct.kmp_task_t_with_privates.19*
|
|
|
|
// CHECK3-NEXT: [[TMP161:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP160]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP162:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP160]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP163:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP162]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP164:%.*]] = load i32, i32* [[C]], align 128
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP164]], i32* [[TMP163]], align 128
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM50:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]])
|
|
|
|
// CHECK3-NEXT: [[TMP165:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP161]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[TMP165]], align 16
|
|
|
|
// CHECK3-NEXT: [[TMP166:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM50]], i8* [[TMP159]])
|
|
|
|
// CHECK3-NEXT: [[TMP167:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP167]], i32* [[RETVAL]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP168:%.*]] = load i8*, i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK3-NEXT: call void @llvm.stackrestore(i8* [[TMP168]])
|
|
|
|
// CHECK3-NEXT: [[ARRAY_BEGIN51:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP169:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAY_BEGIN51]], i64 2
|
|
|
|
// CHECK3-NEXT: br label [[ARRAYDESTROY_BODY:%.*]]
|
|
|
|
// CHECK3: arraydestroy.body:
|
|
|
|
// CHECK3-NEXT: [[ARRAYDESTROY_ELEMENTPAST:%.*]] = phi %struct.S* [ [[TMP169]], [[ARRAYCTOR_CONT]] ], [ [[ARRAYDESTROY_ELEMENT:%.*]], [[ARRAYDESTROY_BODY]] ]
|
|
|
|
// CHECK3-NEXT: [[ARRAYDESTROY_ELEMENT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYDESTROY_ELEMENTPAST]], i64 -1
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[ARRAYDESTROY_ELEMENT]]) #[[ATTR4:[0-9]+]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[ARRAYDESTROY_DONE:%.*]] = icmp eq %struct.S* [[ARRAYDESTROY_ELEMENT]], [[ARRAY_BEGIN51]]
|
|
|
|
// CHECK3-NEXT: br i1 [[ARRAYDESTROY_DONE]], label [[ARRAYDESTROY_DONE52:%.*]], label [[ARRAYDESTROY_BODY]]
|
|
|
|
// CHECK3: arraydestroy.done52:
|
|
|
|
// CHECK3-NEXT: [[TMP170:%.*]] = load i32, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK3-NEXT: ret i32 [[TMP170]]
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN1SC1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SC2Ev(%struct.S* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates* noalias noundef [[TMP1:%.*]]) #[[ATTR3:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates* [[TMP1]], %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates*, %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META3:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META6:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META8:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META10:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !12
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK3-NEXT: store %struct.anon* [[TMP8]], %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon*, %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[CONV_I:%.*]] = trunc i32 [[TMP11]] to i8
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = getelementptr inbounds [[STRUCT_ANON:%.*]], %struct.anon* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = load i8*, i8** [[TMP12]], align 8
|
|
|
|
// CHECK3-NEXT: store i8 [[CONV_I]], i8* [[TMP13]], align 1
|
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[TMP10]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP14]], align 8
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP15]], i64 0, i64 0
|
|
|
|
// CHECK3-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..2
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.1* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.0*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.1*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.1* [[TMP1]], %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.1*, %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.0*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.1* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META13:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META16:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META18:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META20:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !22
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK3-NEXT: store %struct.anon.0* [[TMP8]], %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.0*, %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_0:%.*]], %struct.anon.0* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP11]], align 8
|
|
|
|
// CHECK3-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP12]], i64 0, i64 1
|
|
|
|
// CHECK3-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..4
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.3* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.2*, align 8
|
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.3*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.3* [[TMP1]], %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.3*, %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.2*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.3* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META23:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META26:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META28:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META30:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK3-NEXT: store %struct.anon.2* [[TMP8]], %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.2*, %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK3-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK3-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK3-NEXT: ]
|
|
|
|
// CHECK3: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK3: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[TMP13]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB7]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP14]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT:%.*]]
|
|
|
|
// CHECK3: .untied.jmp.1.i:
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM2_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: call void @__kmpc_critical(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM2_I]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* @a, align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: call void @__kmpc_end_critical(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM2_I]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK3: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT]]
|
|
|
|
// CHECK3: .omp_outlined..3.exit:
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..6
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.5* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.4*, align 8
|
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.5*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.5* [[TMP1]], %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.5*, %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.4*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.5* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META33:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META36:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META38:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META40:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !42
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK3-NEXT: store %struct.anon.4* [[TMP8]], %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.4*, %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK3-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK3-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK3-NEXT: ]
|
|
|
|
// CHECK3: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK3: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[TMP13]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB9]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP14]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT:%.*]]
|
|
|
|
// CHECK3: .untied.jmp.1.i:
|
|
|
|
// CHECK3-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK3: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT]]
|
|
|
|
// CHECK3: .omp_outlined..5.exit:
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.7* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.6*, align 8
|
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.7*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.7* [[TMP1]], %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.7*, %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.6*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.7* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META43:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META46:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META48:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META50:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !52
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK3-NEXT: store %struct.anon.6* [[TMP8]], %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.6*, %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK3-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK3-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK3-NEXT: ]
|
|
|
|
// CHECK3: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK3: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[TMP13]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB11]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP14]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT:%.*]]
|
|
|
|
// CHECK3: .untied.jmp.1.i:
|
|
|
|
// CHECK3-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK3: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT]]
|
|
|
|
// CHECK3: .omp_outlined..7.exit:
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..10
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.9* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.9*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.9* [[TMP1]], %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.9*, %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.8*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.9* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META53:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META56:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META58:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META60:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !62
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK3-NEXT: store %struct.anon.8* [[TMP8]], %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.8*, %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..12
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.11* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.10*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.11*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.11* [[TMP1]], %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.11*, %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.10*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.11* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META63:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META66:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META68:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META70:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !72
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK3-NEXT: store %struct.anon.10* [[TMP8]], %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.10*, %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..14
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.13* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.12*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.13*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.13* [[TMP1]], %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.13*, %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.12*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.13* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META73:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META76:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META78:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META80:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !82
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK3-NEXT: store %struct.anon.12* [[TMP8]], %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.12*, %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 3, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..16
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.15* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.14*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.15*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.15* [[TMP1]], %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.15*, %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.14*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.15* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META83:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META86:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META88:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META90:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !92
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK3-NEXT: store %struct.anon.14* [[TMP8]], %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.14*, %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_14:%.*]], %struct.anon.14* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load i32*, i32** [[TMP11]], align 8
|
|
|
|
// CHECK3-NEXT: store i32 5, i32* [[TMP12]], align 128
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_privates_map.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct..kmp_privates.t* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]]) #[[ATTR7:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK3-NEXT: store %struct..kmp_privates.t* [[TMP0]], %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK3-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load %struct..kmp_privates.t*, %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP2]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP3]], i32** [[TMP4]], align 8
|
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..19
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.18* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.17*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.18*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.18* [[TMP1]], %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.18*, %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.17*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t* [[TMP9]] to i8*
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.18* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META93:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META96:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META98:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META100:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !102
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t*, i32**)* @.omp_task_privates_map. to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: store %struct.anon.17* [[TMP8]], %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load %struct.anon.17*, %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 4, i32* [[TMP16]], align 128
|
|
|
|
// CHECK3-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN1SD1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SD2Ev(%struct.S* noundef [[THIS1]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_privates_map..20
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct..kmp_privates.t.20* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]], %struct.S** noalias noundef [[TMP2:%.*]], %struct.S** noalias noundef [[TMP3:%.*]]) #[[ATTR7]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t.20*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR2:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR3:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK3-NEXT: store %struct..kmp_privates.t.20* [[TMP0]], %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK3-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S** [[TMP2]], %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S** [[TMP3]], %struct.S*** [[DOTADDR3]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = load %struct..kmp_privates.t.20*, %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 1
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = load %struct.S**, %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S* [[TMP7]], %struct.S** [[TMP8]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.S**, %struct.S*** [[DOTADDR3]], align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S* [[TMP9]], %struct.S** [[TMP10]], align 8
|
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..21
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.19* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.16*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTLOCAL_PTR_ADDR_I:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTLOCAL_PTR_ADDR1_I:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[S1_I:%.*]] = alloca [[STRUCT_S:%.*]], align 4
|
|
|
|
// CHECK3-NEXT: [[S2_I:%.*]] = alloca [[STRUCT_S]], align 4
|
|
|
|
// CHECK3-NEXT: [[REF_TMP_I:%.*]] = alloca [[STRUCT_S]], align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.19*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.19* [[TMP1]], %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.19*, %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.16*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t.20* [[TMP9]] to i8*
|
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.19* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META103:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META106:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META108:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META110:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t.20*, i32**, %struct.S**, %struct.S**)* @.omp_task_privates_map..20 to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: store %struct.anon.16* [[TMP8]], %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load %struct.anon.16*, %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**, %struct.S**, %struct.S**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], %struct.S** [[DOTLOCAL_PTR_ADDR_I]], %struct.S** [[DOTLOCAL_PTR_ADDR1_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: [[TMP17:%.*]] = load %struct.S*, %struct.S** [[DOTLOCAL_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: [[TMP18:%.*]] = load %struct.S*, %struct.S** [[DOTLOCAL_PTR_ADDR1_I]], align 8, !noalias !112
|
|
|
|
// CHECK3-NEXT: [[TMP19:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP20:%.*]] = load i32, i32* [[TMP19]], align 4
|
|
|
|
// CHECK3-NEXT: switch i32 [[TMP20]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK3-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 1, label [[DOTUNTIED_JMP_2_I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 2, label [[DOTUNTIED_JMP_6_I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 3, label [[DOTUNTIED_JMP_10_I:%.*]]
|
|
|
|
// CHECK3-NEXT: i32 4, label [[DOTUNTIED_JMP_15_I:%.*]]
|
|
|
|
// CHECK3-NEXT: ]
|
|
|
|
// CHECK3: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK3: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP21:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 1, i32* [[TMP21]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP22:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP23:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP22]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT:%.*]]
|
|
|
|
// CHECK3: .untied.jmp.2.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[S1_I]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[S2_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[S2_I]], i32 0, i32 0
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[A_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM3_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB23:[0-9]+]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: [[TMP24:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM3_I]], i32 1, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.18*)* @.omp_task_entry..19 to i32 (i32, i8*)*)) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP25:%.*]] = bitcast i8* [[TMP24]] to %struct.kmp_task_t_with_privates.18*
|
|
|
|
// CHECK3-NEXT: [[TMP26:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP25]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP27:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP25]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP28:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP27]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP29:%.*]] = load i32, i32* [[TMP16]], align 128
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP29]], i32* [[TMP28]], align 128
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM4_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB23]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: [[TMP30:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM4_I]], i8* [[TMP24]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP31:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 2, i32* [[TMP31]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM5_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP32:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP33:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM5_I]], i8* [[TMP32]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK3: .untied.jmp.6.i:
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM8_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: [[TMP34:%.*]] = call i32 @__kmpc_omp_taskyield(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM8_I]], i32 0) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP35:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 3, i32* [[TMP35]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM9_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP36:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP37:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM9_I]], i8* [[TMP36]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK3: .untied.jmp.10.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP38:%.*]] = bitcast %struct.S* [[S1_I]] to i8*
|
|
|
|
// CHECK3-NEXT: [[TMP39:%.*]] = bitcast %struct.S* [[REF_TMP_I]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[TMP38]], i8* align 4 [[TMP39]], i64 4, i1 false) #[[ATTR4]], !noalias !112
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[A12_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[S2_I]], i32 0, i32 0
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 10, i32* [[A12_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM13_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: [[TMP40:%.*]] = call i32 @__kmpc_omp_taskwait(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM13_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP41:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: store i32 4, i32* [[TMP41]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM14_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[TMP42:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: [[TMP43:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM14_I]], i8* [[TMP42]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK3: .untied.jmp.15.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[S2_I]]) #[[ATTR4]]
|
|
|
|
// CHECK3-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[S1_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK3: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK3: .omp_outlined..17.exit:
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN1SC2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[A:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[THIS1]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[A]], align 4
|
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN1SD2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@__cxx_global_var_init
|
|
|
|
// CHECK3-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
|
|
|
// CHECK3-NEXT: entry:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN2S1C1Ev(%struct.S1* noundef @s1)
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN2S1C1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN2S1C2Ev(%struct.S1* noundef [[THIS1]])
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN2S1C2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-NEXT: call void @_ZN2S18taskinitEv(%struct.S1* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_ZN2S18taskinitEv
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (%struct.S1* noundef [[THIS:%.*]]) #[[ATTR8:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK3-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON_21:%.*]], align 8
|
|
|
|
// CHECK3-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ANON_21]], %struct.anon.21* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store %struct.S1* [[THIS1]], %struct.S1** [[TMP0]], align 8
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB25:[0-9]+]])
|
|
|
|
// CHECK3-NEXT: [[TMP1:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.22*)* @.omp_task_entry..23 to i32 (i32, i8*)*))
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = bitcast i8* [[TMP1]] to %struct.kmp_task_t_with_privates.22*
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP2]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = load i8*, i8** [[TMP4]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = bitcast %struct.anon.21* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK3-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP5]], i8* align 8 [[TMP6]], i64 8, i1 false)
|
|
|
|
// CHECK3-NEXT: [[OMP_GLOBAL_THREAD_NUM2:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB25]])
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM2]], i8* [[TMP1]])
|
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@.omp_task_entry..23
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK3-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.22* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK3-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.21*, align 8
|
|
|
|
// CHECK3-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK3-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.22*, align 8
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: store %struct.kmp_task_t_with_privates.22* [[TMP1]], %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK3-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.22*, %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK3-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK3-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.21*
|
|
|
|
// CHECK3-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.22* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META113:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META116:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META118:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META120:![0-9]+]])
|
|
|
|
// CHECK3-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !122
|
|
|
|
// CHECK3-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK3-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK3-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK3-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK3-NEXT: store %struct.anon.21* [[TMP8]], %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK3-NEXT: [[TMP10:%.*]] = load %struct.anon.21*, %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_21:%.*]], %struct.anon.21* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: [[TMP12:%.*]] = load %struct.S1*, %struct.S1** [[TMP11]], align 8
|
|
|
|
// CHECK3-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S1:%.*]], %struct.S1* [[TMP12]], i32 0, i32 0
|
|
|
|
// CHECK3-NEXT: store i32 0, i32* [[A_I]], align 4
|
|
|
|
// CHECK3-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK3-LABEL: define {{[^@]+}}@_GLOBAL__sub_I_task_codegen.cpp
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK3-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK3-NEXT: entry:
|
|
|
|
// CHECK3-NEXT: call void @__cxx_global_var_init()
|
|
|
|
// CHECK3-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@main
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-SAME: () #[[ATTR0:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[B:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK4-NEXT: [[S:%.*]] = alloca [2 x %struct.S], align 4
|
|
|
|
// CHECK4-NEXT: [[SAVED_STACK:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__VLA_EXPR0:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON:%.*]], align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED2:%.*]] = alloca [[STRUCT_ANON_0:%.*]], align 8
|
|
|
|
// CHECK4-NEXT: [[DOTDEP_ARR_ADDR:%.*]] = alloca [4 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK4-NEXT: [[DEP_COUNTER_ADDR:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED6:%.*]] = alloca [[STRUCT_ANON_2:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED9:%.*]] = alloca [[STRUCT_ANON_4:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: [[DOTDEP_ARR_ADDR11:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK4-NEXT: [[DEP_COUNTER_ADDR17:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED19:%.*]] = alloca [[STRUCT_ANON_6:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: [[DOTDEP_ARR_ADDR21:%.*]] = alloca [2 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK4-NEXT: [[DEP_COUNTER_ADDR27:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED29:%.*]] = alloca [[STRUCT_ANON_8:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: [[DOTDEP_ARR_ADDR31:%.*]] = alloca [3 x %struct.kmp_depend_info], align 8
|
|
|
|
// CHECK4-NEXT: [[DEP_COUNTER_ADDR37:%.*]] = alloca i64, align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED39:%.*]] = alloca [[STRUCT_ANON_10:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: [[FLAG:%.*]] = alloca i8, align 1
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED42:%.*]] = alloca [[STRUCT_ANON_12:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: [[C:%.*]] = alloca i32, align 128
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED45:%.*]] = alloca [[STRUCT_ANON_14:%.*]], align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED48:%.*]] = alloca [[STRUCT_ANON_16:%.*]], align 1
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK4-NEXT: [[ARRAY_BEGIN:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[ARRAYCTOR_END:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAY_BEGIN]], i64 2
|
|
|
|
// CHECK4-NEXT: br label [[ARRAYCTOR_LOOP:%.*]]
|
|
|
|
// CHECK4: arrayctor.loop:
|
|
|
|
// CHECK4-NEXT: [[ARRAYCTOR_CUR:%.*]] = phi %struct.S* [ [[ARRAY_BEGIN]], [[ENTRY:%.*]] ], [ [[ARRAYCTOR_NEXT:%.*]], [[ARRAYCTOR_LOOP]] ]
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[ARRAYCTOR_CUR]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[ARRAYCTOR_NEXT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYCTOR_CUR]], i64 1
|
|
|
|
// CHECK4-NEXT: [[ARRAYCTOR_DONE:%.*]] = icmp eq %struct.S* [[ARRAYCTOR_NEXT]], [[ARRAYCTOR_END]]
|
|
|
|
// CHECK4-NEXT: br i1 [[ARRAYCTOR_DONE]], label [[ARRAYCTOR_CONT:%.*]], label [[ARRAYCTOR_LOOP]]
|
|
|
|
// CHECK4: arrayctor.cont:
|
|
|
|
// CHECK4-NEXT: [[TMP0:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = call i8* @llvm.stacksave()
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP2]], i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = mul nuw i64 10, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[VLA:%.*]] = alloca i32, i64 [[TMP3]], align 16
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP1]], i64* [[__VLA_EXPR0]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i8* [[B]], i8** [[TMP4]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[AGG_CAPTURED]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP5]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK4-NEXT: [[CONV:%.*]] = sext i8 [[TMP6]] to i32
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB3:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1:[0-9]+]], i32 [[OMP_GLOBAL_THREAD_NUM]], i32 33, i64 40, i64 16, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.kmp_task_t_with_privates*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP8]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP9]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = load i8*, i8** [[TMP10]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = bitcast %struct.anon* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK4-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP11]], i8* align 8 [[TMP12]], i64 16, i1 false)
|
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP9]], i32 0, i32 4
|
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = bitcast %union.kmp_cmplrdata_t* [[TMP13]] to i32*
|
|
|
|
// CHECK4-NEXT: store i32 [[CONV]], i32* [[TMP14]], align 8
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM1:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB3]])
|
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM1]], i8* [[TMP7]])
|
|
|
|
// CHECK4-NEXT: [[TMP16:%.*]] = getelementptr inbounds [[STRUCT_ANON_0]], %struct.anon.0* [[AGG_CAPTURED2]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store [2 x %struct.S]* [[S]], [2 x %struct.S]** [[TMP16]], align 8
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM3:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB5:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP17:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM3]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP18:%.*]] = bitcast i8* [[TMP17]] to %struct.kmp_task_t_with_privates.1*
|
|
|
|
// CHECK4-NEXT: [[TMP19:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP18]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP20:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP19]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP21:%.*]] = load i8*, i8** [[TMP20]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP22:%.*]] = bitcast %struct.anon.0* [[AGG_CAPTURED2]] to i8*
|
|
|
|
// CHECK4-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP21]], i8* align 8 [[TMP22]], i64 8, i1 false)
|
|
|
|
// CHECK4-NEXT: [[TMP23:%.*]] = getelementptr inbounds [4 x %struct.kmp_depend_info], [4 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP24:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO:%.*]], %struct.kmp_depend_info* [[TMP23]], i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP25:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP25]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP26:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 4, i64* [[TMP26]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP27:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP24]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 1, i8* [[TMP27]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP28:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP23]], i64 1
|
|
|
|
// CHECK4-NEXT: [[TMP29:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP28]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP30:%.*]] = ptrtoint i8* [[B]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP30]], i64* [[TMP29]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP31:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP28]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 1, i64* [[TMP31]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP32:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP28]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 1, i8* [[TMP32]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP33:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP23]], i64 2
|
|
|
|
// CHECK4-NEXT: [[TMP34:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP33]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP35:%.*]] = ptrtoint [2 x %struct.S]* [[S]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP35]], i64* [[TMP34]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP36:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP33]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 8, i64* [[TMP36]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP37:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP33]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 1, i8* [[TMP37]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP38:%.*]] = mul nsw i64 0, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP38]]
|
|
|
|
// CHECK4-NEXT: [[TMP39:%.*]] = mul nsw i64 9, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP39]]
|
|
|
|
// CHECK4-NEXT: [[TMP40:%.*]] = getelementptr i32, i32* [[ARRAYIDX4]], i32 1
|
|
|
|
// CHECK4-NEXT: [[TMP41:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP42:%.*]] = ptrtoint i32* [[TMP40]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP43:%.*]] = sub nuw i64 [[TMP42]], [[TMP41]]
|
|
|
|
// CHECK4-NEXT: [[TMP44:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP23]], i64 3
|
|
|
|
// CHECK4-NEXT: [[TMP45:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP44]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP46:%.*]] = ptrtoint i32* [[ARRAYIDX]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP46]], i64* [[TMP45]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP47:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP44]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP43]], i64* [[TMP47]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP48:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP44]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 1, i8* [[TMP48]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i64 4, i64* [[DEP_COUNTER_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP49:%.*]] = bitcast %struct.kmp_depend_info* [[TMP23]] to i8*
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM5:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB5]])
|
|
|
|
// CHECK4-NEXT: [[TMP50:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM5]], i8* [[TMP17]], i32 4, i8* [[TMP49]], i32 0, i8* null)
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM7:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB7:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP51:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM7]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.3*)* @.omp_task_entry..4 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP52:%.*]] = bitcast i8* [[TMP51]] to %struct.kmp_task_t_with_privates.3*
|
|
|
|
// CHECK4-NEXT: [[TMP53:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP52]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM8:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB7]])
|
|
|
|
// CHECK4-NEXT: [[TMP54:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP53]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[TMP54]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP55:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM8]], i8* [[TMP51]])
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM10:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB9:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP56:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM10]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.5*)* @.omp_task_entry..6 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP57:%.*]] = bitcast i8* [[TMP56]] to %struct.kmp_task_t_with_privates.5*
|
|
|
|
// CHECK4-NEXT: [[TMP58:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP57]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP59:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR11]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX12:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP60:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP59]], i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP61:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP62:%.*]] = ptrtoint %struct.S* [[ARRAYIDX12]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP62]], i64* [[TMP61]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP63:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 4, i64* [[TMP63]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP64:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP60]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 3, i8* [[TMP64]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP65:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK4-NEXT: [[TMP66:%.*]] = sext i8 [[TMP65]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP67:%.*]] = mul nsw i64 4, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX13:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP67]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX14:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX13]], i64 [[TMP66]]
|
|
|
|
// CHECK4-NEXT: [[TMP68:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK4-NEXT: [[TMP69:%.*]] = sext i8 [[TMP68]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP70:%.*]] = mul nsw i64 9, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP70]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX16:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX15]], i64 [[TMP69]]
|
|
|
|
// CHECK4-NEXT: [[TMP71:%.*]] = getelementptr i32, i32* [[ARRAYIDX16]], i32 1
|
|
|
|
// CHECK4-NEXT: [[TMP72:%.*]] = ptrtoint i32* [[ARRAYIDX14]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP73:%.*]] = ptrtoint i32* [[TMP71]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP74:%.*]] = sub nuw i64 [[TMP73]], [[TMP72]]
|
|
|
|
// CHECK4-NEXT: [[TMP75:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP59]], i64 1
|
|
|
|
// CHECK4-NEXT: [[TMP76:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP75]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP77:%.*]] = ptrtoint i32* [[ARRAYIDX14]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP77]], i64* [[TMP76]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP78:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP75]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP74]], i64* [[TMP78]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP79:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP75]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 3, i8* [[TMP79]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR17]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP80:%.*]] = bitcast %struct.kmp_depend_info* [[TMP59]] to i8*
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM18:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB9]])
|
|
|
|
// CHECK4-NEXT: [[TMP81:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP58]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[TMP81]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP82:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM18]], i8* [[TMP56]], i32 2, i8* [[TMP80]], i32 0, i8* null)
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM20:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB11:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP83:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM20]], i32 0, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.7*)* @.omp_task_entry..8 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP84:%.*]] = bitcast i8* [[TMP83]] to %struct.kmp_task_t_with_privates.7*
|
|
|
|
// CHECK4-NEXT: [[TMP85:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP84]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP86:%.*]] = getelementptr inbounds [2 x %struct.kmp_depend_info], [2 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR21]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX22:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP87:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP86]], i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP88:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP89:%.*]] = ptrtoint %struct.S* [[ARRAYIDX22]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP89]], i64* [[TMP88]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP90:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 4, i64* [[TMP90]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP91:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP87]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 4, i8* [[TMP91]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP92:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK4-NEXT: [[TMP93:%.*]] = sext i8 [[TMP92]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP94:%.*]] = mul nsw i64 4, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX23:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP94]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX24:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX23]], i64 [[TMP93]]
|
|
|
|
// CHECK4-NEXT: [[TMP95:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK4-NEXT: [[TMP96:%.*]] = sext i8 [[TMP95]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP97:%.*]] = mul nsw i64 9, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX25:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP97]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX26:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX25]], i64 [[TMP96]]
|
|
|
|
// CHECK4-NEXT: [[TMP98:%.*]] = getelementptr i32, i32* [[ARRAYIDX26]], i32 1
|
|
|
|
// CHECK4-NEXT: [[TMP99:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP100:%.*]] = ptrtoint i32* [[TMP98]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP101:%.*]] = sub nuw i64 [[TMP100]], [[TMP99]]
|
|
|
|
// CHECK4-NEXT: [[TMP102:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP86]], i64 1
|
|
|
|
// CHECK4-NEXT: [[TMP103:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP102]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP104:%.*]] = ptrtoint i32* [[ARRAYIDX24]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP104]], i64* [[TMP103]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP105:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP102]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP101]], i64* [[TMP105]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP106:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP102]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 4, i8* [[TMP106]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i64 2, i64* [[DEP_COUNTER_ADDR27]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP107:%.*]] = bitcast %struct.kmp_depend_info* [[TMP86]] to i8*
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM28:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB11]])
|
|
|
|
// CHECK4-NEXT: [[TMP108:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP85]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[TMP108]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP109:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM28]], i8* [[TMP83]], i32 2, i8* [[TMP107]], i32 0, i8* null)
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM30:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB13:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP110:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM30]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.9*)* @.omp_task_entry..10 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP111:%.*]] = bitcast i8* [[TMP110]] to %struct.kmp_task_t_with_privates.9*
|
|
|
|
// CHECK4-NEXT: [[TMP112:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP111]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP113:%.*]] = getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x %struct.kmp_depend_info]* [[DOTDEP_ARR_ADDR31]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP114:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP113]], i64 0
|
|
|
|
// CHECK4-NEXT: [[TMP115:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i64 ptrtoint (i32* @a to i64), i64* [[TMP115]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP116:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 4, i64* [[TMP116]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP117:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP114]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 3, i8* [[TMP117]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[ARRAYIDX32:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i64 0, i64 1
|
|
|
|
// CHECK4-NEXT: [[TMP118:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP113]], i64 1
|
|
|
|
// CHECK4-NEXT: [[TMP119:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP118]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP120:%.*]] = ptrtoint %struct.S* [[ARRAYIDX32]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP120]], i64* [[TMP119]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP121:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP118]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 4, i64* [[TMP121]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP122:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP118]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 3, i8* [[TMP122]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP123:%.*]] = mul nsw i64 0, [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX33:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP123]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX34:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX33]], i64 3
|
|
|
|
// CHECK4-NEXT: [[TMP124:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[TMP125:%.*]] = sext i32 [[TMP124]] to i64
|
|
|
|
// CHECK4-NEXT: [[LEN_SUB_1:%.*]] = sub nsw i64 [[TMP125]], 1
|
|
|
|
// CHECK4-NEXT: [[TMP126:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[TMP127:%.*]] = sext i32 [[TMP126]] to i64
|
|
|
|
// CHECK4-NEXT: [[LB_ADD_LEN:%.*]] = add nsw i64 -1, [[TMP127]]
|
|
|
|
// CHECK4-NEXT: [[TMP128:%.*]] = mul nsw i64 [[LB_ADD_LEN]], [[TMP1]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX35:%.*]] = getelementptr inbounds i32, i32* [[VLA]], i64 [[TMP128]]
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX36:%.*]] = getelementptr inbounds i32, i32* [[ARRAYIDX35]], i64 [[LEN_SUB_1]]
|
|
|
|
// CHECK4-NEXT: [[TMP129:%.*]] = getelementptr i32, i32* [[ARRAYIDX36]], i32 1
|
|
|
|
// CHECK4-NEXT: [[TMP130:%.*]] = ptrtoint i32* [[ARRAYIDX34]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP131:%.*]] = ptrtoint i32* [[TMP129]] to i64
|
|
|
|
// CHECK4-NEXT: [[TMP132:%.*]] = sub nuw i64 [[TMP131]], [[TMP130]]
|
|
|
|
// CHECK4-NEXT: [[TMP133:%.*]] = getelementptr [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP113]], i64 2
|
|
|
|
// CHECK4-NEXT: [[TMP134:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP133]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP135:%.*]] = ptrtoint i32* [[ARRAYIDX34]] to i64
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP135]], i64* [[TMP134]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP136:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP133]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: store i64 [[TMP132]], i64* [[TMP136]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP137:%.*]] = getelementptr inbounds [[STRUCT_KMP_DEPEND_INFO]], %struct.kmp_depend_info* [[TMP133]], i32 0, i32 2
|
2021-06-09 22:35:50 +08:00
|
|
|
// CHECK4-NEXT: store i8 3, i8* [[TMP137]], align 8
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i64 3, i64* [[DEP_COUNTER_ADDR37]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP138:%.*]] = bitcast %struct.kmp_depend_info* [[TMP113]] to i8*
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM38:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB13]])
|
|
|
|
// CHECK4-NEXT: [[TMP139:%.*]] = call i32 @__kmpc_omp_task_with_deps(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM38]], i8* [[TMP110]], i32 3, i8* [[TMP138]], i32 0, i8* null)
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM40:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB15:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP140:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM40]], i32 3, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.11*)* @.omp_task_entry..12 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP141:%.*]] = bitcast i8* [[TMP140]] to %struct.kmp_task_t_with_privates.11*
|
|
|
|
// CHECK4-NEXT: [[TMP142:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP141]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM41:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB15]])
|
|
|
|
// CHECK4-NEXT: [[TMP143:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM41]], i8* [[TMP140]])
|
|
|
|
// CHECK4-NEXT: store i8 0, i8* [[FLAG]], align 1
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM43:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB17:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP144:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM43]], i32 1, i64 40, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.13*)* @.omp_task_entry..14 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP145:%.*]] = bitcast i8* [[TMP144]] to %struct.kmp_task_t_with_privates.13*
|
|
|
|
// CHECK4-NEXT: [[TMP146:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP145]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM44:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB17]])
|
|
|
|
// CHECK4-NEXT: [[TMP147:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM44]], i8* [[TMP144]])
|
|
|
|
// CHECK4-NEXT: [[TMP148:%.*]] = getelementptr inbounds [[STRUCT_ANON_14]], %struct.anon.14* [[AGG_CAPTURED45]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i32* [[C]], i32** [[TMP148]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP149:%.*]] = load i8, i8* [[B]], align 1
|
|
|
|
// CHECK4-NEXT: [[TOBOOL:%.*]] = icmp ne i8 [[TMP149]], 0
|
|
|
|
// CHECK4-NEXT: [[TMP150:%.*]] = select i1 [[TOBOOL]], i32 2, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP151:%.*]] = or i32 [[TMP150]], 1
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM46:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB19:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP152:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM46]], i32 [[TMP151]], i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.15*)* @.omp_task_entry..16 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP153:%.*]] = bitcast i8* [[TMP152]] to %struct.kmp_task_t_with_privates.15*
|
|
|
|
// CHECK4-NEXT: [[TMP154:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP153]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP155:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP154]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP156:%.*]] = load i8*, i8** [[TMP155]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP157:%.*]] = bitcast %struct.anon.14* [[AGG_CAPTURED45]] to i8*
|
|
|
|
// CHECK4-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP156]], i8* align 8 [[TMP157]], i64 8, i1 false)
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM47:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB19]])
|
|
|
|
// CHECK4-NEXT: [[TMP158:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM47]], i8* [[TMP152]])
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM49:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP159:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM49]], i32 0, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.19*)* @.omp_task_entry..21 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP160:%.*]] = bitcast i8* [[TMP159]] to %struct.kmp_task_t_with_privates.19*
|
|
|
|
// CHECK4-NEXT: [[TMP161:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP160]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP162:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP160]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP163:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP162]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP164:%.*]] = load i32, i32* [[C]], align 128
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP164]], i32* [[TMP163]], align 128
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM50:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]])
|
|
|
|
// CHECK4-NEXT: [[TMP165:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP161]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[TMP165]], align 16
|
|
|
|
// CHECK4-NEXT: [[TMP166:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM50]], i8* [[TMP159]])
|
|
|
|
// CHECK4-NEXT: [[TMP167:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP167]], i32* [[RETVAL]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP168:%.*]] = load i8*, i8** [[SAVED_STACK]], align 8
|
|
|
|
// CHECK4-NEXT: call void @llvm.stackrestore(i8* [[TMP168]])
|
|
|
|
// CHECK4-NEXT: [[ARRAY_BEGIN51:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[S]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP169:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAY_BEGIN51]], i64 2
|
|
|
|
// CHECK4-NEXT: br label [[ARRAYDESTROY_BODY:%.*]]
|
|
|
|
// CHECK4: arraydestroy.body:
|
|
|
|
// CHECK4-NEXT: [[ARRAYDESTROY_ELEMENTPAST:%.*]] = phi %struct.S* [ [[TMP169]], [[ARRAYCTOR_CONT]] ], [ [[ARRAYDESTROY_ELEMENT:%.*]], [[ARRAYDESTROY_BODY]] ]
|
|
|
|
// CHECK4-NEXT: [[ARRAYDESTROY_ELEMENT]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[ARRAYDESTROY_ELEMENTPAST]], i64 -1
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[ARRAYDESTROY_ELEMENT]]) #[[ATTR4:[0-9]+]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[ARRAYDESTROY_DONE:%.*]] = icmp eq %struct.S* [[ARRAYDESTROY_ELEMENT]], [[ARRAY_BEGIN51]]
|
|
|
|
// CHECK4-NEXT: br i1 [[ARRAYDESTROY_DONE]], label [[ARRAYDESTROY_DONE52:%.*]], label [[ARRAYDESTROY_BODY]]
|
|
|
|
// CHECK4: arraydestroy.done52:
|
|
|
|
// CHECK4-NEXT: [[TMP170:%.*]] = load i32, i32* [[RETVAL]], align 4
|
|
|
|
// CHECK4-NEXT: ret i32 [[TMP170]]
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN1SC1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SC2Ev(%struct.S* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates* noalias noundef [[TMP1:%.*]]) #[[ATTR3:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates* [[TMP1]], %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates*, %struct.kmp_task_t_with_privates** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES:%.*]], %struct.kmp_task_t_with_privates* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META3:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META6:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META8:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META10:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !12
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK4-NEXT: store %struct.anon* [[TMP8]], %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon*, %struct.anon** [[__CONTEXT_ADDR_I]], align 8, !noalias !12
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = load i32, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[CONV_I:%.*]] = trunc i32 [[TMP11]] to i8
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = getelementptr inbounds [[STRUCT_ANON:%.*]], %struct.anon* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = load i8*, i8** [[TMP12]], align 8
|
|
|
|
// CHECK4-NEXT: store i8 [[CONV_I]], i8* [[TMP13]], align 1
|
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_ANON]], %struct.anon* [[TMP10]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP14]], align 8
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP15]], i64 0, i64 0
|
|
|
|
// CHECK4-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..2
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.1* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.0*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.1*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.1* [[TMP1]], %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.1*, %struct.kmp_task_t_with_privates.1** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_1:%.*]], %struct.kmp_task_t_with_privates.1* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.0*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.1* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META13:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META16:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META18:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META20:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !22
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK4-NEXT: store %struct.anon.0* [[TMP8]], %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.0*, %struct.anon.0** [[__CONTEXT_ADDR_I]], align 8, !noalias !22
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 15, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_0:%.*]], %struct.anon.0* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load [2 x %struct.S]*, [2 x %struct.S]** [[TMP11]], align 8
|
|
|
|
// CHECK4-NEXT: [[ARRAYIDX_I:%.*]] = getelementptr inbounds [2 x %struct.S], [2 x %struct.S]* [[TMP12]], i64 0, i64 1
|
|
|
|
// CHECK4-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[ARRAYIDX_I]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i32 10, i32* [[A_I]], align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..4
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.3* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.2*, align 8
|
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.3*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.3* [[TMP1]], %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.3*, %struct.kmp_task_t_with_privates.3** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_3:%.*]], %struct.kmp_task_t_with_privates.3* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.2*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.3* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META23:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META26:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META28:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META30:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !32
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK4-NEXT: store %struct.anon.2* [[TMP8]], %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.2*, %struct.anon.2** [[__CONTEXT_ADDR_I]], align 8, !noalias !32
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK4-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK4-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK4-NEXT: ]
|
|
|
|
// CHECK4: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK4: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[TMP13]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB7]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !32
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP14]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT:%.*]]
|
|
|
|
// CHECK4: .untied.jmp.1.i:
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM2_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: call void @__kmpc_critical(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM2_I]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* @a, align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: call void @__kmpc_end_critical(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM2_I]], [8 x i32]* @.gomp_critical_user_.var) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK4: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !32
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__3_EXIT]]
|
|
|
|
// CHECK4: .omp_outlined..3.exit:
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..6
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.5* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.4*, align 8
|
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.5*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.5* [[TMP1]], %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.5*, %struct.kmp_task_t_with_privates.5** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_5:%.*]], %struct.kmp_task_t_with_privates.5* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.4*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.5* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META33:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META36:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META38:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META40:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !42
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK4-NEXT: store %struct.anon.4* [[TMP8]], %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.4*, %struct.anon.4** [[__CONTEXT_ADDR_I]], align 8, !noalias !42
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK4-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK4-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK4-NEXT: ]
|
|
|
|
// CHECK4: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK4: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[TMP13]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB9]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !42
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP14]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT:%.*]]
|
|
|
|
// CHECK4: .untied.jmp.1.i:
|
|
|
|
// CHECK4-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK4: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !42
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__5_EXIT]]
|
|
|
|
// CHECK4: .omp_outlined..5.exit:
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.7* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.6*, align 8
|
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.7*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.7* [[TMP1]], %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.7*, %struct.kmp_task_t_with_privates.7** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_7:%.*]], %struct.kmp_task_t_with_privates.7* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.6*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.7* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META43:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META46:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META48:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META50:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !52
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK4-NEXT: store %struct.anon.6* [[TMP8]], %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.6*, %struct.anon.6** [[__CONTEXT_ADDR_I]], align 8, !noalias !52
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
|
|
|
|
// CHECK4-NEXT: switch i32 [[TMP12]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK4-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 1, label [[DOTUNTIED_JMP_1_I:%.*]]
|
|
|
|
// CHECK4-NEXT: ]
|
|
|
|
// CHECK4: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK4: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[TMP13]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB11]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !52
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP14]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT:%.*]]
|
|
|
|
// CHECK4: .untied.jmp.1.i:
|
|
|
|
// CHECK4-NEXT: store i32 1, i32* @a, align 4
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK4: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !52
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__7_EXIT]]
|
|
|
|
// CHECK4: .omp_outlined..7.exit:
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..10
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.9* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.9*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.9* [[TMP1]], %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.9*, %struct.kmp_task_t_with_privates.9** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_9:%.*]], %struct.kmp_task_t_with_privates.9* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.8*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.9* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META53:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META56:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META58:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META60:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !62
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK4-NEXT: store %struct.anon.8* [[TMP8]], %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.8*, %struct.anon.8** [[__CONTEXT_ADDR_I]], align 8, !noalias !62
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..12
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.11* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.10*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.11*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.11* [[TMP1]], %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.11*, %struct.kmp_task_t_with_privates.11** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_11:%.*]], %struct.kmp_task_t_with_privates.11* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.10*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.11* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META63:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META66:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META68:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META70:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !72
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK4-NEXT: store %struct.anon.10* [[TMP8]], %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.10*, %struct.anon.10** [[__CONTEXT_ADDR_I]], align 8, !noalias !72
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 2, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..14
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.13* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.12*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.13*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.13* [[TMP1]], %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.13*, %struct.kmp_task_t_with_privates.13** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_13:%.*]], %struct.kmp_task_t_with_privates.13* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.12*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.13* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META73:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META76:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META78:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META80:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !82
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK4-NEXT: store %struct.anon.12* [[TMP8]], %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.12*, %struct.anon.12** [[__CONTEXT_ADDR_I]], align 8, !noalias !82
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 3, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..16
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.15* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.14*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.15*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.15* [[TMP1]], %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.15*, %struct.kmp_task_t_with_privates.15** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_15:%.*]], %struct.kmp_task_t_with_privates.15* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.14*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.15* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META83:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META86:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META88:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META90:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !92
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK4-NEXT: store %struct.anon.14* [[TMP8]], %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.14*, %struct.anon.14** [[__CONTEXT_ADDR_I]], align 8, !noalias !92
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_14:%.*]], %struct.anon.14* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load i32*, i32** [[TMP11]], align 8
|
|
|
|
// CHECK4-NEXT: store i32 5, i32* [[TMP12]], align 128
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_privates_map.
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct..kmp_privates.t* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]]) #[[ATTR7:[0-9]+]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK4-NEXT: store %struct..kmp_privates.t* [[TMP0]], %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK4-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load %struct..kmp_privates.t*, %struct..kmp_privates.t** [[DOTADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP2]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP3]], i32** [[TMP4]], align 8
|
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..19
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.18* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.17*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.18*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.18* [[TMP1]], %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.18*, %struct.kmp_task_t_with_privates.18** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.17*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t* [[TMP9]] to i8*
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.18* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META93:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META96:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META98:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META100:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !102
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t*, i32**)* @.omp_task_privates_map. to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: store %struct.anon.17* [[TMP8]], %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load %struct.anon.17*, %struct.anon.17** [[__CONTEXT_ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !102
|
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !102
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 4, i32* [[TMP16]], align 128
|
|
|
|
// CHECK4-NEXT: store i32 4, i32* @a, align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN1SD1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SD2Ev(%struct.S* noundef [[THIS1]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_privates_map..20
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct..kmp_privates.t.20* noalias noundef [[TMP0:%.*]], i32** noalias noundef [[TMP1:%.*]], %struct.S** noalias noundef [[TMP2:%.*]], %struct.S** noalias noundef [[TMP3:%.*]]) #[[ATTR7]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca %struct..kmp_privates.t.20*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca i32**, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR2:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR3:%.*]] = alloca %struct.S**, align 8
|
|
|
|
// CHECK4-NEXT: store %struct..kmp_privates.t.20* [[TMP0]], %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK4-NEXT: store i32** [[TMP1]], i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S** [[TMP2]], %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S** [[TMP3]], %struct.S*** [[DOTADDR3]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = load %struct..kmp_privates.t.20*, %struct..kmp_privates.t.20** [[DOTADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20:%.*]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = load i32**, i32*** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 1
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = load %struct.S**, %struct.S*** [[DOTADDR2]], align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S* [[TMP7]], %struct.S** [[TMP8]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T_20]], %struct..kmp_privates.t.20* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.S**, %struct.S*** [[DOTADDR3]], align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S* [[TMP9]], %struct.S** [[TMP10]], align 8
|
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..21
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.19* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.16*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTFIRSTPRIV_PTR_ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTLOCAL_PTR_ADDR_I:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTLOCAL_PTR_ADDR1_I:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_SLOT_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[S1_I:%.*]] = alloca [[STRUCT_S:%.*]], align 4
|
|
|
|
// CHECK4-NEXT: [[S2_I:%.*]] = alloca [[STRUCT_S]], align 4
|
|
|
|
// CHECK4-NEXT: [[REF_TMP_I:%.*]] = alloca [[STRUCT_S]], align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.19*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.19* [[TMP1]], %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.19*, %struct.kmp_task_t_with_privates.19** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19:%.*]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 128
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.16*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_19]], %struct.kmp_task_t_with_privates.19* [[TMP3]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = bitcast %struct..kmp_privates.t.20* [[TMP9]] to i8*
|
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = bitcast %struct.kmp_task_t_with_privates.19* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META103:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META106:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META108:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META110:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !112
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP10]], i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* bitcast (void (%struct..kmp_privates.t.20*, i32**, %struct.S**, %struct.S**)* @.omp_task_privates_map..20 to void (i8*, ...)*), void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP11]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: store %struct.anon.16* [[TMP8]], %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load %struct.anon.16*, %struct.anon.16** [[__CONTEXT_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: [[TMP13:%.*]] = load void (i8*, ...)*, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: [[TMP14:%.*]] = load i8*, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP15:%.*]] = bitcast void (i8*, ...)* [[TMP13]] to void (i8*, i32**, %struct.S**, %struct.S**)*
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: call void [[TMP15]](i8* [[TMP14]], i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], %struct.S** [[DOTLOCAL_PTR_ADDR_I]], %struct.S** [[DOTLOCAL_PTR_ADDR1_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP16:%.*]] = load i32*, i32** [[DOTFIRSTPRIV_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: [[TMP17:%.*]] = load %struct.S*, %struct.S** [[DOTLOCAL_PTR_ADDR_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: [[TMP18:%.*]] = load %struct.S*, %struct.S** [[DOTLOCAL_PTR_ADDR1_I]], align 8, !noalias !112
|
|
|
|
// CHECK4-NEXT: [[TMP19:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP20:%.*]] = load i32, i32* [[TMP19]], align 4
|
|
|
|
// CHECK4-NEXT: switch i32 [[TMP20]], label [[DOTUNTIED_DONE__I:%.*]] [
|
|
|
|
// CHECK4-NEXT: i32 0, label [[DOTUNTIED_JMP__I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 1, label [[DOTUNTIED_JMP_2_I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 2, label [[DOTUNTIED_JMP_6_I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 3, label [[DOTUNTIED_JMP_10_I:%.*]]
|
|
|
|
// CHECK4-NEXT: i32 4, label [[DOTUNTIED_JMP_15_I:%.*]]
|
|
|
|
// CHECK4-NEXT: ]
|
|
|
|
// CHECK4: .untied.done..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I:%.*]]
|
|
|
|
// CHECK4: .untied.jmp..i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP21:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 1, i32* [[TMP21]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP22:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP23:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM_I]], i8* [[TMP22]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT:%.*]]
|
|
|
|
// CHECK4: .untied.jmp.2.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[S1_I]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[S2_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[S2_I]], i32 0, i32 0
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[A_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM3_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB23:[0-9]+]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: [[TMP24:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM3_I]], i32 1, i64 256, i64 1, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.18*)* @.omp_task_entry..19 to i32 (i32, i8*)*)) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP25:%.*]] = bitcast i8* [[TMP24]] to %struct.kmp_task_t_with_privates.18*
|
|
|
|
// CHECK4-NEXT: [[TMP26:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18:%.*]], %struct.kmp_task_t_with_privates.18* [[TMP25]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP27:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_18]], %struct.kmp_task_t_with_privates.18* [[TMP25]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP28:%.*]] = getelementptr inbounds [[STRUCT__KMP_PRIVATES_T:%.*]], %struct..kmp_privates.t* [[TMP27]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP29:%.*]] = load i32, i32* [[TMP16]], align 128
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP29]], i32* [[TMP28]], align 128
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM4_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB23]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: [[TMP30:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM4_I]], i8* [[TMP24]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP31:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 2, i32* [[TMP31]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM5_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP32:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP33:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM5_I]], i8* [[TMP32]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK4: .untied.jmp.6.i:
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM8_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: [[TMP34:%.*]] = call i32 @__kmpc_omp_taskyield(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM8_I]], i32 0) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP35:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 3, i32* [[TMP35]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM9_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP36:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP37:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM9_I]], i8* [[TMP36]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK4: .untied.jmp.10.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SC1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP38:%.*]] = bitcast %struct.S* [[S1_I]] to i8*
|
|
|
|
// CHECK4-NEXT: [[TMP39:%.*]] = bitcast %struct.S* [[REF_TMP_I]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[TMP38]], i8* align 4 [[TMP39]], i64 4, i1 false) #[[ATTR4]], !noalias !112
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[REF_TMP_I]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[A12_I:%.*]] = getelementptr inbounds [[STRUCT_S]], %struct.S* [[S2_I]], i32 0, i32 0
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 10, i32* [[A12_I]], align 4, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM13_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: [[TMP40:%.*]] = call i32 @__kmpc_omp_taskwait(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM13_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP41:%.*]] = load i32*, i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: store i32 4, i32* [[TMP41]], align 4
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM14_I:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB21]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[TMP42:%.*]] = load i8*, i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !112
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: [[TMP43:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM14_I]], i8* [[TMP42]]) #[[ATTR4]]
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK4: .untied.jmp.15.i:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[S2_I]]) #[[ATTR4]]
|
|
|
|
// CHECK4-NEXT: call void @_ZN1SD1Ev(%struct.S* noundef [[S1_I]]) #[[ATTR4]]
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[CLEANUP_I]]
|
|
|
|
// CHECK4: cleanup.i:
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: [[CLEANUP_DEST_I:%.*]] = load i32, i32* [[CLEANUP_DEST_SLOT_I]], align 4, !noalias !112
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: br label [[DOTOMP_OUTLINED__17_EXIT]]
|
|
|
|
// CHECK4: .omp_outlined..17.exit:
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN1SC2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[A:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], %struct.S* [[THIS1]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[A]], align 4
|
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN1SD2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S*, align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S* [[THIS]], %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S*, %struct.S** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@__cxx_global_var_init
|
|
|
|
// CHECK4-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
|
|
|
// CHECK4-NEXT: entry:
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN2S1C1Ev(%struct.S1* noundef @s1)
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN2S1C1Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN2S1C2Ev(%struct.S1* noundef [[THIS1]])
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN2S1C2Ev
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S1* noundef [[THIS:%.*]]) unnamed_addr #[[ATTR1]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-NEXT: call void @_ZN2S18taskinitEv(%struct.S1* noundef [[THIS1]])
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_ZN2S18taskinitEv
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (%struct.S1* noundef [[THIS:%.*]]) #[[ATTR8:[0-9]+]] align 2 {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[THIS_ADDR:%.*]] = alloca %struct.S1*, align 8
|
|
|
|
// CHECK4-NEXT: [[AGG_CAPTURED:%.*]] = alloca [[STRUCT_ANON_21:%.*]], align 8
|
|
|
|
// CHECK4-NEXT: store %struct.S1* [[THIS]], %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[THIS1:%.*]] = load %struct.S1*, %struct.S1** [[THIS_ADDR]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ANON_21]], %struct.anon.21* [[AGG_CAPTURED]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store %struct.S1* [[THIS1]], %struct.S1** [[TMP0]], align 8
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB25:[0-9]+]])
|
|
|
|
// CHECK4-NEXT: [[TMP1:%.*]] = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM]], i32 1, i64 40, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.22*)* @.omp_task_entry..23 to i32 (i32, i8*)*))
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = bitcast i8* [[TMP1]] to %struct.kmp_task_t_with_privates.22*
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP2]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = load i8*, i8** [[TMP4]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = bitcast %struct.anon.21* [[AGG_CAPTURED]] to i8*
|
|
|
|
// CHECK4-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[TMP5]], i8* align 8 [[TMP6]], i64 8, i1 false)
|
|
|
|
// CHECK4-NEXT: [[OMP_GLOBAL_THREAD_NUM2:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB25]])
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = call i32 @__kmpc_omp_task(%struct.ident_t* @[[GLOB1]], i32 [[OMP_GLOBAL_THREAD_NUM2]], i8* [[TMP1]])
|
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@.omp_task_entry..23
|
2021-10-15 18:26:07 +08:00
|
|
|
// CHECK4-SAME: (i32 noundef [[TMP0:%.*]], %struct.kmp_task_t_with_privates.22* noalias noundef [[TMP1:%.*]]) #[[ATTR3]] {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: [[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTPART_ID__ADDR_I:%.*]] = alloca i32*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTPRIVATES__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTCOPY_FN__ADDR_I:%.*]] = alloca void (i8*, ...)*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTTASK_T__ADDR_I:%.*]] = alloca i8*, align 8
|
|
|
|
// CHECK4-NEXT: [[__CONTEXT_ADDR_I:%.*]] = alloca %struct.anon.21*, align 8
|
|
|
|
// CHECK4-NEXT: [[DOTADDR:%.*]] = alloca i32, align 4
|
|
|
|
// CHECK4-NEXT: [[DOTADDR1:%.*]] = alloca %struct.kmp_task_t_with_privates.22*, align 8
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP0]], i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: store %struct.kmp_task_t_with_privates.22* [[TMP1]], %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP2:%.*]] = load i32, i32* [[DOTADDR]], align 4
|
|
|
|
// CHECK4-NEXT: [[TMP3:%.*]] = load %struct.kmp_task_t_with_privates.22*, %struct.kmp_task_t_with_privates.22** [[DOTADDR1]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T_WITH_PRIVATES_22:%.*]], %struct.kmp_task_t_with_privates.22* [[TMP3]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T:%.*]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 2
|
|
|
|
// CHECK4-NEXT: [[TMP6:%.*]] = getelementptr inbounds [[STRUCT_KMP_TASK_T]], %struct.kmp_task_t* [[TMP4]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP7:%.*]] = load i8*, i8** [[TMP6]], align 8
|
|
|
|
// CHECK4-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to %struct.anon.21*
|
|
|
|
// CHECK4-NEXT: [[TMP9:%.*]] = bitcast %struct.kmp_task_t_with_privates.22* [[TMP3]] to i8*
|
2021-06-25 02:39:12 +08:00
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META113:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META116:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META118:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: call void @llvm.experimental.noalias.scope.decl(metadata [[META120:![0-9]+]])
|
|
|
|
// CHECK4-NEXT: store i32 [[TMP2]], i32* [[DOTGLOBAL_TID__ADDR_I]], align 4, !noalias !122
|
|
|
|
// CHECK4-NEXT: store i32* [[TMP5]], i32** [[DOTPART_ID__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK4-NEXT: store i8* null, i8** [[DOTPRIVATES__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK4-NEXT: store void (i8*, ...)* null, void (i8*, ...)** [[DOTCOPY_FN__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK4-NEXT: store i8* [[TMP9]], i8** [[DOTTASK_T__ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK4-NEXT: store %struct.anon.21* [[TMP8]], %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
|
|
|
// CHECK4-NEXT: [[TMP10:%.*]] = load %struct.anon.21*, %struct.anon.21** [[__CONTEXT_ADDR_I]], align 8, !noalias !122
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: [[TMP11:%.*]] = getelementptr inbounds [[STRUCT_ANON_21:%.*]], %struct.anon.21* [[TMP10]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: [[TMP12:%.*]] = load %struct.S1*, %struct.S1** [[TMP11]], align 8
|
|
|
|
// CHECK4-NEXT: [[A_I:%.*]] = getelementptr inbounds [[STRUCT_S1:%.*]], %struct.S1* [[TMP12]], i32 0, i32 0
|
|
|
|
// CHECK4-NEXT: store i32 0, i32* [[A_I]], align 4
|
|
|
|
// CHECK4-NEXT: ret i32 0
|
|
|
|
//
|
|
|
|
//
|
|
|
|
// CHECK4-LABEL: define {{[^@]+}}@_GLOBAL__sub_I_task_codegen.cpp
|
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
2021-04-22 13:57:28 +08:00
|
|
|
// CHECK4-SAME: () #[[ATTR7]] section "__TEXT,__StaticInit,regular,pure_instructions" {
|
2021-05-06 06:13:14 +08:00
|
|
|
// CHECK4-NEXT: entry:
|
|
|
|
// CHECK4-NEXT: call void @__cxx_global_var_init()
|
|
|
|
// CHECK4-NEXT: ret void
|
|
|
|
//
|