directives.
'kmp_task_t' record type added a new field for 'priority' clause and
changed the representation of pointer to destructors for privates used
within loop-based directives.
Old representation:
typedef struct kmp_task { /* GEH: Shouldn't this be
aligned somehow? */
void *shareds; /**< pointer to block of
pointers to shared vars */
kmp_routine_entry_t routine; /**< pointer to routine
to call for executing task */
kmp_int32 part_id; /**< part id for the
task */
kmp_routine_entry_t destructors; /* pointer to function to
invoke deconstructors of firstprivate C++ objects */
/* private vars */
} kmp_task_t;
New representation:
typedef struct kmp_task { /* GEH: Shouldn't this be
aligned somehow? */
void *shareds; /**< pointer to block of
pointers to shared vars */
kmp_routine_entry_t routine; /**< pointer to routine
to call for executing task */
kmp_int32 part_id; /**< part id for the
task */
kmp_cmplrdata_t data1; /* Two known
optional additions: destructors and priority */
kmp_cmplrdata_t data2; /* Process
destructors first, priority second */
/* future data */
/* private vars */
} kmp_task_t;
Also excessive initialization of 'destructors' fields to 'null' was
removed from codegen if it is known that no destructors shal be used.
Currently a special bit is used in 'kmp_tasking_flags_t' bitfields
('destructors_thunk' bitfield).
llvm-svn: 271201
Summary: This patch implements the code generation for the `target update` directive. The implemntation relies on the logic already in place for target data standalone directives, i.e. target enter/exit data.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D20650
llvm-svn: 270886
Summary:
The patch contains the parsing and sema support for the `from` clause.
Patch based on the original post by Kelvin Li.
Reviewers: hfinkel, carlo.bertolli, kkwli0, arpith-jacob, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D18488
llvm-svn: 270882
Summary:
The patch contains the parsing and sema support for the `to` clause.
Patch based on the original post by Kelvin Li.
Reviewers: carlo.bertolli, hfinkel, kkwli0, arpith-jacob, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D18597
llvm-svn: 270880
Summary:
This patch is to add parsing and sema support for `target update` directive. Support for the `to` and `from` clauses will be added by a different patch. This patch also adds support for other clauses that are already implemented upstream and apply to `target update`, e.g. `device` and `if`.
This patch is based on the original post by Kelvin Li.
Reviewers: hfinkel, carlo.bertolli, kkwli0, arpith-jacob, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D15944
llvm-svn: 270878
Getting accurate locations for loops is important, because those locations are
used by the frontend to generate optimization remarks. Currently, optimization
remarks for loops often appear on the wrong line, often the first line of the
loop body instead of the loop itself. This is confusing because that line might
itself be another loop, or might be somewhere else completely if the body was
an inlined function call. This happens because of the way we find the loop's
starting location. First, we look for a preheader, and if we find one, and its
terminator has a debug location, then we use that. Otherwise, we look for a
location on an instruction in the loop header.
The fallback heuristic is not bad, but will almost always find the beginning of
the body, and not the loop statement itself. The preheader location search
often fails because there's often not a preheader, and even when there is a
preheader, depending on how it was formed, it sometimes carries the location of
some preceeding code.
I don't see any good theoretical way to fix this problem. On the other hand,
this seems like a straightforward solution: Put the debug location in the
loop's llvm.loop metadata. When emitting debug information, this commit causes
us to add the debug location as an operand to each loop's llvm.loop metadata.
Thus, we now generate this metadata for all loops (not just loops with
optimization hints) when we're otherwise generating debug information.
The remark test case changes depend on the companion LLVM commit r270771.
llvm-svn: 270772
directives.
If firstprivate variable is is captured by value in outlined region and then used as firstprivate variable in inner worksharing directive, the copy for this firstprivate variable was not created. Fixed this bug.
llvm-svn: 270536
For better performance and to unify code with offloading part we pass
scalar firstprivate values by value, instead of by reference. It will
remove some extra copying operations.
llvm-svn: 269751
directives.
OpenMP 4.5 supports clause 'priority' in task-based directives. Patch
adds initial codegen support for this clause in codegen.
llvm-svn: 269050
schedule modifiers.
Runtime library expects some additional data in schedule argument for
loop-based directives, that have additional schedule modifiers
'monotonic|nonmonotonic'.
llvm-svn: 269035
OpenMP 4.5 adds taskloop/taskloop simd directives. These directives
allow to use lastprivate clause. Patch adds codegen for this clause.
llvm-svn: 268618
OpenMP 4.5 defines 'taskloop simd' directive, which is combined
directive for 'taskloop' and 'simd' directives. Patch adds initial
codegen support for this directive and its 2 basic clauses 'safelen' and
'simdlen'.
llvm-svn: 267872
directive.
OpenMP 4.5 defines 'taskloop' directive and 2 additional clauses
'grainsize' and 'num_tasks' for this directive. Patch adds codegen for
these clauses.
These clauses are generated as arguments of the '__kmpc_taskloop'
libcall and are encoded the following way:
void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int sched, kmp_uint64 grainsize, void *task_dup);
If 'grainsize' is specified, 'sched' argument must be set to '1' and
'grainsize' argument must be set to the value of the 'grainsize' clause.
If 'num_tasks' is specified, 'sched' argument must be set to '2' and
'grainsize' argument must be set to the value of the 'num_tasks' clause.
It is possible because these 2 clauses are mutually exclusive and can't
be used at the same time on the same directive.
If none of these clauses is specified, 'sched' argument must be set to
'0'.
llvm-svn: 267862
Summary:
This patch adds support for the target exit data directive code generation.
Given that, apart from the employed runtime call, target exit data requires the same code generation pattern as target enter data, the OpenMP codegen entry point was renamed and reused for both.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17369
llvm-svn: 267814
Summary: This patch adds support for the target enter data directive code generation.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17368
llvm-svn: 267812
Summary:
This patch adds support for the target data directive code generation.
Part of the already existent functionality related with data maps is moved to a new function so that it could be reused.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17367
llvm-svn: 267811
declare reductions.
If reduction clause is applied to instance of class with user-defined
reduction operation without initialization clause, it may cause a crash.
Patch fixes this issue.
llvm-svn: 267695
Currently there is a problem with codegen of inlined directives inside
lambdas, it may cause a crash during codegen because of incorrect
capturing of variables. Patch fixes this problem.
llvm-svn: 267677
The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed.
The next code will be generated for the taskloop directive:
#pragma omp taskloop num_tasks(N) lastprivate(j)
for( i=0; i<N*GRAIN*STRIDE-1; i+=STRIDE ) {
int th = omp_get_thread_num();
#pragma omp atomic
counter++;
#pragma omp atomic
th_counter[th]++;
j = i;
}
Generated code:
task = __kmpc_omp_task_alloc(NULL,gtid,1,sizeof(struct
task),sizeof(struct shar),&task_entry);
psh = task->shareds;
psh->pth_counter = &th_counter;
psh->pcounter = &counter;
psh->pj = &j;
task->lb = 0;
task->ub = N*GRAIN*STRIDE-2;
task->st = STRIDE;
__kmpc_taskloop(
NULL, // location
gtid, // gtid
task, // task structure
1, // if clause value
&task->lb, // lower bound
&task->ub, // upper bound
STRIDE, // loop increment
0, // 1 if nogroup specified
2, // schedule type: 0-none, 1-grainsize, 2-num_tasks
N, // schedule value (ignored for type 0)
(void*)&__task_dup_entry // tasks duplication routine
);
llvm-svn: 267395
causes code generation failure.
The codegen part of firstprivate clause for member decls used type of
original variable without skipping reference type from
OMPCapturedExprDecl. Patch fixes this problem.
llvm-svn: 267125
If loop control variable for simd-based directives is explicitly marked
as linear/lastprivate in clauses, codegen for such construct would
crash. Patch fixes this problem.
llvm-svn: 267101
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266853
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266754
If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks.
llvm-svn: 266722
OpenMP 4.0 defines clause 'uniform' in 'declare simd' directive:
'uniform' '(' <argument-list> ')'
The uniform clause declares one or more arguments to have an invariant value for all concurrent invocations of the function in the execution of a single SIMD loop.
The special this pointer can be used as if was one of the arguments to the function in any of the linear, aligned, or uniform clauses.
llvm-svn: 266041
Summary: See LLVM change D18775 for details, this change depends on it.
Reviewers: jyknight, reames
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18776
llvm-svn: 265569
This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it:
Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region.
Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime.
Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region.
http://reviews.llvm.org/D17963
llvm-svn: 265304
For better support of some specific GNU extensions some extra
transformation of AST nodes were introduced. These transformations are
very hard to handle. The code is improved in handling of these
extensions by using captured expressions construct.
llvm-svn: 264709
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
(required) code to support target specific codegen.
llvm-svn: 264700
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
(required) code to support target specific codegen.
llvm-svn: 264576
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
(required) code to support target specific codegen.
llvm-svn: 264569
OpenMP 4.0 allows to define custom reduction operations using '#pragma
omp declare reduction' construct. Patch allows to use this custom
defined reduction operations in 'reduction' clauses.
llvm-svn: 263701
OpenMP 4.5 allows privatization of non-static data members in OpenMP
constructs. Patch adds proper codegen support for data members in
'linear' clause
llvm-svn: 263003
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.
http://reviews.llvm.org/D17170
llvm-svn: 262832
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.
http://reviews.llvm.org/D17170
llvm-svn: 262741
Add code generation support for firstprivate and private clauses of teams on the host. Add extensive regression tests including lambda functions and vla testing.
http://reviews.llvm.org/D17582
llvm-svn: 262663
Summary:
This patch implements the launching of a target region in the presence of a nested teams region, i.e calls tgt_target_teams with the required arguments gathered from the enclosed teams directive.
The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch.
Reviewers: hfinkel, arpith-jacob, kkwli0, carlo.bertolli, ABataev
Subscribers: cfe-commits, caomhin, fraggamuffin
Differential Revision: http://reviews.llvm.org/D17019
llvm-svn: 262625
OpenMP 4.5 allows to privatize data members of current class in member
functions. Patch adds initial support for privatization of data members
in 'linear' clause, no codegen support.
llvm-svn: 262578
OpenMP 4.5 allows to privatize non-static data members of current class
in non-static member functions. Patch supports codegen for non-static
data members in 'reduction' clauses.
llvm-svn: 262460