llvm-project

Commit Graph

Author	SHA1	Message	Date
George Rokos	29d0f00340	[OpenMP] Create COMDAT group for OpenMP offload registration code to avoid multiple copies Thanks to Sergey Dmitriev for submitting the patch. Differential Revision: https://reviews.llvm.org/D33509 llvm-svn: 304056	2017-05-27 03:03:13 +00:00
Alexey Bataev	979966fcd8	[OPENMP] Allow value of thread local variables in target regions. If the variable is marked as TLS variable and target device does not support TLS, the error is emitted for the variable even if it is not used in target regions. Patch fixes this and allows to use the values of the TLS variables in target regions. llvm-svn: 303768	2017-05-24 16:00:02 +00:00
Alexey Bataev	2c84541a21	[OPENMP] Check DSA for variables captured by value. Currently clang checks for default data sharing attributes only for variables captured in OpenMP regions by reference. Patch adds checks for variables captured by value. llvm-svn: 303077	2017-05-15 16:26:15 +00:00
Reid Kleckner	f1deb837ee	Fix bugs checking va_start in lambdas and erroneous contexts Summary: First, getCurFunction looks through blocks and lambdas, which is wrong. Inside a lambda, va_start should refer to the lambda call operator prototype. This fixes PR32737. Second, we shouldn't use any of the getCur* methods, because they look through contexts that we don't want to look through (EnumDecl, CapturedStmtDecl). We can use CurContext directly as the calling context. Finally, this code assumed that CallExprs would never appear outside of code contexts (block, function, obj-c method), which is wrong. Struct member initializers are an easy way to create and parse exprs in a non-code context. Reviewers: rsmith Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D32761 llvm-svn: 302188	2017-05-04 19:51:05 +00:00
Carlo Bertolli	d8844b9d43	[OpenMP] Extended parse for 'always' map modifier https://reviews.llvm.org/D32807 This patch allows the map modifier 'always' to be separated by the map type (to, from, tofrom) only by a whitespace, rather than strictly by a comma as in current trunk. llvm-svn: 302031	2017-05-03 15:28:48 +00:00
Nick Lewycky	e7d6fbdfb7	Remove Sema::CheckForIntOverflow, and instead check all full-expressions. CheckForIntOverflow used to implement a whitelist of top-level expressions to send to the constant expression evaluator, which handled many more expressions than the CheckForIntOverflow whitelist did. llvm-svn: 301742	2017-04-29 09:33:46 +00:00
Alexey Bataev	b435a5ff5e	[OPENMP] Fix failing test. llvm-svn: 301417	2017-04-26 15:30:36 +00:00
Alexey Bataev	4b46539ef3	[OPENMP] Fix handling of OpenMP code during template instantiation. If some function template is instantiated during handling of OpenMP code, currently it may cause crash of compiler because of trying of capturing variables in non-capturing function scopes. Patch fixes this bug. llvm-svn: 301416	2017-04-26 15:06:24 +00:00
Carlo Bertolli	356822fe7b	Minor fix for distribute_parallel_for_num_threads_codegen on AARCH64 llvm-svn: 301348	2017-04-25 18:59:37 +00:00
Carlo Bertolli	b0ff0a69c3	Recommit of [OpenMP] Initial implementation of code generation for pragma 'distribute parallel for' on host https://reviews.llvm.org/D29508 This patch makes the following additions: It abstracts away loop bound generation code from procedures associated with pragma 'for' and loops in general, in such a way that the same procedures can be used for 'distribute parallel for' without the need for a full re-implementation. It implements code generation for 'distribute parallel for' and adds regression tests. It includes tests for clauses. It is important to notice that most of the clauses are implemented as part of existing procedures. For instance, firstprivate is already implemented for 'distribute' and 'for' as separate pragmas. As the implementation of 'distribute parallel for' is based on the same procedures, then we automatically obtain implementation for such clauses without the need to add new code. However, this requires regression tests that verify correctness of produced code. llvm-svn: 301340	2017-04-25 17:52:12 +00:00
Carlo Bertolli	f09daae75d	Revert r301223 llvm-svn: 301233	2017-04-24 19:50:35 +00:00
Carlo Bertolli	4287d65c10	[OpenMP] Initial implementation of code generation for pragma 'distribute parallel for' on host https://reviews.llvm.org/D29508 This patch makes the following additions: 1. It abstracts away loop bound generation code from procedures associated with pragma 'for' and loops in general, in such a way that the same procedures can be used for 'distribute parallel for' without the need for a full re-implementation. 2. It implements code generation for 'distribute parallel for' and adds regression tests. It includes tests for clauses. It is important to notice that most of the clauses are implemented as part of existing procedures. For instance, firstprivate is already implemented for 'distribute' and 'for' as separate pragmas. As the implementation of 'distribute parallel for' is based on the same procedures, then we automatically obtain implementation for such clauses without the need to add new code. However, this requires regression tests that verify correctness of produced code. Looking forward to comments. llvm-svn: 301223	2017-04-24 19:26:11 +00:00
Carlo Bertolli	ffafe10fac	[OpenMP] Prepare sema to support combined constructs with omp distribute and omp for https://reviews.llvm.org/D32237 This patch prepares sema with additional fields to support all those composite and combined constructs of OpenMP that include pragma 'distribute' and 'for', such as 'distribute parallel for'. It also extends the regression tests for 'distribute parallel for' and adds a new one. llvm-svn: 300802	2017-04-20 00:39:39 +00:00
Alexey Bataev	f7ce166220	[OPENMP] Fix for PR32333: Crash in call of outlined Function. If the type of the captured variable is a pointer(s) to variably modified type, this type was not processed correctly. Need to drill into the type, find the innermost variably modified array type and convert it to canonical parameter type. llvm-svn: 299868	2017-04-10 19:16:45 +00:00
Jonas Hahnfeld	bf5061b18c	[test] Unbreak OpenMP/linking.c with arch-specific libdir After rL296927, -rpath gets added after linking the OpenMP runtime. That's why -lgcc does not immediately follow -lomp or -lgomp. llvm-svn: 297264	2017-03-08 09:07:33 +00:00
Jonas Hahnfeld	64a9e3c530	[OpenMP] Generate better diagnostics for cancel and cancellation point checkNestingOfRegions uses CancelRegion to determine whether cancel and cancellation point are valid in the given nesting. This leads to unuseful diagnostics if CancelRegion is invalid. The given test case has produced: region cannot be closely nested inside 'parallel' region As a solution, introduce checkCancelRegion and call it first to get the expected error: one of 'for', 'parallel', 'sections' or 'taskgroup' is expected Differential Revision: https://reviews.llvm.org/D30135 llvm-svn: 295808	2017-02-22 06:49:10 +00:00
Jonas Hahnfeld	b07931f01d	[OpenMP] Fix cancellation point in task with no cancel With tasks, the cancel may happen in another task. This has a different region info which means that we can't find it here. Differential Revision: https://reviews.llvm.org/D30091 llvm-svn: 295474	2017-02-17 18:32:58 +00:00
Jonas Hahnfeld	20fce72f1b	[OpenMP] Remove barriers at cancel and cancellation point This resolves a deadlock with the cancel directive when there is no explicit cancellation point. In that case, the implicit barrier acts as cancellation point. After removing the barrier after cancel, the now unmatched barrier for the explicit cancellation point has to go as well. This has probably worked before rL255992: With the calls for the explicit barrier, it was sure that all threads passed a barrier before exiting. Reported by Simon Convent and Joachim Protze! Differential Revision: https://reviews.llvm.org/D30088 llvm-svn: 295473	2017-02-17 18:32:51 +00:00
Arpith Chacko Jacob	fc711b1f47	[OpenMP] Teams reduction on the NVPTX device. This patch implements codegen for the reduction clause on any teams construct for elementary data types. It builds on parallel reductions on the GPU. Subsequently, the team master writes to a unique location in a global memory scratchpad. The last team to do so loads and reduces this array to calculate the final result. This patch emits two helper functions that are used by the OpenMP runtime on the GPU to perform reductions across teams. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29879 llvm-svn: 295335	2017-02-16 16:48:49 +00:00
Arpith Chacko Jacob	101e8fb1f3	[OpenMP] Parallel reduction on the NVPTX device. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that uses these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295333	2017-02-16 16:20:16 +00:00
Arpith Chacko Jacob	bd6344c0be	Revert r295319 while investigating buildbot failure. llvm-svn: 295323	2017-02-16 14:25:35 +00:00
Arpith Chacko Jacob	8e170fc857	[OpenMP] Parallel reduction on the NVPTX device. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that uses these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295319	2017-02-16 14:03:36 +00:00
Reid Kleckner	9de921470d	[CodeGen] Treat auto-generated __dso_handle symbol as HiddenVisibility Fixes https://bugs.llvm.org/show_bug.cgi?id=31932 Based on a patch by Roland McGrath Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D29843 llvm-svn: 294978	2017-02-13 18:49:21 +00:00
Charles Li	4f80074629	[Lit Test] Make tests C++11 compatible - Parse OpenMP Differential Revision: https://reviews.llvm.org/D29725 llvm-svn: 294504	2017-02-08 19:46:15 +00:00
Carlo Bertolli	d2192d1b63	[OpenMP] Remove fixme comment in regression test and related unnecessary statement https://reviews.llvm.org/D29501 It looks like I forgot to remove a FIXME comment with the associated statement. The test does not need it and it gives the wrong impression of being an incomplete test. llvm-svn: 294195	2017-02-06 16:03:41 +00:00
Carlo Bertolli	7aa20a117d	[OpenMP] Add missing regression test for pragma distribute, clause firstprivate https://reviews.llvm.org/D28243 The regression test was missing from the previous already accepted patch. llvm-svn: 294026	2017-02-03 19:09:16 +00:00
Charles Li	bd1961d2bf	[Lit Test] Make tests C++11 compatible - OpenMP constant expressions C++11 introduced constexpr, hence the change in diagnostics. Differential Revision: https://reviews.llvm.org/D29480 llvm-svn: 294025	2017-02-03 18:58:34 +00:00
Arpith Chacko Jacob	cdda3daa7f	[OpenMP][NVPTX][CUDA] Adding support for printf for an NVPTX OpenMP device. Support for CUDA printf is exploited to support printf for an NVPTX OpenMP device. To reflect the support of both programming models, the file CGCUDABuiltin.cpp has been renamed to CGGPUBuiltin.cpp, and the call EmitCUDADevicePrintfCallExpr has been renamed to EmitGPUDevicePrintfCallExpr. Reviewers: jlebar Differential Revision: https://reviews.llvm.org/D17890 llvm-svn: 293444	2017-01-29 20:49:31 +00:00
Arpith Chacko Jacob	cca61a3a74	[OpenMP] Codegen support for 'target teams' on the NVPTX device. This is a simple patch to teach OpenMP codegen to emit the construct in Generic mode. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29143 llvm-svn: 293183	2017-01-26 15:43:27 +00:00
Arpith Chacko Jacob	2cd6eeabfd	[OpenMP] Support for the proc_bind-clause on 'target parallel' on the NVPTX device. This patch adds support for the proc_bind clause on the Spmd construct 'target parallel' on the NVPTX device. Since the parallel region is created upon kernel launch, this clause can be safely ignored on the NVPTX device at codegen time for level 0 parallelism. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29128 llvm-svn: 293069	2017-01-25 16:55:10 +00:00
Arpith Chacko Jacob	7ecc0b7f3d	[OpenMP] Support for thread_limit-clause on the 'target teams' directive. The thread_limit-clause on the combined directive applies to the 'teams' region of this construct. We modify the ThreadLimitClause class to capture the clause expression within the 'target' region. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29087 llvm-svn: 293049	2017-01-25 11:44:35 +00:00
Arpith Chacko Jacob	bc126344e1	[OpenMP] Support for num_teams-clause on the 'target teams' directive. The num_teams-clause on the combined directive applies to the 'teams' region of this construct. We modify the NumTeamsClause class to capture the clause expression within the 'target' region. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29085 llvm-svn: 293048	2017-01-25 11:28:18 +00:00
Arpith Chacko Jacob	99a1e0eba5	[OpenMP] Codegen support for 'target teams' on the host. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29084 llvm-svn: 293005	2017-01-25 02:18:43 +00:00
Arpith Chacko Jacob	86f9e46365	Reverting commit because an NVPTX patch sneaked in. Break up into two patches. llvm-svn: 293003	2017-01-25 01:45:59 +00:00
Arpith Chacko Jacob	4dbf368e14	[OpenMP] Codegen support for 'target teams' on the host. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29084 llvm-svn: 293001	2017-01-25 01:38:33 +00:00
Arpith Chacko Jacob	e04da5dee2	[OpenMP] Support for the num_threads-clause on 'target parallel' on the NVPTX device. This patch adds support for the Spmd construct 'target parallel' on the NVPTX device. This involves ignoring the num_threads clause on the device since the number of threads in this combined construct is already set on the host through the call to __tgt_target_teams(). Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29083 llvm-svn: 292999	2017-01-25 01:18:34 +00:00
Arpith Chacko Jacob	33c849a007	[OpenMP] Support for the num_threads-clause on 'target parallel'. The num_threads-clause on the combined directive applies to the 'parallel' region of this construct. We modify the NumThreadsClause class to capture the clause expression within the 'target' region. The offload runtime call for 'target parallel' is changed to __tgt_target_teams() with 1 team and the number of threads set by this clause or a default if none. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29082 llvm-svn: 292997	2017-01-25 00:57:16 +00:00
Arpith Chacko Jacob	1f46b70b5d	[OpenMP] DSAChecker bug fix for combined directives. The DSAChecker code in SemaOpenMP looks at the captured statement associated with an OpenMP directive. A combined directive such as 'target parallel' has nested capture statements, which have to be fully traversed before executing the DSAChecker. This is a patch to perform the traversal for such combined directives. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29026 llvm-svn: 292794	2017-01-23 15:38:49 +00:00
Alexey Bataev	880d8605e3	[OPENMP] Fix for PR31643: Clang crashes when compiling code on Windows with SEH and openmp In some cituations (during codegen for Windows SEH constructs) CodeGenFunction instance may have CurFn equal to nullptr. OpenMP related code does not expect such situation during cleanup. llvm-svn: 292590	2017-01-20 08:57:28 +00:00
Arpith Chacko Jacob	fe4890a68b	[OpenMP] Support for the if-clause on the combined directive 'target parallel'. The if-clause on the combined directive potentially applies to both the 'target' and the 'parallel' regions. Codegen'ing the if-clause on the combined directive requires additional support because the expression in the clause must be captured by the 'target' capture statement but not the 'parallel' capture statement. Note that this situation arises for other clauses such as num_threads. The OMPIfClause class inherits OMPClauseWithPreInit to support capturing of expressions in the clause. A member CaptureRegion is added to OMPClauseWithPreInit to indicate which captured statement (in this case 'target' but not 'parallel') captures these expressions. To ensure correct codegen of captured expressions in the presence of combined 'target' directives, OMPParallelScope was added to 'parallel' codegen. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28781 llvm-svn: 292437	2017-01-18 20:40:48 +00:00
Arpith Chacko Jacob	44a87c9f1b	[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device. This patch adds codegen for the 'target parallel' directive on the NVPTX device. We term offload OpenMP directives such as 'target parallel' and 'target teams distribute parallel for' as SPMD constructs. SPMD constructs, in contrast to Generic ones like the plain 'target', can never contain a serial region. SPMD constructs can be handled more efficiently on the GPU and do not require the Warp Loop of the Generic codegen scheme. This patch adds SPMD codegen support for 'target parallel' on the NVPTX device and can be reused for other SPMD constructs. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28755 llvm-svn: 292428	2017-01-18 19:35:00 +00:00
Arpith Chacko Jacob	19b911cb75	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292419	2017-01-18 18:18:53 +00:00
Arpith Chacko Jacob	42793e000a	Revert r292374 to debug Windows buildbot failure. llvm-svn: 292400	2017-01-18 15:36:05 +00:00
Arpith Chacko Jacob	68019578a3	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292374	2017-01-18 15:14:52 +00:00
Richard Smith	fbe2369f1a	Improve handling of instantiated thread_local variables in Itanium C++ ABI. * Do not initialize these variables when initializing the rest of the thread_locals in the TU; they have unordered initialization so they can be initialized by themselves. This fixes a rejects-valid bug: we would make the per-variable initializer function internal, but put it in a comdat keyed off the variable, resulting in link errors when the comdat is selected from a different TU (as the per TU TLS init function tries to call an init function that does not exist). * On Darwin, when we decide that we're not going to emit a thread wrapper function at all, demote its linkage to External. Fixes a verifier failure on explicit instantiation of a thread_local variable on Darwin. llvm-svn: 291865	2017-01-13 00:43:31 +00:00
Kelvin Li	da68118729	[OpenMP] Sema and parsing for 'target teams distribute simd’ pragma This patch is to implement sema and parsing for 'target teams distribute simd’ pragma. Differential Revision: https://reviews.llvm.org/D28252 llvm-svn: 291579	2017-01-10 18:08:18 +00:00
Arpith Chacko Jacob	bb36fe8dba	[OpenMP] Basic support for a parallel directive in a target region on an NVPTX device Summary: This patch introduces support for the execution of parallel constructs in a target region on the NVPTX device. Parallel regions must be in the lexical scope of the target directive. The master thread in the master warp signals parallel work for worker threads in worker warps on encountering a parallel region. Note: The patch does not yet support capture of arguments in a parallel region so the test cases are simple. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28145 llvm-svn: 291565	2017-01-10 15:42:51 +00:00
Kelvin Li	4101032e5f	[OpenMP] Support the 'is_device_ptr' clause with 'target parallel for' pragma This patch is to add support of the 'is_device_ptr' clause with the 'target parallel for' pragma. Differential Revision: https://reviews.llvm.org/D28255 llvm-svn: 291540	2017-01-10 05:15:35 +00:00
Kelvin Li	c4bfc6fa8f	[OpenMP] Support the 'is_device_ptr' clause with 'target parallel for simd' pragma This patch is to add support of the 'is_device_ptr' clause with the 'target parallel for simd' pragma. Differential Revision: https://reviews.llvm.org/D28402 llvm-svn: 291537	2017-01-10 04:26:44 +00:00
Douglas Yung	46d134c4ab	Fixing test to work when the compiler defaults to a different C++ standard version. Differential Revision: https://reviews.llvm.org/D28418 llvm-svn: 291491	2017-01-09 22:20:10 +00:00

1 2 3 4 5 ...

661 Commits