llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Bataev	dadf2d1238	[OPENMP] Do not crash on codegen for CXX member functions. Non-static member functions should not be emitted as standalone functions; doing so crashes the compiler. llvm-svn: 331206	2018-04-30 18:09:40 +00:00
Alexey Bataev	64e62dcfff	[OPENMP] Do not crash on incorrect input data. Emit error messages instead of compiler crashing when the target region does not exist in the device code + fix crash when the location comes from macros. llvm-svn: 331195	2018-04-30 16:26:57 +00:00
Alexey Bataev	2c1dffece6	[OPENMP] Allow to use declare target variables in map clauses Global variables marked as declare target are allowed to be used in map clauses. Patch fixes the crash of the compiler on the declare target variables in map clauses. llvm-svn: 330156	2018-04-16 20:34:41 +00:00
Alexey Bataev	a4fa0b880a	[OPENMP] General code improvements. llvm-svn: 330140	2018-04-16 17:59:34 +00:00
Alexey Bataev	43a919f667	[OPENMP] Replace push_back by emplace_back, NFC. llvm-svn: 330042	2018-04-13 17:48:43 +00:00
Alexey Bataev	c0f879bcec	[OPENMP] Additional attributes for the pointer parameters. Added attributes for better optimization of the OpenMP code. llvm-svn: 329751	2018-04-10 20:10:53 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Richard Smith	e78fac5126	PR36992: do not store beyond the dsize of a class object unless we know the tail padding is not reused. We track on the AggValueSlot (and through a couple of other initialization actions) whether we're dealing with an object that might share its tail padding with some other object, so that we can avoid emitting stores into the tail padding if that's the case. We still widen stores into tail padding when we can do so. Differential Revision: https://reviews.llvm.org/D45306 llvm-svn: 329342	2018-04-05 20:52:58 +00:00
Alexey Bataev	03f270c900	[OPENMP] Added emission of offloading data sections for declare target variables. Added emission of the offloading data sections for the variables within declare target regions + fixes emission of the declare target variables marked as declare target not within the declare target region. llvm-svn: 328888	2018-03-30 18:31:07 +00:00
Alexey Bataev	34f8a7043b	[OPENMP] Codegen for ctor\|dtor of declare target variables. When the declare target variables are emitted for the device, constructors\|destructors for these variables must be emitted and registered by the runtime in the offloading sections. llvm-svn: 328705	2018-03-28 14:28:54 +00:00
Alexey Bataev	92327c50d3	[OPENMP] Codegen for declare target with link clause. If the link clause is used on the declare target directive, the object should be linked on target or target data directives, not during the codegen. Patch adds support for this clause. llvm-svn: 328544	2018-03-26 16:40:55 +00:00
Alexey Bataev	4f4bf7c348	[OPENMP] Codegen for `omp declare target` construct. Added initial codegen for device side of declarations inside `omp declare target` construct + codegen for implicit `declare target` functions, which are used in the target regions. llvm-svn: 327636	2018-03-15 15:47:20 +00:00
Gheorghe-Teodor Bercea	d3dcf2f05d	[OpenMP] Add OpenMP data sharing infrastructure using global memory Summary: This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure. TODO: add a more detailed description. Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43660 llvm-svn: 327513	2018-03-14 14:17:45 +00:00
Alexey Bataev	21dab12453	[OPENMP] Fix the address of the original variable in task reductions. If initialization of the task reductions requires pointer to original variable, which is stored in the threadprivate storage, we used the address of this pointer instead. llvm-svn: 327136	2018-03-09 15:20:30 +00:00
Alexey Bataev	1c44e15f6d	[OPENMP] Fix generation of the unique names for task reduction variables. If the task has reduction construct and this construct for some variable requires unique threadprivate storage, we may generate different names for variables used in taskgroup task_reduction clause and in task in_reduction clause. Patch fixes this problem. llvm-svn: 326827	2018-03-06 18:59:43 +00:00
Alexey Bataev	20cf67c233	[OPENMP] Scan all redeclarations looking for `declare simd` attribute. Patch fixes the problem with the functions marked as `declare simd`. If the canonical declaration does not have associated `declare simd` construct, we may not generate required code even if other redeclarations are marked as `declare simd`. llvm-svn: 326594	2018-03-02 18:07:00 +00:00
George Burgess IV	00f70bd933	Remove redundant casts. NFC So I wrote a clang-tidy check to lint out redundant `isa`, `cast`, and `dyn_cast`s for fun. This is a portion of what it found for clang; I plan to do similar cleanups in LLVM and other subprojects when I find time. Because of the volume of changes, I explicitly avoided making any change that wasn't highly local and obviously correct to me (e.g. we still have a number of foo(cast<Bar>(baz)) that I didn't touch, since overloading is a thing and the cast<Bar> did actually change the type -- just up the class hierarchy). I also tried to leave the types we were cast<>ing to somewhere nearby, in cases where it wasn't locally obvious what we were dealing with before. llvm-svn: 326416	2018-03-01 05:43:23 +00:00
Rafael Espindola	51ec5a9ce3	Pass a GlobalDecl to SetInternalFunctionAttributes. NFC. This just reduces the noise in a followup patch. Part of D43900. llvm-svn: 326385	2018-02-28 23:46:35 +00:00
Alexey Bataev	7ef47a67a5	[OPENMP] Require valid SourceLocation in function call, NFC. Removed default empty SourceLocation argument from `emitCall` function and require valid location. llvm-svn: 325812	2018-02-22 18:33:31 +00:00
Sander de Smalen	891af03a55	Recommit rL323952: [DebugInfo] Enable debug information for C99 VLA types. Fixed build issue when building with g++-4.8 (specialization after instantiation). llvm-svn: 324173	2018-02-03 13:55:59 +00:00
Sander de Smalen	4e9a1264dd	Reverting patch rL323952 due to build errors that I haven't encountered in local builds. llvm-svn: 323956	2018-02-01 12:27:13 +00:00
Sander de Smalen	17c4633e7f	[DebugInfo] Enable debug information for C99 VLA types Summary: This patch enables debugging of C99 VLA types by generating more precise LLVM Debug metadata, using the extended DISubrange 'count' field that takes a DIVariable. This should implement: Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's) https://bugs.llvm.org/show_bug.cgi?id=30553 Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits Differential Revision: https://reviews.llvm.org/D41698 llvm-svn: 323952	2018-02-01 11:25:10 +00:00
Ivan A. Kosarev	1860b520a2	[CodeGen] Decorate aggregate accesses with TBAA tags Differential Revision: https://reviews.llvm.org/D41539 llvm-svn: 323421	2018-01-25 14:21:55 +00:00
Alexey Bataev	1e49137d34	[OPENMP] Replace call of EmitLoadOfLValue() by EmitLoadOfScalar(), NFC. Replace calls of EmitLoadOfLValue() by EmitLoadOfScalar() functions if it is known that the value is scalar. llvm-svn: 323236	2018-01-23 18:44:14 +00:00
Alexey Bataev	a9b9cc0d79	[OPENMP] Remove more empty SourceLocations() from the code. Removed more empty SourceLocations() from the OpenMP code and replaced with the correct locations for better debug info emission. llvm-svn: 323232	2018-01-23 18:12:38 +00:00
Jonas Hahnfeld	5e4df288e2	[OpenMP] Correct generation of offloading entries Firstly, each offloading entry must have a unique name or the linker will complain if there are multiple files with target regions. Secondly, the compiler must not introduce padding so mark the struct with a PackedAttr. Differential Revision: https://reviews.llvm.org/D42168 llvm-svn: 322858	2018-01-18 15:38:03 +00:00
Alexey Bataev	647dd84422	[OPENMP] Initial codegen for `target teams distribute parallel for simd`. Added host codegen + codegen for devices with default codegen for `#pragma omp target teams distribute parallel for simd` directive. llvm-svn: 322515	2018-01-15 20:59:40 +00:00
Alexey Bataev	8451efad89	[OPENMP] Add codegen for `depend` clauses on `target` directive. Added basic support for codegen of `depend` clauses on `target` directive. llvm-svn: 322501	2018-01-15 19:06:12 +00:00
Alexey Bataev	475a7440f1	[OPENMP] Replace calls of getAssociatedStmt(). getAssociatedStmt() returns the outermost captured statement for the OpenMP directive. It may return incorrect region in case of combined constructs. Reworked the code to reduce the number of calls of getAssociatedStmt() and used getInnermostCapturedStmt() and getCapturedStmt() functions instead. In case of firstprivate variables it may lead to an extra allocas generation for private copies even if the variable is passed by value into outlined function and could be used directly as private copy. llvm-svn: 322393	2018-01-12 19:39:11 +00:00
Rafael Espindola	cbca487f49	Make internal/private GVs implicitly dso_local. While updating clang tests for having clang set dso_local I noticed that: - There are a lot of tests to update. - Many of the updates are redundant. They are redundant because a GV is "obviously dso_local". This patch starts formalizing that a bit by requiring that internal and private GVs be dso_local too. Since they all are, we don't have to print dso_local to the textual representation, making it a bit more compact and easier to read. llvm-svn: 322318	2018-01-11 22:15:12 +00:00
Alexey Bataev	768f1f219c	[OPENMP] Fix directive kind on stand-alone target data directives, NFC. llvm-svn: 322112	2018-01-09 19:59:25 +00:00
Alexey Bataev	7cae94e74c	[OPENMP] Add debug info for generated functions. Most of the generated functions for the OpenMP were generated with disabled debug info. Patch fixes this for better user experience. llvm-svn: 321816	2018-01-04 19:45:16 +00:00
Carlo Bertolli	52978c3554	[OpenMP] Initial implementation of code generation for pragma 'target teams distribute parallel for' on host https://reviews.llvm.org/D41709 This patch includes code generation and testing for offloading when target device is host. llvm-svn: 321759	2018-01-03 21:12:44 +00:00
Alexey Bataev	a8a9153a37	[OPENMP] Support for -fopenmp-simd option with compilation of simd loops only. Added support for -fopenmp-simd option that allows compilation of simd-based constructs without emission of OpenMP runtime calls. llvm-svn: 321560	2017-12-29 18:07:07 +00:00
Alexey Bataev	d2202caeda	[OPENMP] Support for `depend` clauses on `target data update`. Added codegen for `depend` clauses on `target data update` directives. llvm-svn: 321493	2017-12-27 17:58:32 +00:00
Alexey Bataev	0cc6b8ec61	[OPENMP] Add codegen for target data constructs with `nowait` clause. Added codegen for the `nowait` clause in target data constructs. llvm-svn: 320717	2017-12-14 17:00:17 +00:00
Alexey Bataev	a9f77c6df7	[OPENMP] Add codegen for `nowait` clause in target directives. Added basic codegen for `nowait` clauses in target-based directives. llvm-svn: 320613	2017-12-13 21:04:20 +00:00
Alexey Bataev	fbe17fb8a5	[OPENMP] Initial codegen for `target teams distribute simd` directive. Host + generic device codegen for `target teams distribute simd` directive. llvm-svn: 320608	2017-12-13 19:45:06 +00:00
Alexey Bataev	3f96fe6d44	[OPENMP] Support `reduction` clause on target-based directives. OpenMP 5.0 added support for `reduction` clause in target-based directives. Patch adds this support to clang. llvm-svn: 320596	2017-12-13 17:31:39 +00:00
Alexey Bataev	dfa430f694	[OPENMP] Initial codegen for `target teams distribute` directive. Host + default devices codegen for `target teams distribute` directive. llvm-svn: 320149	2017-12-08 15:03:50 +00:00
Jonas Hahnfeld	273d261b8f	Fix PR35542: Correct adjusting of private reduction variable The adjustment is calculated with CreatePtrDiff() which returns the difference in (base) elements. This is passed to CreateGEP() so make sure that the GEP base has the correct pointer type: It needs to be a pointer to the base type, not a pointer to a constant sized array. Differential Revision: https://reviews.llvm.org/D40911 llvm-svn: 319931	2017-12-06 19:15:28 +00:00
Alexey Bataev	50a1c7860f	[OPENMP] Emit `__tgt_target_teams` for all teams directives. Previously we emitted `__tgt_target_teams` only for standalone teams directives. This patch allows emitting this function for all teams-based directives. llvm-svn: 319585	2017-12-01 21:31:08 +00:00
Mandeep Singh Grang	b14fb6a216	[OpenMP] Stable sort Privates to remove non-deterministic ordering Summary: This fixes the following failures uncovered by D39245: Clang :: OpenMP/task_firstprivate_codegen.cpp Clang :: OpenMP/task_private_codegen.cpp Clang :: OpenMP/taskloop_firstprivate_codegen.cpp Clang :: OpenMP/taskloop_lastprivate_codegen.cpp Clang :: OpenMP/taskloop_private_codegen.cpp Clang :: OpenMP/taskloop_simd_firstprivate_codegen.cpp Clang :: OpenMP/taskloop_simd_lastprivate_codegen.cpp Clang :: OpenMP/taskloop_simd_private_codegen.cpp Reviewers: rjmccall, ABataev, AndreyChurbanov Reviewed By: rjmccall, ABataev Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D39947 llvm-svn: 319222	2017-11-28 20:41:13 +00:00
Alexey Bataev	10a5431239	[OPENMP] Improve handling of cancel directives in target-based constructs, NFC. Improved handling of cancel\|cancellation point directives inside target-based for directives. llvm-svn: 319046	2017-11-27 16:54:08 +00:00
George Rokos	63bc9d6f66	[Clang][OpenMP] New clang/libomptarget map interface: new function signatures, clang-side This clang patch changes the __tgt_* API function signatures in preparation for the new map interface. Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits Differential revision: https://reviews.llvm.org/D40281 llvm-svn: 318789	2017-11-21 18:25:12 +00:00
Hans Wennborg	989a65cd29	Fix some -Wunused-variable warnings llvm-svn: 318578	2017-11-18 00:49:18 +00:00
Alexey Bataev	f836537516	[OPENMP] Codegen for `target simd` construct. Added codegen support for `target simd` directive. llvm-svn: 318536	2017-11-17 17:57:25 +00:00
Alexey Bataev	2139ed638b	[OPENMP] Add support for cancelling inside target parallel for directive. Added missed support for cancelling of target parallel for construct. llvm-svn: 318434	2017-11-16 18:20:21 +00:00
Alexey Bataev	5d7edca316	[OPENMP] Codegen for `#pragma omp target parallel for simd`. Added codegen for `#pragma omp target parallel for simd` and clauses. llvm-svn: 317813	2017-11-09 17:32:15 +00:00
Alexey Bataev	fb0ebecf0e	[OPENMP] Codegen for `#pragma omp target parallel for`. llvm-svn: 317719	2017-11-08 20:16:14 +00:00
George Rokos	065755d23d	Clang/libomptarget map interface flag renaming - NFC patch This patch renames some of the flag names of the clang/libomptarget map interface. The old names are slightly misleading, whereas the new ones describe in a better way what each flag is about. Only the macros within the enumeration are renamed, there is no change in functionality therefore there are no updated regression tests. Differential Revision: https://reviews.llvm.org/D39745 llvm-svn: 317598	2017-11-07 18:27:04 +00:00
Alexey Bataev	5d2c9a46fc	[OPENMP] Fix PR35152: Do not use getInvokeDest() function for EH checks. The compiler may crash under some conditions if the getInvokeDest() is used, but later it is not used. Fixed this problem in OpenMP. llvm-svn: 317227	2017-11-02 18:55:05 +00:00
Alexey Bataev	0e1b45897e	[OPENMP] Fix PR35156: Get correct thread id with windows exceptions. If the thread id is requested in windows mode within funclets, we may generate incorrect function call that could lead to broken codegen. llvm-svn: 317208	2017-11-02 14:25:34 +00:00
Ivan A. Kosarev	b9c59f36fc	[CodeGen] Propagate may-alias'ness of lvalues with TBAA info This patch fixes various places in clang to propagate may-alias TBAA access descriptors during construction of lvalues, thus eliminating the need for the LValueBaseInfo::MayAlias flag. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D39008 llvm-svn: 316988	2017-10-31 11:05:34 +00:00
Ivan A. Kosarev	9f9d157517	[CodeGen] Generate TBAA info for reference loads Differential Revision: https://reviews.llvm.org/D39177 llvm-svn: 316896	2017-10-30 11:49:31 +00:00
Jonas Hahnfeld	4525c82428	[OpenMP] Avoid VLAs for some reductions on array sections In some cases the compiler can deduce the length of an array section as a constant. With this information, VLAs can be avoided in favor of a constant-sized array or even a scalar value if the length is 1. Example: int a[4], b[2]; #pragma omp parallel reduction(+: a[1:2], b[1:1]) { } For chained array sections, this optimization is restricted to cases where all array sections except the last have a constant length 1. This trivially guarantees that there are no holes in the memory region that needs to be privatized. Example: int c[3][4]; #pragma omp parallel reduction(+: c[1:1][1:2]) { } This relands commit r316229 that I reverted in r316235 because it failed on some bots. During investigation I found that this was because Clang and GCC evaluate the two arguments to emplace_back() in ReductionCodeGen::emitSharedLValue() in a different order, hence leading to a different order of generated instructions in the final LLVM IR. Fix this by passing in the arguments from temporary variables that are evaluated in a defined order. Differential Revision: https://reviews.llvm.org/D39136 llvm-svn: 316362	2017-10-23 19:01:35 +00:00
Jonas Hahnfeld	c95a6985bd	Revert "[OpenMP] Avoid VLAs for some reductions on array sections" This breaks at least two buildbots: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/1175 http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/10478 This reverts commit r316229 during local investigation. llvm-svn: 316235	2017-10-20 20:16:17 +00:00
Jonas Hahnfeld	b6229be460	[OpenMP] Avoid VLAs for some reductions on array sections In some cases the compiler can deduce the length of an array section as a constant. With this information, VLAs can be avoided in favor of a constant-sized array or even a scalar value if the length is 1. Example: int a[4], b[2]; #pragma omp parallel reduction(+: a[1:2], b[1:1]) { } For chained array sections, this optimization is restricted to cases where all array sections except the last have a constant length 1. This trivially guarantees that there are no holes in the memory region that needs to be privatized. Example: int c[3][4]; #pragma omp parallel reduction(+: c[1:1][1:2]) { } Differential Revision: https://reviews.llvm.org/D39136 llvm-svn: 316229	2017-10-20 19:40:40 +00:00
Alexey Bataev	a7b19157ba	[OPENMP] Fix PR34927: Emit initializer for reduction array with declare reduction. If the reduction is an array or an array section and reduction operation is declare reduction without initializer, it may lead to crash. llvm-svn: 315611	2017-10-12 20:03:39 +00:00
Alexey Bataev	311a928359	[OPENMP] Fix PR34925: Fix getting thread_id lvalue for inlined regions in C. If we try to get the lvalue for thread_id variables in inlined regions, we did not use the correct version of the function. Fixed this bug by adding an overridden version of the function getThreadIDVariableLValue for inlined regions. llvm-svn: 315578	2017-10-12 13:51:32 +00:00
Ivan A. Kosarev	f5f204679b	[CodeGen] Generate TBAA info along with LValue base info This patch enables explicit generation of TBAA information in all cases where LValue base info is propagated or constructed in non-trivial ways. Eventually, we will consider each of these cases to make sure the TBAA information is correct and not too conservative. For now, we just fall back to generating TBAA info from the access type. This patch should not bring in any functional changes. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D38733 llvm-svn: 315575	2017-10-12 11:29:46 +00:00
Alexey Bataev	3a03a7f636	[OPENMP] Remove extra if, NFC. llvm-svn: 315467	2017-10-11 15:56:38 +00:00
Alexey Bataev	e213f3e61a	[OPENMP] Fix PR34916: Crash on mixing taskloop\|tasks directives. If both taskloop and task directives are used at the same time in one program, we may run into a situation where the particular type for the task directive is reused for taskloop directives. Patch fixes this problem. llvm-svn: 315464	2017-10-11 15:29:40 +00:00
Ivan A. Kosarev	5f8c0ca53d	[CodeGen] Do not construct complete LValue base info in trivial cases Besides obvious code simplification, avoiding explicit creation of LValueBaseInfo objects makes it easier to make TBAA information to be part of such objects. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D38695 llvm-svn: 315289	2017-10-10 09:39:32 +00:00
Benjamin Kramer	c24fb0718d	Remove unused variables. No functionality change. llvm-svn: 315196	2017-10-08 21:23:02 +00:00
Benjamin Kramer	16610028ea	Remove unused variables. No functionality change. llvm-svn: 315185	2017-10-08 19:11:02 +00:00
Alexey Bataev	2a007e05a0	[OPENMP] Simplify codegen for non-offloading code. Simplified and generalized codegen for non-offloading part that works if offloading is failed or condition of the `if` clause is `false`. llvm-svn: 314670	2017-10-02 14:20:58 +00:00
Alexey Bataev	f47c4b4184	[OPENMP] Generate implicit map\|firstprivate clauses for target-based directives. If the variable is used in the target-based region but is not found in any private\|mapping clause, then generate implicit firstprivate\|map clauses for these implicitly mapped variables. llvm-svn: 314205	2017-09-26 13:47:31 +00:00
Alexey Bataev	f43f714213	[OPENMP] Fix for PR33922: New ident_t flags for __kmpc_for_static_fini(). Added special flags for calls of __kmpc_for_static_fini(), like previously for __kmpc_for_static_init(). Added flag OMP_IDENT_WORK_DISTRIBUTE for the distribute construct, OMP_IDENT_WORK_SECTIONS for sections-based constructs and OMP_IDENT_WORK_LOOP for loop-based constructs in location flags. llvm-svn: 312642	2017-09-06 16:17:35 +00:00
Alexey Bataev	070f43aee7	[OPENMP] Fix for PR34445: Reduction initializer segfaults at runtime in move constructor. Previously user-defined reduction initializer was considered as an assignment expression, not as initializer. Fixed this by treating the initializer expression as an initializer. llvm-svn: 312638	2017-09-06 14:49:58 +00:00
Alexey Bataev	aee18557f7	[OPENMP] Fix for PR33445: ICE: OpenMP target containing ordered for. If exceptions are enabled, there may be a problem with the codegen of the finalization functions from the OpenMP runtime. It happens because of a problem with getting the thread identifier value. Patch tries to fix it by using the result of the call of function __kmpc_global_thread_num() rather than loading the value of the outlined function parameter. llvm-svn: 311007	2017-08-16 14:01:00 +00:00
Alexey Bataev	0f87dbee4e	[OPENMP] Fix for PR33922: New ident_t flags for __kmpc_for_static_init(). OpenMP 5.0 will include the OpenMP Tools interface, which requires distinguishing different worksharing constructs. Since the same entry point (__kmpc_for_static_init(ident_t *loc, kmp_int32 global_tid,........)) is called for static loop/sections/distribute constructs, it is suggested to use the 'flags' field of the ident_t structure to pass the type of the construct. llvm-svn: 310865	2017-08-14 17:56:13 +00:00
Alexey Bataev	6e01dc1b84	[OPENMP][DEBUG] Fix for PR33676: Debug info for OpenMP region is broken. After some changes in clang/LLVM debug info for task-based regions was not generated at all. Patch fixes this problem. llvm-svn: 310850	2017-08-14 16:03:47 +00:00
Alexey Bataev	3c595a6b2c	[OPENMP] Generalization of calls of the outlined functions. General improvement of the outlined functions calls. llvm-svn: 310840	2017-08-14 15:01:03 +00:00
Alexey Bataev	3b8d5586ec	[OPENMP][DEBUG] Set proper address space info if required by target. Arguments, passed to the outlined function, must have correct address space info for proper Debug info support. Patch sets global address space for arguments that are mapped and passed by reference. Also, cuda-gdb does not handle reference types correctly, so reference arguments are represented as pointers. llvm-svn: 310387	2017-08-08 18:04:06 +00:00
Alexey Bataev	2c7eee5b84	[OPENMP] Unify generation of outlined function calls. llvm-svn: 310098	2017-08-04 19:10:54 +00:00
Alexey Bataev	be5a8b42cd	[OPENMP] Codegen for reduction clauses in 'taskloop' directives. Adds codegen for taskloop-based directives. llvm-svn: 308174	2017-07-17 13:30:36 +00:00
Alexey Bataev	5c40bec5eb	[OPENMP] Generalization of codegen for reduction clauses. Reworked codegen for reduction clauses for future support of reductions in task-based directives. llvm-svn: 307910	2017-07-13 13:36:14 +00:00
Alexey Bataev	3344603f7b	[OPENMP] Emit implicit taskgroup block around taskloop directives. If a taskloop directive has no associated nogroup clause, it must be emitted inside an implicit taskgroup block. Runtime supports it, but we need to generate the implicit taskgroup block explicitly to support future reductions codegen. llvm-svn: 307822	2017-07-12 18:09:32 +00:00
Alexey Bataev	1fdfdf7155	[OPENMP][DEBUG] Generate second function with correct arg types. Currently, if some of the parameters are captured by value, these arguments are converted to uintptr_t type and thus we lose the debug info about the real type of the argument (captured variable): ``` void @.outlined_function.(uintptr %par); ... %a = alloca i32 %a.casted = alloca uintptr %cast = bitcast uintptr* %a.casted to i32* %a.val = load i32, i32 %a store i32 %a.val, i32 %cast %a.casted.val = load uintptr, uintptr* %a.casted call void @.outlined_function.(uintptr %a.casted.val) ... ``` To resolve this problem, in debug mode a special external wrapper function is generated that calls the outlined function with the correct parameter types: ``` void @.wrapper.(uintptr %par) { %a = alloca i32 %cast = bitcast i32* %a to uintptr* store uintptr %par, uintptr %cast %a.val = load i32, i32 %a call void @.outlined_function.(i32 %a) ret void } void @.outlined_function.(i32 %par); ... %a = alloca i32 %a.casted = alloca uintptr %cast = bitcast uintptr* %a.casted to i32* %a.val = load i32, i32 %a store i32 %a.val, i32 %cast %a.casted.val = load uintptr, uintptr* %a.casted call void @.wrapper.(uintptr %a.casted.val) ... ``` llvm-svn: 306697	2017-06-29 16:43:05 +00:00
Alexey Bataev	5d1c3f6add	[OPENMP] Use MapVector instead of DenseMap for stable codegen, NFC. llvm-svn: 306419	2017-06-27 15:46:42 +00:00
Gheorghe-Teodor Bercea	47633db42e	Add comma to comment. llvm-svn: 305294	2017-06-13 15:35:27 +00:00
Alexey Bataev	56223237b0	[DebugInfo] Add kind of ImplicitParamDecl for emission of FlagObjectPointer. Summary: If the first parameter of the function is the ImplicitParamDecl, codegen automatically marks it as an implicit argument with `this` or `self` pointer. Added internal kind of the ImplicitParamDecl to separate 'this', 'self', 'vtt' and other implicit parameters from other kind of parameters. Reviewers: rjmccall, aaron.ballman Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D33735 llvm-svn: 305075	2017-06-09 13:40:18 +00:00
Mehdi Amini	6aa9e9b41a	IRGen: Add optnone attribute on function during O0 Amongst other, this will help LTO to correctly handle/honor files compiled with O0, helping debugging failures. It also seems in line with how we handle other options, like how -fnoinline adds the appropriate attribute as well. Differential Revision: https://reviews.llvm.org/D28404 llvm-svn: 304127	2017-05-29 05:38:20 +00:00
George Rokos	29d0f00340	[OpenMP] Create COMDAT group for OpenMP offload registration code to avoid multiple copies Thanks to Sergey Dmitriev for submitting the patch. Differential Revision: https://reviews.llvm.org/D33509 llvm-svn: 304056	2017-05-27 03:03:13 +00:00
Krzysztof Parzyszek	8f248234fa	[CodeGen] Propagate LValueBaseInfo instead of AlignmentSource The functions creating LValues propagated information about alignment source. Extend the propagated data to also include information about possible unrestricted aliasing. A new class LValueBaseInfo will contain both AlignmentSource and MayAlias info. This patch should not introduce any functional changes. Differential Revision: https://reviews.llvm.org/D33284 llvm-svn: 303358	2017-05-18 17:07:11 +00:00
Serge Guelton	1d993270b3	Suppress all uses of LLVM_END_WITH_NULL. NFC. Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential revision: https://reviews.llvm.org/D32550 llvm-svn: 302572	2017-05-09 19:31:30 +00:00
Carlo Bertolli	b0ff0a69c3	Recommit of [OpenMP] Initial implementation of code generation for pragma 'distribute parallel for' on host https://reviews.llvm.org/D29508 This patch makes the following additions: It abstracts away loop bound generation code from procedures associated with pragma 'for' and loops in general, in such a way that the same procedures can be used for 'distribute parallel for' without the need for a full re-implementation. It implements code generation for 'distribute parallel for' and adds regression tests. It includes tests for clauses. It is important to notice that most of the clauses are implemented as part of existing procedures. For instance, firstprivate is already implemented for 'distribute' and 'for' as separate pragmas. As the implementation of 'distribute parallel for' is based on the same procedures, then we automatically obtain implementation for such clauses without the need to add new code. However, this requires regression tests that verify correctness of produced code. llvm-svn: 301340	2017-04-25 17:52:12 +00:00
Carlo Bertolli	f09daae75d	Revert r301223 llvm-svn: 301233	2017-04-24 19:50:35 +00:00
Carlo Bertolli	4287d65c10	[OpenMP] Initial implementation of code generation for pragma 'distribute parallel for' on host https://reviews.llvm.org/D29508 This patch makes the following additions: 1. It abstracts away loop bound generation code from procedures associated with pragma 'for' and loops in general, in such a way that the same procedures can be used for 'distribute parallel for' without the need for a full re-implementation. 2. It implements code generation for 'distribute parallel for' and adds regression tests. It includes tests for clauses. It is important to notice that most of the clauses are implemented as part of existing procedures. For instance, firstprivate is already implemented for 'distribute' and 'for' as separate pragmas. As the implementation of 'distribute parallel for' is based on the same procedures, then we automatically obtain implementation for such clauses without the need to add new code. However, this requires regression tests that verify correctness of produced code. Looking forward to comments. llvm-svn: 301223	2017-04-24 19:26:11 +00:00
Simon Pilgrim	2c51880a82	Spelling mistakes in comments. NFCI. (PR27635) llvm-svn: 299083	2017-03-30 14:13:19 +00:00
Reid Kleckner	e258c44002	Use arg_begin() instead of getArgumentList().begin(), the argument list is an implementation detail llvm-svn: 297975	2017-03-16 18:55:46 +00:00
John McCall	5ad740756f	Promote ConstantInitBuilder to be a public CodeGen API; it's a generally useful utility for other frontends. NFC. llvm-svn: 296806	2017-03-02 20:04:19 +00:00
Jonas Hahnfeld	b07931f01d	[OpenMP] Fix cancellation point in task with no cancel With tasks, the cancel may happen in another task. This has a different region info which means that we can't find it here. Differential Revision: https://reviews.llvm.org/D30091 llvm-svn: 295474	2017-02-17 18:32:58 +00:00
Jonas Hahnfeld	20fce72f1b	[OpenMP] Remove barriers at cancel and cancellation point This resolves a deadlock with the cancel directive when there is no explicit cancellation point. In that case, the implicit barrier acts as cancellation point. After removing the barrier after cancel, the now unmatched barrier for the explicit cancellation point has to go as well. This has probably worked before rL255992: With the calls for the explicit barrier, it was sure that all threads passed a barrier before exiting. Reported by Simon Convent and Joachim Protze! Differential Revision: https://reviews.llvm.org/D30088 llvm-svn: 295473	2017-02-17 18:32:51 +00:00
Arpith Chacko Jacob	101e8fb1f3	[OpenMP] Parallel reduction on the NVPTX device. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that uses these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295333	2017-02-16 16:20:16 +00:00
Arpith Chacko Jacob	bd6344c0be	Revert r295319 while investigating buildbot failure. llvm-svn: 295323	2017-02-16 14:25:35 +00:00
Arpith Chacko Jacob	8e170fc857	[OpenMP] Parallel reduction on the NVPTX device. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated by the fact that variables declared in the stack of a CUDA thread cannot be shared with other threads. The patch creates a struct to hold reduction variables and a number of helper functions. The OpenMP runtime on the GPU implements reduction algorithms that uses these helper functions to perform reductions within a team. Variables are shared between CUDA threads using shuffle intrinsics. An implementation of reductions on the NVPTX device is substantially different to that of CPUs. However, this patch is written so that there are minimal changes to the rest of OpenMP codegen. The implemented design allows the compiler and runtime to be decoupled, i.e., the runtime does not need to know of the reduction operation(s), the type of the reduction variable(s), or the number of reductions. The design also allows reuse of host codegen, with appropriate specialization for the NVPTX device. While the patch does introduce a number of abstractions, the expected use case calls for inlining of the GPU OpenMP runtime. After inlining and optimizations in LLVM, these abstractions are unwound and performance of OpenMP reductions is comparable to CUDA-canonical code. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29758 llvm-svn: 295319	2017-02-16 14:03:36 +00:00
Arpith Chacko Jacob	99a1e0eba5	[OpenMP] Codegen support for 'target teams' on the host. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29084 llvm-svn: 293005	2017-01-25 02:18:43 +00:00
Arpith Chacko Jacob	86f9e46365	Reverting commit because an NVPTX patch sneaked in. Break up into two patches. llvm-svn: 293003	2017-01-25 01:45:59 +00:00
Arpith Chacko Jacob	4dbf368e14	[OpenMP] Codegen support for 'target teams' on the host. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29084 llvm-svn: 293001	2017-01-25 01:38:33 +00:00
Arpith Chacko Jacob	33c849a007	[OpenMP] Support for the num_threads-clause on 'target parallel'. The num_threads-clause on the combined directive applies to the 'parallel' region of this construct. We modify the NumThreadsClause class to capture the clause expression within the 'target' region. The offload runtime call for 'target parallel' is changed to __tgt_target_teams() with 1 team and the number of threads set by this clause or a default if none. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29082 llvm-svn: 292997	2017-01-25 00:57:16 +00:00
Arpith Chacko Jacob	19b911cb75	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292419	2017-01-18 18:18:53 +00:00
Arpith Chacko Jacob	42793e000a	Revert r292374 to debug Windows buildbot failure. llvm-svn: 292400	2017-01-18 15:36:05 +00:00
Arpith Chacko Jacob	68019578a3	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292374	2017-01-18 15:14:52 +00:00
Arpith Chacko Jacob	43a8b7bc8c	[OpenMP] Refactor code that calls codegen for target regions on the device. This patch refactors code that calls codegen for target regions. Currently the codebase only supports the 'target' directive. The patch pulls out common target processing code into a static function that can be called by codegen for any target directive. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28752 llvm-svn: 292134	2017-01-16 15:26:02 +00:00
Malcolm Parsons	c6e4583dbb	Remove unused lambda captures. NFC llvm-svn: 291939	2017-01-13 18:55:32 +00:00
Arpith Chacko Jacob	bb36fe8dba	[OpenMP] Basic support for a parallel directive in a target region on an NVPTX device Summary: This patch introduces support for the execution of parallel constructs in a target region on the NVPTX device. Parallel regions must be in the lexical scope of the target directive. The master thread in the master warp signals parallel work for worker threads in worker warps on encountering a parallel region. Note: The patch does not yet support capture of arguments in a parallel region so the test cases are simple. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28145 llvm-svn: 291565	2017-01-10 15:42:51 +00:00
Samuel Antao	f83efdb77a	[OpenMP] Add fields for flags in the offload entry descriptor. Summary: This patch adds two fields to the offload entry descriptor. One field is meant to signal Ctors/Dtors and `link` global variables, and the other is reserved for runtime library use. Currently, these fields are only filled with zeros in the current code generation, but that will change when `declare target` is added. The reason, we are adding these fields now is to make the code generation consistent with the runtime library proposal under review in https://reviews.llvm.org/D14031. Reviewers: ABataev, hfinkel, carlo.bertolli, kkwli0, arpith-jacob, Hahnfeld Subscribers: cfe-commits, caomhin, jholewinski Differential Revision: https://reviews.llvm.org/D28298 llvm-svn: 291124	2017-01-05 16:02:49 +00:00
Chandler Carruth	fcd33149b4	Cleanup the handling of noinline function attributes, -fno-inline, -fno-inline-functions, -O0, and optnone. These were really, really tangled together: - We used the noinline LLVM attribute for -fno-inline - But not for -fno-inline-functions (breaking LTO) - But we did use it for -finline-hint-functions (yay, LTO is happy!) - But we didn't for -O0 (LTO is sad yet again...) - We had weird structuring of CodeGenOpts with both an inlining enumeration and a boolean. They interacted in weird ways and needlessly. - A lot of set smashing went on with setting these, and then got worse when we considered optnone and other inlining-affecting attributes. - A bunch of inline affecting attributes were managed in a completely different place from -fno-inline. - Even with -fno-inline we failed to put the LLVM noinline attribute onto many generated function definitions because they didn't show up as AST-level functions. - If you passed -O0 but -finline-functions we would run the normal inliner pass in LLVM despite it being in the O0 pipeline, which really doesn't make much sense. - Lastly, we used things like '-fno-inline' to manipulate the pass pipeline which forced the pass pipeline to be much more parameterizable than it really needs to be. Instead we can just use the optimization level to select a pipeline and control the rest via attributes. Sadly, this causes a bunch of churn in tests because we don't run the optimizer in the tests and check the contents of attribute sets. It would be awesome if attribute sets were a bit more FileCheck friendly, but oh well. I think this is a significant improvement and should remove the semantic need to change what inliner pass we run in order to comply with the requested inlining semantics by relying completely on attributes. It also cleans up the optnone and related handling a bit. 
One unfortunate aspect of this is that for generating alwaysinline routines like those in OpenMP we end up removing noinline and then adding alwaysinline. I tried a bunch of other approaches, but because we recompute function attributes from scratch and don't have a declaration here I couldn't find anything substantially cleaner than this. Differential Revision: https://reviews.llvm.org/D28053 llvm-svn: 290398	2016-12-23 01:24:49 +00:00
Samuel Antao	4b75b8726d	Fix typo and remove unnecessary statement. llvm-svn: 289458	2016-12-12 19:26:31 +00:00
Samuel Antao	4c8035bca4	Fix format and a few typos in comments. llvm-svn: 289450	2016-12-12 18:00:20 +00:00
John McCall	f1788639c5	Hide the result of building a constant initializer. NFC. llvm-svn: 288080	2016-11-28 22:18:30 +00:00
John McCall	23c9dc6585	ConstantBuilder -> ConstantInitBuilder for clarity, and move the member classes up to top level to allow forward declarations to name them. NFC. llvm-svn: 288079	2016-11-28 22:18:27 +00:00
Benjamin Kramer	81cb4b7103	[CodeGen] Pass objects that are expensive to copy by const ref. No functionality change. Found by clang-tidy's performance-unnecessary-value-param. llvm-svn: 287894	2016-11-24 16:01:20 +00:00
John McCall	6c9f1fdb5c	Introduce a helper class for building complex constant initializers. NFC. I've adopted this in most of the places it makes sense, but v-tables and CGObjCMac will need a second pass. llvm-svn: 287437	2016-11-19 08:17:24 +00:00
Peter Collingbourne	d9445c49ad	Bitcode: Change module reader functions to return an llvm::Expected. Differential Revision: https://reviews.llvm.org/D26562 llvm-svn: 286752	2016-11-13 07:00:17 +00:00
Teresa Johnson	ffc4e2420f	Mirror the llvm changes that split Bitcode/ReaderWriter.h The change in D26502 splits ReaderWriter.h, which contains the APIs into both the BitReader and BitWriter libraries, into BitcodeReader.h and BitcodeWriter.h. Change clang uses to the appropriate split header(s). llvm-svn: 286567	2016-11-11 05:35:12 +00:00
Paul Robinson	78fb132af0	Add FIXMEs for MSVC 2013 hacks in r277211. NFC. llvm-svn: 277396	2016-08-01 22:12:46 +00:00
Hans Wennborg	bc1b58d086	Fix VS2013 build of CGOpenMPRuntime.cpp It seems the compiler was getting confused by the in-class initializers in local struct MapInfo, so moving those to a default constructor instead. llvm-svn: 277256	2016-07-30 00:41:37 +00:00
Paul Robinson	15c840052e	Fix CGOpenMPRuntime.cpp for VS2013. NFC. I don't know why these changes work but they do. llvm-svn: 277211	2016-07-29 20:46:16 +00:00
Samuel Antao	44bcdb3731	[OpenMP] Change name of variable in mappable expression. This attempts to fix a failure in Windows bots potentially related to a reserved keyword. llvm-svn: 276988	2016-07-28 15:31:29 +00:00
Samuel Antao	cf3f83e46b	[OpenMP] Do not use default argument in lambda from mappable expressions handlers. Windows bots were complaining about that. llvm-svn: 276981	2016-07-28 14:47:35 +00:00
Samuel Antao	6890b09634	[OpenMP] Code generation for the is_device_ptr clause Summary: This patch adds support for the is_device_ptr clause. It expands SEMA to use the mappable expression logic that can only be tested with code generation in place and check conflicts with other data sharing related clauses using the mappable expressions infrastructure. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22788 llvm-svn: 276978	2016-07-28 14:25:09 +00:00
Samuel Antao	cc10b85789	[OpenMP] Codegen for use_device_ptr clause. Summary: This patch adds support for the use_device_ptr clause. It includes changes in SEMA that could not be tested without codegen, namely, the use of the first private logic and mappable expressions support. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22691 llvm-svn: 276977	2016-07-28 14:23:26 +00:00
Samuel Antao	03a3cec480	[OpenMP] Add support to map member expressions with references to pointers. Summary: This patch adds support to map pointers through references in class members. Although a reference does not have storage that a user can access, it still has to be mapped in order to get the deep copy right and make the dereferencing code work properly. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22787 llvm-svn: 276934	2016-07-27 22:52:16 +00:00
Samuel Antao	403ffd409f	[OpenMP] Add support for mapping array sections through pointer references. Summary: This patch fixes a bug in the map of array sections whose base is a reference to a pointer. The existing mapping support was not prepared to deal with it, causing the compiler to crash. Mapping a reference to a pointer enjoys the same characteristics of a regular pointer, i.e., it is passed by value. Therefore, the reference has to be materialized in the target region. Reviewers: hfinkel, carlo.bertolli, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22690 llvm-svn: 276933	2016-07-27 22:49:49 +00:00
David Majnemer	59f7792136	Use more ArrayRefs No functional change is intended, just a small refactoring. llvm-svn: 273647	2016-06-24 04:05:48 +00:00
Samuel Antao	6d0042642a	Re-apply r272900 - [OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects. An issue in one of the regression tests was fixed for 32-bit hosts. llvm-svn: 272931	2016-06-16 18:39:34 +00:00
Samuel Antao	b1f9501242	Revert r272900 - [OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects. Was causing trouble in one of the regression tests for a 32-bit address space. llvm-svn: 272908	2016-06-16 16:06:22 +00:00
Samuel Antao	4951617980	[OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects. Summary: This patch fixes an issue detected when firstprivate variables are passed to an OpenMP outlined function vararg list. Currently they are not compatible with what the runtime library expects causing malfunction in some targets. This patch fixes the issue by moving the casting logic already in place for offloading to the common code that creates the outline function and arguments and updates the regression tests accordingly. Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev Subscribers: cfe-commits, caomhin Differential Revision: http://reviews.llvm.org/D21150 llvm-svn: 272900	2016-06-16 15:09:31 +00:00
Peter Collingbourne	bcf909d737	Update clang for D20348 Differential Revision: http://reviews.llvm.org/D20339 llvm-svn: 272710	2016-06-14 21:02:05 +00:00
Nico Weber	a691689ae8	Remove a few gendered pronouns. llvm-svn: 272415	2016-06-10 18:53:04 +00:00
Alexey Bataev	6cff62484a	[OPENMP 4.5] Additional codegen for statically scheduled loops with 'simd' modifier. Runtime library defines new schedule constant kmp_sch_static_balanced_chunked = 45 for static loop-based directives static with chunk adjustment (e.g., simd). Added codegen for this kind of schedule. llvm-svn: 271204	2016-05-30 13:05:14 +00:00
Alexey Bataev	ad537bb8a0	[OPENMP 4.5] Fixed codegen for 'priority' and destructors in task-based directives. 'kmp_task_t' record type added a new field for 'priority' clause and changed the representation of pointer to destructors for privates used within loop-based directives. Old representation: typedef struct kmp_task { /* GEH: Shouldn't this be aligned somehow? */ void *shareds; /**< pointer to block of pointers to shared vars */ kmp_routine_entry_t routine; /**< pointer to routine to call for executing task */ kmp_int32 part_id; /**< part id for the task */ kmp_routine_entry_t destructors; /* pointer to function to invoke deconstructors of firstprivate C++ objects */ /* private vars */ } kmp_task_t; New representation: typedef struct kmp_task { /* GEH: Shouldn't this be aligned somehow? */ void *shareds; /**< pointer to block of pointers to shared vars */ kmp_routine_entry_t routine; /**< pointer to routine to call for executing task */ kmp_int32 part_id; /**< part id for the task */ kmp_cmplrdata_t data1; /* Two known optional additions: destructors and priority */ kmp_cmplrdata_t data2; /* Process destructors first, priority second */ /* future data */ /* private vars */ } kmp_task_t; Also excessive initialization of 'destructors' fields to 'null' was removed from codegen if it is known that no destructors shall be used. Currently a special bit is used in 'kmp_tasking_flags_t' bitfields ('destructors_thunk' bitfield). llvm-svn: 271201	2016-05-30 09:06:50 +00:00
Samuel Antao	8d2d730f2a	[OpenMP] Codegen for target update directive. Summary: This patch implements the code generation for the `target update` directive. The implementation relies on the logic already in place for target data standalone directives, i.e. target enter/exit data. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: http://reviews.llvm.org/D20650 llvm-svn: 270886	2016-05-26 18:30:22 +00:00
Samuel Antao	d486f84c57	[OpenMP] Add support for the 'private pointer' flag to signal variables captured in target regions and used in first-private clauses. Summary: If a variable is implicitly mapped (doesn't show in a map clause), the runtime library has to be informed if the corresponding capture shows up in first-private clause, so that the storage previously allocated in the device is used. This patch adds the support for that. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: http://reviews.llvm.org/D20112 llvm-svn: 270870	2016-05-26 16:53:38 +00:00
Samuel Antao	6782e944d2	[OpenMP] Adjust map type bits according to latest spec and use zero size array sections for pointers. Summary: This patch changes the bits used to specify the map types according to the latest version of the libomptarget document and add the support for zero size array section when pointers are being implicitly mapped. This completes the missing new 4.5 map semantics. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: http://reviews.llvm.org/D20111 llvm-svn: 270868	2016-05-26 16:48:10 +00:00
Alexey Bataev	8b42706a6e	[OPENMP 4.5] Codegen for doacross loop synchronization constructs. OpenMP 4.5 adds support for doacross loop synchronization. Patch implements codegen for this construct. llvm-svn: 270690	2016-05-25 12:36:08 +00:00
Alexey Bataev	7ace49dff1	[OPENMP] Pass scalar firstprivate vars by value. For better performance and to unify code with offloading part we pass scalar firstprivate values by value, instead of by reference. It will remove some extra copying operations. llvm-svn: 269751	2016-05-17 08:55:33 +00:00
Alexey Bataev	1e1e286a6b	[OPENMP 4.5] Initial codegen for 'priority' clause in task-based directives. OpenMP 4.5 supports clause 'priority' in task-based directives. Patch adds initial codegen support for this clause in codegen. llvm-svn: 269050	2016-05-10 12:21:02 +00:00
Alexey Bataev	8a83159731	[OPENMP 4.0] Fixed codegen for destructors in task-based directives. If private variables require destructors call at the deletion of the task, additional flag in task flags must be set. Patch fixes this problem. llvm-svn: 269039	2016-05-10 10:36:51 +00:00
Alexey Bataev	9ebd742748	[OPENMP 4.5] Add codegen support in runtime for '[non]monotonic' schedule modifiers. Runtime library expects some additional data in schedule argument for loop-based directives, that have additional schedule modifiers 'monotonic\|nonmonotonic'. llvm-svn: 269035	2016-05-10 09:57:36 +00:00
Samuel Antao	e49645cf12	[OpenMP] Check for associated statements with hasAssociatedStmt() when scanning for device code. Summary: `getAssociatedStmt()` contains an assertion that assumes the statement always exists. In device code scanning, we need to look into the associated statement therefore we check its existence. This patch replaces `getAssociatedStmt` by `hasAssociatedStmt` so that we do not trigger the assertion for directives that happen not to have an associated statement (e.g target enter/exit data). Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: cfe-commits, caomhin Differential Revision: http://reviews.llvm.org/D19812 llvm-svn: 268870	2016-05-08 06:43:56 +00:00
Alexey Bataev	c7a82b41a7	[OPENMP 4.0] Codegen for 'declare simd' directive. OpenMP 4.0 adds support for elemental functions using declarative directive '#pragma omp declare simd'. Patch adds mangling for simd functions in accordance with https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt llvm-svn: 268721	2016-05-06 09:40:08 +00:00
Alexey Bataev	f93095a003	[OPENMP 4.5] Codegen for 'lastprivate' clauses in 'taskloop' directives. OpenMP 4.5 adds taskloop/taskloop simd directives. These directives allow to use lastprivate clause. Patch adds codegen for this clause. llvm-svn: 268618	2016-05-05 08:46:22 +00:00
Carlo Bertolli	6eee9061ac	[OPENMP] Enable correct generation of runtime call when target directive is separated from teams directive by multiple curly brackets http://reviews.llvm.org/D18474 This patch fixes a bug in code generation of the correct OpenMP runtime library call in presence of target and teams, when target is separated by teams with multiple curly brackets. The current implementation will not be able to see the teams directive inside target and issue a call to tgt_target instead of the correct one tgt_target_teams. llvm-svn: 267972	2016-04-29 01:37:30 +00:00
Alexey Bataev	24b5baed27	[OPENMP] Simplified interface for codegen of tasks, NFC. Reduced number of arguments in member functions of runtime support library for task-based directives. llvm-svn: 267863	2016-04-28 09:23:51 +00:00
Alexey Bataev	2b19a6fe53	[OPENMP 4.5] Codegen for 'grainsize/num_tasks' clauses of 'taskloop' directive. OpenMP 4.5 defines 'taskloop' directive and 2 additional clauses 'grainsize' and 'num_tasks' for this directive. Patch adds codegen for these clauses. These clauses are generated as arguments of the '__kmpc_taskloop' libcall and are encoded the following way: void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int sched, kmp_uint64 grainsize, void *task_dup); If 'grainsize' is specified, 'sched' argument must be set to '1' and 'grainsize' argument must be set to the value of the 'grainsize' clause. If 'num_tasks' is specified, 'sched' argument must be set to '2' and 'grainsize' argument must be set to the value of the 'num_tasks' clause. It is possible because these 2 clauses are mutually exclusive and can't be used at the same time on the same directive. If none of these clauses is specified, 'sched' argument must be set to '0'. llvm-svn: 267862	2016-04-28 09:15:06 +00:00
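The 'sched'/'grainsize' encoding described in the commit above can be sketched as a small helper. This is an illustrative sketch only, not the code clang emits: the names `TaskloopSchedArgs` and `encodeTaskloopSched` are hypothetical, and the value `0` passed when neither clause is present is an assumption (the message only fixes 'sched' to '0' in that case).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical holder for the two trailing scheduling arguments of the
// '__kmpc_taskloop' libcall described in the commit message.
struct TaskloopSchedArgs {
  int sched;               // 0 = no clause, 1 = grainsize, 2 = num_tasks
  std::uint64_t grainsize; // carries the value of whichever clause is present
};

// Maps the optional 'grainsize' and 'num_tasks' clause values to the
// (sched, grainsize) pair. The two clauses are mutually exclusive.
inline TaskloopSchedArgs
encodeTaskloopSched(std::optional<std::uint64_t> grainsize,
                    std::optional<std::uint64_t> numTasks) {
  assert(!(grainsize && numTasks) && "clauses are mutually exclusive");
  if (grainsize)
    return {1, *grainsize};
  if (numTasks)
    return {2, *numTasks};
  // Assumption: the value is irrelevant when no clause is given, so pass 0.
  return {0, 0};
}
```

Because the clauses cannot appear together, a single 64-bit argument is enough to carry either value, with 'sched' acting as the discriminator.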
Samuel Antao	8dd6628743	[OpenMP] Code generation for target exit data directive Summary: This patch adds support for the target exit data directive code generation. Given that, apart from the employed runtime call, target exit data requires the same code generation pattern as target enter data, the OpenMP codegen entry point was renamed and reused for both. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: cfe-commits, fraggamuffin, caomhin Differential Revision: http://reviews.llvm.org/D17369 llvm-svn: 267814	2016-04-27 23:14:30 +00:00
Samuel Antao	bd0ae2e14c	[OpenMP] Code generation for target enter data directive Summary: This patch adds support for the target enter data directive code generation. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: cfe-commits, fraggamuffin, caomhin Differential Revision: http://reviews.llvm.org/D17368 llvm-svn: 267812	2016-04-27 23:07:29 +00:00
Samuel Antao	df158d5567	[OpenMP] Code generation for target data directive Summary: This patch adds support for the target data directive code generation. Part of the already existent functionality related with data maps is moved to a new function so that it could be reused. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: cfe-commits, fraggamuffin, caomhin Differential Revision: http://reviews.llvm.org/D17367 llvm-svn: 267811	2016-04-27 22:58:19 +00:00
Samuel Antao	86ace55d53	[OpenMP] Map clause codegeneration. Summary: Implement codegen for the map clause. All the new list items in 4.5 specification are supported. Fix bug in the generation of array sections that was exposed by some of the map clause tests: for pointer types the offsets have to be calculated from the pointee not the pointer. Reviewers: hfinkel, kkwli0, carlo.bertolli, arpith-jacob, ABataev Subscribers: ABataev, cfe-commits, caomhin, fraggamuffin Differential Revision: http://reviews.llvm.org/D16749 llvm-svn: 267808	2016-04-27 22:40:57 +00:00
Alexey Bataev	4ba78a46ff	[OPENMP] Fix for codegen of captured variables in inlined directives. Currently there is a problem with codegen of inlined directives inside lambdas, it may cause a crash during codegen because of incorrect capturing of variables. Patch fixes this problem. llvm-svn: 267677	2016-04-27 07:56:03 +00:00
Alexey Bataev	7292c29bb5	[OPENMP 4.5] Codegen for 'taskloop' directive. The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed. The next code will be generated for the taskloop directive: #pragma omp taskloop num_tasks(N) lastprivate(j) for( i=0; i<NGRAINSTRIDE-1; i+=STRIDE ) { int th = omp_get_thread_num(); #pragma omp atomic counter++; #pragma omp atomic th_counter[th]++; j = i; } Generated code: task = __kmpc_omp_task_alloc(NULL,gtid,1,sizeof(struct task),sizeof(struct shar),&task_entry); psh = task->shareds; psh->pth_counter = &th_counter; psh->pcounter = &counter; psh->pj = &j; task->lb = 0; task->ub = NGRAINSTRIDE-2; task->st = STRIDE; __kmpc_taskloop( NULL, // location gtid, // gtid task, // task structure 1, // if clause value &task->lb, // lower bound &task->ub, // upper bound STRIDE, // loop increment 0, // 1 if nogroup specified 2, // schedule type: 0-none, 1-grainsize, 2-num_tasks N, // schedule value (ignored for type 0) (void*)&__task_dup_entry // tasks duplication routine ); llvm-svn: 267395	2016-04-25 12:22:29 +00:00
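For readers who want to try the construct from the commit above, the following is a minimal, compilable variant of the source-level example. It is a sketch under stated assumptions: the function name `runTaskloop` and its parameters are hypothetical, the per-thread counter from the original example is dropped to avoid depending on `omp.h`, and the enclosing `parallel`/`single` pair is added so that one thread creates the tasks. Built without OpenMP support, the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>

// Runs a strided loop as a taskloop and returns the last value assigned to j.
// 'counter' is shared between tasks and updated atomically; 'j' is
// lastprivate, so after the construct it holds the value from the
// sequentially last iteration.
int runTaskloop(int n, int stride, int numTasks, int &counter) {
  int j = -1;
  counter = 0;
#pragma omp parallel
#pragma omp single
#pragma omp taskloop num_tasks(numTasks) lastprivate(j)
  for (int i = 0; i < n; i += stride) {
#pragma omp atomic
    counter++;
    j = i;
  }
  return j;
}
```

With `n = 10` and `stride = 3` the loop visits `i = 0, 3, 6, 9`, so `counter` ends at 4 and the function returns 9, regardless of how the iterations are split across tasks.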
Adrian Prantl	1858c664de	Debug info: Apply an empty debug location for global OpenMP destructors. LLVM really wants a debug location on every inlinable call in a function with debug info, because it otherwise cannot set up inlining debug info. This change applies an artificial line 0 debug location (which is how DWARF marks automatically generated code that has no corresponding source code) to the .__kmpc_global_dtor_. functions to avoid the LLVM Verifier complaining. llvm-svn: 267369	2016-04-24 22:22:29 +00:00
Alexey Bataev	48591dd98c	[OPENMP] Codegen for untied tasks. If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks. llvm-svn: 266853	2016-04-20 04:01:36 +00:00
Alexey Bataev	995e861ba6	Revert "[OPENMP] Codegen for untied tasks." This reverts commit r266754. llvm-svn: 266755	2016-04-19 16:36:01 +00:00
Alexey Bataev	823acfacdf	[OPENMP] Codegen for untied tasks. If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks. llvm-svn: 266754	2016-04-19 16:27:55 +00:00
Alexey Bataev	bec9572213	Revert "[OPENMP] Codegen for untied tasks." This reverts commit 266722. llvm-svn: 266724	2016-04-19 09:27:38 +00:00
Alexey Bataev	26b2577f6b	[OPENMP] Codegen for untied tasks. If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks. llvm-svn: 266722	2016-04-19 09:10:27 +00:00
JF Bastien	92f4ef1017	NFC: make AtomicOrdering an enum class Summary: See LLVM change D18775 for details, this change depends on it. Reviewers: jyknight, reames Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18776 llvm-svn: 265569	2016-04-06 17:26:42 +00:00
Carlo Bertolli	c687225b43	[OPENMP] Codegen for teams directive for NVPTX This patch implements the teams directive for the NVPTX backend. It differs from the host code generation path in that it: Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region. Does not call kmpc_push_num_teams even if a num_teams or thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime. Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region. http://reviews.llvm.org/D17963 llvm-svn: 265304	2016-04-04 15:55:02 +00:00
Alexey Bataev	14fa1c6b60	[OPENMP] Allow runtime insert its own code inside OpenMP regions. The solution unifies the interface of the RegionCodeGenTy type to allow inserting runtime-specific code before/after the main codegen action defined in CGStmtOpenMP.cpp. The runtime should not define its own RegionCodeGenTy for general OpenMP directives, but must be allowed to insert its own (required) code to support target-specific codegen. llvm-svn: 264700	2016-03-29 05:34:15 +00:00
Alexey Bataev	f539faa733	Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions." Reverting because of failed tests. llvm-svn: 264577	2016-03-28 12:58:34 +00:00
Alexey Bataev	424be92831	[OPENMP] Allow runtime insert its own code inside OpenMP regions. The solution unifies the interface of the RegionCodeGenTy type to allow inserting runtime-specific code before/after the main codegen action defined in CGStmtOpenMP.cpp. The runtime should not define its own RegionCodeGenTy for general OpenMP directives, but must be allowed to insert its own (required) code to support target-specific codegen. llvm-svn: 264576	2016-03-28 12:52:58 +00:00
Alexey Bataev	f662b5943c	Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions." This reverts commit 3ee791165100607178073f14531a0dc90c622b36. llvm-svn: 264570	2016-03-28 10:12:03 +00:00
Alexey Bataev	b8c425c4f7	[OPENMP] Allow runtime insert its own code inside OpenMP regions. The solution unifies the interface of the RegionCodeGenTy type to allow inserting runtime-specific code before/after the main codegen action defined in CGStmtOpenMP.cpp. The runtime should not define its own RegionCodeGenTy for general OpenMP directives, but must be allowed to insert its own (required) code to support target-specific codegen. llvm-svn: 264569	2016-03-28 09:53:43 +00:00
Arpith Chacko Jacob	5c309e475d	[OpenMP] Base support for target directive codegen on NVPTX device. Summary: This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 Reworked test case after buildbot failure on windows. Updated patch to integrate r263837 and test case nvptx_target_firstprivate_codegen.cpp. llvm-svn: 264018	2016-03-22 01:48:56 +00:00
Carlo Bertolli	b74bfc80a4	[OPENMP] Implementation of codegen for firstprivate clause of target directive This patch implements the following aspects: It extends sema to check that a variable is not referenced in both a map clause and firstprivate or private. This is needed to ensure correct functioning at codegen level, apart from being useful for the user. It implements firstprivate for target in codegen. The implementation applies to both host and nvptx devices. It adds regression tests for codegen of firstprivate, host and device side when using the host as device, and nvptx side. Please note that the regression test for nvptx codegen is missing VLAs. This is because VLAs currently require saving and restoring the stack, which appears not to be a supported operation by the nvptx backend. It adds a check in sema regression tests for target map, firstprivate, and private clauses. http://reviews.llvm.org/D18203 llvm-svn: 263837	2016-03-18 21:43:32 +00:00
Arpith Chacko Jacob	129fa9a048	Revert r263783 as buildbot failure is being investigated. llvm-svn: 263784	2016-03-18 12:39:40 +00:00
Arpith Chacko Jacob	ac563708ab	[OpenMP] Base support for target directive codegen on NVPTX device. Summary: Reworked test case after buildbot failure on windows. This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 llvm-svn: 263783	2016-03-18 11:47:43 +00:00
Alexey Bataev	a839dddf92	[OPENMP 4.0] Use 'declare reduction' constructs in 'reduction' clauses. OpenMP 4.0 allows defining custom reduction operations using the '#pragma omp declare reduction' construct. The patch allows these custom-defined reduction operations to be used in 'reduction' clauses. llvm-svn: 263701	2016-03-17 10:19:46 +00:00
Carlo Bertolli	a03acfa359	[OPENMP] Support for codegen of private clause of target, host side This patch adds support for codegen of private clause of target and a regression test for host code generation, when the host is used as target device. I believe that code generation for nvptx backend would not require anything additional or different to what is done for the host. http://reviews.llvm.org/D18105 llvm-svn: 263654	2016-03-16 19:04:22 +00:00
Arpith Chacko Jacob	9cb61faa61	Revert commit http://reviews.llvm.org/D17877 to fix tests on x86. llvm-svn: 263589	2016-03-15 21:26:34 +00:00
Arpith Chacko Jacob	5e1493b560	[OpenMP] Base support for target directive codegen on NVPTX device. Summary: This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 llvm-svn: 263587	2016-03-15 21:04:57 +00:00
Arpith Chacko Jacob	fc46c25d74	Reverted http://reviews.llvm.org/D17877 to fix tests. llvm-svn: 263555	2016-03-15 16:19:13 +00:00
Arpith Chacko Jacob	c61744c26b	[OpenMP] Base support for target directive codegen on NVPTX device. Summary: This patch adds base support for codegen of the target directive on the NVPTX device. Reviewers: ABataev Differential Revision: http://reviews.llvm.org/D17877 llvm-svn: 263552	2016-03-15 15:24:52 +00:00
John McCall	c56a8b3284	Preserve ExtParameterInfos into CGFunctionInfo. As part of this, make the function-arrangement interfaces a little simpler and more semantic. NFC. llvm-svn: 263191	2016-03-11 04:30:31 +00:00
Carlo Bertolli	fc35ad2bbc	Reapply r262741 [OPENMP] Codegen for distribute directive This patch provides a basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling. http://reviews.llvm.org/D17170 llvm-svn: 262832	2016-03-07 16:04:49 +00:00
Samuel Antao	bf4d18d3d2	Revert r262741 - [OPENMP] Codegen for distribute directive Was causing a failure in one of the buildbot slaves. llvm-svn: 262744	2016-03-04 21:02:14 +00:00
Carlo Bertolli	4a56e3831d	[OPENMP] Codegen for distribute directive This patch provides a basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling. http://reviews.llvm.org/D17170 llvm-svn: 262741	2016-03-04 20:24:58 +00:00
Alexey Bataev	c5b1d320b8	[OPENMP 4.0] Codegen for 'declare reduction' construct. Emit function for 'combiner' part of 'declare reduction' construct and 'initializer' part, if any. llvm-svn: 262699	2016-03-04 09:22:22 +00:00
Carlo Bertolli	430d8ecc55	Add code generation for teams directive inside target region llvm-svn: 262652	2016-03-03 20:34:23 +00:00
Samuel Antao	b68e2db8f9	[OpenMP] Code generation for teams - kernel launching Summary: This patch implements the launching of a target region in the presence of a nested teams region, i.e., it calls tgt_target_teams with the required arguments gathered from the enclosed teams directive. The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch. Reviewers: hfinkel, arpith-jacob, kkwli0, carlo.bertolli, ABataev Subscribers: cfe-commits, caomhin, fraggamuffin Differential Revision: http://reviews.llvm.org/D17019 llvm-svn: 262625	2016-03-03 16:20:23 +00:00
Alexey Bataev	50b3c95992	[OPENMP] Improved layout of CGOpenMPRuntime class, NFC. llvm-svn: 261315	2016-02-19 10:38:26 +00:00
Richard Trieu	cc3949d99a	Remove use of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261271	2016-02-18 22:34:54 +00:00
Samuel Antao	2de62b0c89	[OpenMP] Rename the offload entry points. Summary: Unlike other outlined regions in OpenMP, offloading entry points have to be visible (external linkage) for the device side. Using dots in the names of the entries can therefore be problematic for some toolchains, e.g. NVPTX. Also, the patch drops the column information in the unique name of the entry points. The parsing of directives ignores unknown tokens, preventing several target regions from being implemented in the same line. Therefore, the line information is sufficient for the name to be unique. Also, the preprocessor printer does not preserve the column information, causing offloading-entry detection issues if the host uses an integrated preprocessor and the target doesn't (or vice versa). Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev Subscribers: cfe-commits, fraggamuffin, caomhin Differential Revision: http://reviews.llvm.org/D17179 llvm-svn: 260837	2016-02-13 23:35:10 +00:00
Eugene Zelenko	0a4f3f4373	Fix some Clang-tidy readability-redundant-control-flow warnings; other minor fixes. Differential revision: http://reviews.llvm.org/D17060 llvm-svn: 260414	2016-02-10 19:11:58 +00:00
Alexey Bataev	31300ed0a5	[OPENMP 4.0] Fixed support of array sections/array subscripts. Codegen for array sections/array subscripts worked only for expressions with arrays as base. Patch fixes codegen for bases with pointer/reference types. llvm-svn: 259776	2016-02-04 11:27:03 +00:00
Benjamin Kramer	8c30592e18	Move DebugInfoKind into its own header to cut the cyclic dependency edge from Driver to Frontend. llvm-svn: 259489	2016-02-02 11:06:51 +00:00
Eugene Zelenko	1660a5d298	Fix Clang-tidy modernize-use-nullptr warnings; other minor fixes. Differential revision: http://reviews.llvm.org/D16567 llvm-svn: 258836	2016-01-26 19:01:06 +00:00
Alexey Bataev	1189bd0205	[OPENMP 4.5] Allow arrays in 'reduction' clause. OpenMP 4.5, along with array sections, allows variables of array type to be used in reductions. llvm-svn: 258804	2016-01-26 12:20:39 +00:00
Alexey Bataev	3015bcc62a	[OPENMP] Generalize codegen for 'sections'-based directive. If 'sections' directive has only one sub-section, the code for 'single'-based directive was emitted. Removed this codegen, because it causes crashes in different cases. llvm-svn: 258495	2016-01-22 08:56:50 +00:00
Alexey Bataev	8524d15954	[OPENMP] Fix crash on reduction for complex variables. Reworked codegen for reduction operations on complex types to avoid a crash. llvm-svn: 258394	2016-01-21 12:35:58 +00:00
Alexey Bataev	9619f04c0e	[OPENMP 4.0] Fix for codegen of 'cancel' directive within 'sections' directive. Allow to emit code for 'cancel' directive within 'sections' directive with single sub-section. llvm-svn: 258307	2016-01-20 12:29:47 +00:00
Hans Wennborg	45c7439d11	Don't store CGOpenMPRegionInfo::CodeGen as a reference (PR26078) The referenced llvm::function_ref<void(CodeGenFunction &)> object can go away before CodeGen is used, resulting in a crash. llvm-svn: 257516	2016-01-12 20:54:36 +00:00
Samuel Antao	ee8fb302f5	[OpenMP] Reapply rL256842: [OpenMP] Offloading descriptor registration and device codegen. This patch attempts to fix the regressions identified when the patch was committed initially. Thanks to Michael Liao for identifying the fix in the offloading metadata generation related with side effects in evaluation of function arguments. llvm-svn: 256933	2016-01-06 13:42:12 +00:00
Samuel Antao	7d5de9a1ee	[OpenMP] Revert rL256842: [OpenMP] Offloading descriptor registration and device codegen. It was causing two regressions, so I'm reverting until the cause is found. llvm-svn: 256858	2016-01-05 19:16:13 +00:00
Samuel Antao	4d5f0bbea1	[OpenMP] Offloading descriptor registration and device codegen. Summary: For offloading to work properly, two things need to be in place: - a descriptor with all the offloading information (device entry functions and global variables) has to be created by the host and registered in the OpenMP offloading runtime library. - all the device functions need to be emitted for the device, and a convention has to be in place so that the runtime library can easily map the host ID of an entry point to the actual function in the device. This patch adds support for these two things. However, only entry functions are being registered, given that the 'declare target' directive is not yet implemented. About offloading descriptor: The details of the descriptor are explained in more detail in http://goo.gl/L1rnKJ. Basically, the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to the beginning and end of a table whose entries contain information about a specific entry point. Each entry has the type: ``` struct __tgt_offload_entry{ void *addr; char *name; int64_t size; }; ``` and will be implemented in a predetermined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in the same index of the table, and efficiently implement the mapping. The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`.
The registration is implemented in a high priority global initializer so that the registration always happens before any initializer (that can potentially include target regions) is run. The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple. About target codegen: The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in the host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information is checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata.
The metadata looks like this: ``` !omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6} !0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4} !1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5} !2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6} !3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0} !4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3} !5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1} !6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2} ``` The fields in each metadata entry are (in sequence): Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region". Entry 2) a unique ID of the device where the input source file that contains the target region lives. Entry 3) a unique ID of the file where the input source file that contains the target region lives. Entry 4) a mangled name of the function that encloses the target region. Entries 5) and 6) line and column number where the target region was found. Entry 7) is the order the entry was emitted. Entry 2) and 3) are required to distinguish files that have the same function name. Entry 4) is required to distinguish different instances of the same declaration (usually templated ones). Entries 5) and 6) are required to distinguish the particular target region in the body of the function (it is possible that a given target region is not an entry point - an if clause can always evaluate to zero - and therefore we need to identify the "interesting" target regions). This patch replaces http://reviews.llvm.org/D12306. Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits Differential Revision: http://reviews.llvm.org/D12614 llvm-svn: 256842	2016-01-05 16:23:04 +00:00
Alexey Bataev	a636c7f9b9	[OPENMP 4.5] Parsing/sema for 'depend(sink:vec)' clause in 'ordered' directive. OpenMP 4.5 adds 'depend(sink:vec)' in 'ordered' directive for doacross loop synchronization. Patch adds parsing and semantic analysis for this clause. llvm-svn: 256330	2015-12-23 10:27:45 +00:00
Alexey Bataev	29c9209357	[OPENMP] Revert r256238 to fix the problem with tests on Linux. llvm-svn: 256239	2015-12-22 12:44:46 +00:00
Alexey Bataev	ef4c5584d5	[OPENMP 4.5] Parsing/sema for 'depend(sink:vec)' clause in 'ordered' directive. OpenMP 4.5 adds 'depend(sink:vec)' in 'ordered' directive for doacross loop synchronization. Patch adds parsing and semantic analysis for this clause. llvm-svn: 256238	2015-12-22 12:21:47 +00:00
Alexey Bataev	8ef3141127	[OPENMP] Fix for http://llvm.org/PR25878 : Error compiling an OpenMP program OpenMP codegen tried to emit the code for its constructs even if it was detected as dead code. Added checks to ensure that the code is emitted only if it is not dead. llvm-svn: 255990	2015-12-18 07:58:25 +00:00
Alexey Bataev	eb48235033	[OPENMP 4.5] Parsing/sema analysis for 'depend(source)' clause in 'ordered' directive. OpenMP 4.5 adds 'depend(source)' clause for 'ordered' directive to support cross-iteration dependence. Patch adds parsing and semantic analysis for this construct. llvm-svn: 255986	2015-12-18 05:05:56 +00:00
Alexey Bataev	fc57d1601d	[OPENMP 4.5] Codegen for 'hint' clause of 'critical' directive OpenMP 4.5 defines 'hint' clause for 'critical' directive. Patch adds codegen for this clause. llvm-svn: 255639	2015-12-15 10:55:09 +00:00
Samuel Antao	4af1b7b693	[OpenMP] Update target directive codegen to use 4.5 implicit data mappings. Summary: This patch implements the 4.5 specification for the implicit data maps. OpenMP 4.5 specification changes the default way data is captured into a target region. All the non-aggregate kinds are passed by value by default. This required activating the capturing by value during SEMA for the target region. All the non-aggregate values that can be encoded in the size of a pointer are properly cast and forwarded to the runtime library. On top of fixing the previous weird behavior for mapping pointers in nested data regions (an explicit map was always required), this also improves performance as the number of allocations/transactions to the device per non-aggregate map is reduced from two to only one - instead of passing a reference and the value, only the value is passed. Explicit maps will be added later on once firstprivate, private, and map clauses' SEMA and parsing are available. Reviewers: hfinkel, rjmccall, ABataev Subscribers: cfe-commits, carlo.bertolli Differential Revision: http://reviews.llvm.org/D14940 llvm-svn: 254521	2015-12-02 17:44:43 +00:00
Alexey Bataev	40e36f1f64	[OPENMP] Fix crash on codegen for 'task' directive with no shared variables. If 'task' region does not have shared variables codegen could crash on calculation of size of list of shared variables. llvm-svn: 253977	2015-11-24 13:01:44 +00:00
Alexey Bataev	92e82f9cce	[OPENMP] 'out' dependency for 'task' directives must be the same as 'inout'. The runtime library requires that codegen for a 'depend' clause with the 'out' dependency kind be the same as codegen for a 'depend' clause with the 'inout' dependency kind. llvm-svn: 253866	2015-11-23 13:33:42 +00:00
Akira Hatanaka	7791f1a4a9	[CodeGen] Call SetInternalFunctionAttributes to attach function attributes to internal functions. This patch fixes CodeGenModule::CreateGlobalInitOrDestructFunction to use SetInternalFunctionAttributes instead of SetLLVMFunctionAttributes to attach function attributes to internal functions. Also, make sure the correct CGFunctionInfo is passed instead of always passing what arrangeNullaryFunction returns. rdar://problem/20828324 Differential Revision: http://reviews.llvm.org/D13610 llvm-svn: 251734	2015-10-31 01:28:07 +00:00
Benjamin Kramer	e003ca2a03	Put global classes into the appropriate namespace. Most of the cases belong into an anonymous namespace. No functionality change intended. llvm-svn: 251514	2015-10-28 13:54:16 +00:00
Akira Hatanaka	44a59f8976	[CodeGen] Attach function attributes to Objective-C and OpenMP functions. This commit fixes a bug in CGOpenMPRuntime.cpp and CGObjC.cpp where some of the function attributes are not attached to newly created functions. rdar://problem/20828324 Differential Revision: http://reviews.llvm.org/D13928 llvm-svn: 251476	2015-10-28 02:30:47 +00:00
Alexey Bataev	f24e7b1f60	[OPENMP 4.1] Codegen for array sections/subscripts in 'reduction' clause. OpenMP 4.1 adds support for array sections/subscripts in 'reduction' clause. Patch adds codegen for this feature. llvm-svn: 249672	2015-10-08 09:10:53 +00:00
Samuel Antao	bed3c46632	[OpenMP] Target directive host codegen. This patch implements the outlining for offloading functions for code annotated with the OpenMP target directive. It uses a temporary naming of the outlined functions that will have to be updated later on once target side codegen and registration of offloading libraries is implemented - the naming needs to be made unique in the produced library. llvm-svn: 249148	2015-10-02 16:14:20 +00:00
Craig Topper	8674c5cf70	Remove 'const' from some ArrayRef arguments since they're passed by value anyway. NFC llvm-svn: 248774	2015-09-29 04:30:07 +00:00
Alexey Bataev	5f600d6a49	[OPENMP 4.1] Codegen for ‘simd’ clause in ‘ordered’ directive. Description. If the simd clause is specified, the ordered regions encountered by any thread will use only a single SIMD lane to execute the ordered regions in the order of the loop iterations. Restrictions. An ordered construct with the simd clause is the only OpenMP construct that can appear in the simd region. An ordered directive with ‘simd’ clause is generated as an outlined function and corresponding function call to prevent this part of code from vectorization later in backend. llvm-svn: 248772	2015-09-29 03:48:57 +00:00
Alexey Bataev	87933c7ced	[OPENMP 4.0] Add 'if' clause for 'cancel' directive. Add parsing, sema analysis and codegen for 'if' clause in 'cancel' directive. llvm-svn: 247976	2015-09-18 08:07:34 +00:00
Alexey Bataev	25e5b44654	[OPENMP] Emit __kmpc_cancel_barrier() and code for 'cancellation point' only if 'cancel' is found. Patch improves codegen for OpenMP constructs. If the OpenMP region does not have an internal 'cancel' construct, a call to the 'void __kmpc_barrier()' runtime function is generated for all implicit/explicit barriers. If the region has an inner 'cancel' directive, then ``` if (__kmpc_cancel_barrier()) exit from outer construct; ``` code is generated. Also, the code for the 'cancellation point' directive is not generated if the parent directive does not have a 'cancel' directive. llvm-svn: 247681	2015-09-15 12:52:43 +00:00
Evgeniy Stepanov	6b2a61d3a5	Revert "Always_inline codegen rewrite" and 2 follow-ups. Revert "Update cxx-irgen.cpp test to allow signext in alwaysinline functions." Revert "[CodeGen] Remove wrapper-free always_inline functions from COMDATs" Revert "Always_inline codegen rewrite." Reason for revert: PR24793. llvm-svn: 247620	2015-09-14 21:35:16 +00:00
Evgeniy Stepanov	93db40a147	Always_inline codegen rewrite. Current implementation may end up emitting an undefined reference for an "inline __attribute__((always_inline))" function by generating an "available_externally alwaysinline" IR function for it and then failing to inline all the calls. This happens when a call to such a function is in dead code. As the inliner is an SCC pass, it does not process dead code. Libc++ relies on the compiler never emitting such an undefined reference. With this patch, we emit a pair of 1. internal alwaysinline definition (called F.alwaysinline) 2a. A stub F() { musttail call F.alwaysinline } -- or, depending on the linkage -- 2b. A declaration of F. The frontend ensures that F.alwaysinline is only used for direct calls, and the stub is used for everything else (taking the address of the function, really). Declaration (2b) is emitted in the case when "inline" is meant for inlining only (like __gnu_inline__ and some other cases). This approach, among other nice properties, ensures that alwaysinline functions are always internal, making it impossible for a direct call to such a function to produce an undefined symbol reference. This patch is based on ideas by Chandler Carruth and Richard Smith. llvm-svn: 247494	2015-09-12 01:07:37 +00:00
Evgeniy Stepanov	67037ee21e	Revert "Specify target triple in alwaysinline tests." Revert "Always_inline codegen rewrite." Breaks gdb & lldb tests. Breaks on Fedora 22 x86_64. llvm-svn: 247491	2015-09-11 23:48:37 +00:00
Evgeniy Stepanov	072e83500e	Always_inline codegen rewrite. Current implementation may end up emitting an undefined reference for an "inline __attribute__((always_inline))" function by generating an "available_externally alwaysinline" IR function for it and then failing to inline all the calls. This happens when a call to such a function is in dead code. As the inliner is an SCC pass, it does not process dead code. Libc++ relies on the compiler never emitting such an undefined reference. With this patch, we emit a pair of 1. internal alwaysinline definition (called F.alwaysinline) 2a. A stub F() { musttail call F.alwaysinline } -- or, depending on the linkage -- 2b. A declaration of F. The frontend ensures that F.alwaysinline is only used for direct calls, and the stub is used for everything else (taking the address of the function, really). Declaration (2b) is emitted in the case when "inline" is meant for inlining only (like __gnu_inline__ and some other cases). This approach, among other nice properties, ensures that alwaysinline functions are always internal, making it impossible for a direct call to such a function to produce an undefined symbol reference. This patch is based on ideas by Chandler Carruth and Richard Smith. llvm-svn: 247465	2015-09-11 20:29:07 +00:00
Alexey Bataev	c71a4099cf	[OPENMP] Preserve alignment of the original variables for the captured references. The patch makes codegen preserve the alignment of the shared variables captured and used in OpenMP regions. llvm-svn: 247401	2015-09-11 10:29:41 +00:00
Hans Wennborg	7eb5464bc5	Re-commit r247218: "Fix Clang-tidy misc-use-override warnings, other minor fixes" This never broke the build; it was the LLVM side, r247216, that caused problems. llvm-svn: 247302	2015-09-10 17:07:54 +00:00
Alexey Bataev	2377fe95c6	[OPENMP] Outlined function for parallel and other regions with list of captured variables. Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. This may result in poor performance because the analysis later in the optimization passes becomes too complex. The patch emits outlined functions for parallel-based regions with a list of captured variables. This reduces the code by at least 2*n GEPs, stores and loads. Codegen for task-based regions remains unchanged because the runtime requires that all captured variables are passed in a captured record. llvm-svn: 247251	2015-09-10 08:12:02 +00:00
Hans Wennborg	e89c8c8033	Revert r247218: "Fix Clang-tidy misc-use-override warnings, other minor fixes" Seems it broke the Polly build. From http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/11687/steps/compile/logs/stdio: In file included from /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/lib/TableGen/Record.cpp:14:0: /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:369:3: error: looser throw specifier for 'virtual llvm::TypedInit::~TypedInit()' /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:270:11: error: overriding 'virtual llvm::Init::~Init() noexcept (true)' llvm-svn: 247222	2015-09-10 00:37:18 +00:00
Hans Wennborg	60f3e1f466	Fix Clang-tidy misc-use-override warnings, other minor fixes Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12741 llvm-svn: 247218	2015-09-10 00:24:40 +00:00
John McCall	7f416cc426	Compute and preserve alignment more faithfully in IR-generation. Introduce an Address type to bundle a pointer value with an alignment. Introduce APIs on CGBuilderTy to work with Address values. Change core APIs on CGF/CGM to traffic in Address where appropriate. Require alignments to be non-zero. Update a ton of code to compute and propagate alignment information. As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment helper function to CGF and made use of it in a number of places in the expression emitter. The end result is that we should now be significantly more correct when performing operations on objects that are locally known to be under-aligned. Since alignment is not reliably tracked in the type system, there are inherent limits to this, but at least we are no longer confused by standard operations like derived-to-base conversions and array-to-pointer decay. I've also fixed a large number of bugs where we were applying the complete-object alignment to a pointer instead of the non-virtual alignment, although most of these were hidden by the very conservative approach we took with member alignment. Also, because IRGen now reliably asserts on zero alignments, we should no longer be subject to an absurd but frustrating recurring bug where an incomplete type would report a zero alignment and then we'd naively do an alignmentAtOffset on it and emit code using an alignment equal to the largest power-of-two factor of the offset. We should also now be emitting much more aggressive alignment attributes in the presence of over-alignment. In particular, field access now uses alignmentAtOffset instead of min. Several times in this patch, I had to change the existing code-generation pattern in order to more effectively use the Address APIs. For the most part, this seems to be a strict improvement, like doing pointer arithmetic with GEPs instead of ptrtoint.
That said, I've tried very hard to not change semantics, but it is likely that I've failed in a few places, for which I apologize. ABIArgInfo now always carries the assumed alignment of indirect and indirect byval arguments. In order to cut down on what was already a dauntingly large patch, I changed the code to never set align attributes in the IR on non-byval indirect arguments. That is, we still generate code which assumes that indirect arguments have the given alignment, but we don't express this information to the backend except where it's semantically required (i.e. on byvals). This is likely a minor regression for those targets that did provide this information, but it'll be trivial to add it back in a later patch. I partially punted on applying this work to CGBuiltin. Please do not add more uses of the CreateDefaultAligned{Load,Store} APIs; they will be going away eventually. llvm-svn: 246985	2015-09-08 08:05:57 +00:00
Benjamin Kramer	9b81903607	[OpenMP] Make helper function static. NFC. llvm-svn: 246657	2015-09-02 15:31:05 +00:00
Alexey Bataev	d6fdc8b685	[OPENMP 4.0] Codegen for array sections. Added codegen for array section in 'depend' clause of 'task' directive. It emits two pointers, one for the beginning of the array section and another for the end of the array section. Size of the section is calculated as (end + 1 - start) * sizeof(basic_element_type). llvm-svn: 246422	2015-08-31 07:32:19 +00:00
Daniel Jasper	ad5b7962c9	Revert "[OPENMP 4.0] Codegen for array sections." The test is currently failing on bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/12747/ llvm-svn: 246288	2015-08-28 08:42:22 +00:00
Alexey Bataev	117fb35cf7	[OPENMP 4.0] Codegen for array sections. Added codegen for array section in 'depend' clause of 'task' directive. It emits two pointers, one for the beginning of the array section and another for the end of the array section. Size of the section is calculated as (end + 1 - start) * sizeof(basic_element_type). llvm-svn: 246278	2015-08-28 06:09:05 +00:00
David Blaikie	7e70d6803d	Devirtualize EHScopeStack::Cleanup's dtor because it's never destroyed polymorphically llvm-svn: 245378	2015-08-18 22:40:54 +00:00
Filipe Cabecinhas	7af183d841	Propagate SourceLocations through to get a Loc on float_cast_overflow Summary: float_cast_overflow is the only UBSan check without a source location attached. This patch propagates SourceLocations where necessary to get them to the EmitCheck() call. Reviewers: rsmith, ABataev, rjmccall Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D11757 llvm-svn: 244568	2015-08-11 04:19:28 +00:00
Samuel Antao	f8b5012dfb	[OpenMP] Add TLS-based implementation for threadprivate directive. llvm-svn: 242080	2015-07-13 22:54:53 +00:00
Alexey Bataev	7d5d33ea33	[OPENMP 4.0] Codegen for 'omp cancel' directive. Add the next codegen for 'omp cancel' directive: if (__kmpc_cancel()) { __kmpc_cancel_barrier(); <exit construct>; } llvm-svn: 241429	2015-07-06 05:50:32 +00:00
Alexey Bataev	81c7ea0ec3	[OPENMP 4.0] Fixed codegen for 'cancellation point' construct. Generate the next code for 'cancellation point': if (__kmpc_cancellationpoint()) { __kmpc_cancel_barrier(); <exit construct>; } llvm-svn: 241336	2015-07-03 09:56:58 +00:00
Alexey Bataev	0f34da12e4	[OPENMP 4.0] Codegen for 'cancellation point' directive. The next code is generated for this construct: ``` if (__kmpc_cancellationpoint(ident_t *loc, kmp_int32 global_tid, kmp_int32 cncl_kind) != 0) <exit from outer innermost construct>; ``` llvm-svn: 241239	2015-07-02 04:17:07 +00:00
Alexey Bataev	1d2353d4f3	[OPENMP] Codegen for 'depend' clause (OpenMP 4.0). If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t *new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task(). If this directive has associated 'if' clause then also before a call of __kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called. Array sections are not supported yet. llvm-svn: 240532	2015-06-24 11:01:36 +00:00
Alexey Bataev	d157d47062	Proper changing/restoring for CapturedStmtInfo, NFC. Added special RAII class for proper values changing/restoring in CodeGenFunction::CapturedStmtInfo. llvm-svn: 240517	2015-06-24 03:35:38 +00:00
Alexey Bataev	7f210c6dab	[OPENMP] Codegen for 'proc_bind' clause (4.0). Adds emission of the code for 'proc_bind(master|close|spread)' clause: call void @__kmpc_push_proc_bind(<loc>, i32 thread_id, i32 4|3|2) llvm-svn: 240018	2015-06-18 13:40:03 +00:00
Alexey Bataev	c30dd2daf9	[OPENMP] Support for '#pragma omp taskgroup' directive. Added parsing, sema analysis and codegen for '#pragma omp taskgroup' directive (OpenMP 4.0). The code for directive is generated the following way: #pragma omp taskgroup <body> void __kmpc_taskgroup(<loc>, thread_id); <body> void __kmpc_end_taskgroup(<loc>, thread_id); llvm-svn: 240011	2015-06-18 12:14:09 +00:00
Alexey Bataev	89e7e8eb0e	[OPENMP] Supported reduction clause in omp simd construct. The following code is generated for reduction clause within 'omp simd' loop construct: #pragma omp simd reduction(op:var) for (...) <body> alloca priv_var priv_var = <initial reduction value>; <loop_start>: <body> // references to original 'var' are replaced by 'priv_var' <loop_end>: var op= priv_var; llvm-svn: 239881	2015-06-17 06:21:39 +00:00
Alexey Bataev	3ae88e2124	[OPENMP] Prepare codegen for privates in tasks for non-capturing of privates in CapturedStmt. Reworked codegen for privates in tasks: call @kmpc_omp_task_alloc(); ... call @kmpc_omp_task(task_proxy); void map_privates(.privates_rec. privs, type1 * priv1_ref, ..., typen *privn_ref) { priv1_ref = &privs->private1; ... privn_ref = &privs->privaten; ret void } i32 task_entry(i32 ThreadId, i32 PartId, void privs, void (void, ...) map_privates, shareds captures) { type1 priv1; ... typen privn; call map_privates(privs, priv1, ..., privn); <Task body with priv1, .., privn instead of the captured variables>. ret i32 } i32 task_proxy(i32 ThreadId, kmp_task_t_with_privates *tt) { call task_entry(ThreadId, tt->task_data.PartId, &tt->privates, map_privates, tt->task_data.shareds); } llvm-svn: 238010	2015-05-22 08:56:35 +00:00
Alexey Bataev	5129d3a4f5	[OPENMP] Fixed codegen for parameters privatization. For parameters we shall take a derived type of parameters, not the original one. llvm-svn: 237882	2015-05-21 09:47:46 +00:00
Alexey Bataev	d7589ffe1d	[OPENMP] Fix codegen for ordered loop directives. Loops with an ordered clause must be generated the same way as dynamic loops, but with static scheduling. llvm-svn: 237788	2015-05-20 13:12:48 +00:00
Alexey Bataev	1d9c15cf18	[OPENMP] Fixed codegen for copying/initialization of array variables/parameters. This modification generates proper copyin/initialization sequences for array variables/parameters. Before they were considered as pointers, not arrays. llvm-svn: 237691	2015-05-19 12:31:28 +00:00
Alexey Bataev	8fc69dcf42	[OPENMP] Fix for '#pragma omp task' codegen. Internal task structure must be generated like typedef struct kmp_task { void * shareds; kmp_routine_entry_t routine; kmp_int32 part_id; kmp_routine_entry_t destructors; } kmp_task_t; struct kmp_task_t_with_privates { kmp_task_t task_data; .kmp_private. privates; }; to avoid possible additional alignment bytes in first fields (shareds, routine, part_id and destructors). Runtime library is not aware of such kind additional alignment bytes. llvm-svn: 237561	2015-05-18 07:54:53 +00:00
Alexey Bataev	69a4779965	[OPENMP] Fixed codegen for 'reduction' clause. Fixed codegen for reduction operations min, max, && and ||. Codegen for them is quite similar and I was confused by this similarity. Also added a call to kmpc_end_reduce() in atomic part of reduction codegen (call to kmpc_end_reduce_nowait() is not required). Differential Revision: http://reviews.llvm.org/D9513 llvm-svn: 236689	2015-05-07 03:54:03 +00:00
Alexey Bataev	a744ff58f3	[OPENMP] Fixed incorrect work with cleanups, NFC. Destructors are never called for cleanups, so we can't use SmallVector as a member. Differential Revision: http://reviews.llvm.org/D9399 llvm-svn: 236491	2015-05-05 09:24:37 +00:00

519 Commits