Commit Graph

209 Commits

Author SHA1 Message Date
Alexey Bataev 26b2577f6b [OPENMP] Codegen for untied tasks.
If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks.

llvm-svn: 266722
2016-04-19 09:10:27 +00:00
JF Bastien 92f4ef1017 NFC: make AtomicOrdering an enum class
Summary: See LLVM change D18775 for details, this change depends on it.

Reviewers: jyknight, reames

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18776

llvm-svn: 265569
2016-04-06 17:26:42 +00:00
Carlo Bertolli c687225b43 [OPENMP] Codegen for teams directive for NVPTX
This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it:

Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region.
Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime.
Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region.

http://reviews.llvm.org/D17963

llvm-svn: 265304
2016-04-04 15:55:02 +00:00
Alexey Bataev 14fa1c6b60 [OPENMP] Allow runtime insert its own code inside OpenMP regions.
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
 (required) code to support target specific codegen.

llvm-svn: 264700
2016-03-29 05:34:15 +00:00
Alexey Bataev f539faa733 Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions."
Reverting because of failed tests.

llvm-svn: 264577
2016-03-28 12:58:34 +00:00
Alexey Bataev 424be92831 [OPENMP] Allow runtime insert its own code inside OpenMP regions.
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
 (required) code to support target specific codegen.

llvm-svn: 264576
2016-03-28 12:52:58 +00:00
Alexey Bataev f662b5943c Revert "[OPENMP] Allow runtime insert its own code inside OpenMP regions."
This reverts commit 3ee791165100607178073f14531a0dc90c622b36.

llvm-svn: 264570
2016-03-28 10:12:03 +00:00
Alexey Bataev b8c425c4f7 [OPENMP] Allow runtime insert its own code inside OpenMP regions.
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
  (required) code to support target specific codegen.

llvm-svn: 264569
2016-03-28 09:53:43 +00:00
Arpith Chacko Jacob 5c309e475d [OpenMP] Base support for target directive codegen on NVPTX device.
Summary:
This patch adds base support for codegen of the target directive on the NVPTX device.

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D17877

Reworked test case after buildbot failure on windows.
Updated patch to integrate r263837 and test case nvptx_target_firstprivate_codegen.cpp.

llvm-svn: 264018
2016-03-22 01:48:56 +00:00
Carlo Bertolli b74bfc80a4 [OPENMP] Implementation of codegen for firstprivate clause of target directive
This patch implements the following aspects:

It extends sema to check that a variable is not reference in both a map clause and firstprivate or private. This is needed to ensure correct functioning at codegen level, apart from being useful for the user.
It implements firstprivate for target in codegen. The implementation applies to both host and nvptx devices.
It adds regression tests for codegen of firstprivate, host and device side when using the host as device, and nvptx side.
Please note that the regression test for nvptx codegen is missing VLAs. This is because VLAs currently require saving and restoring the stack which appears not to be a supported operation by nvptx backend.

It adds a check in sema regression tests for target map, firstprivate, and private clauses.

http://reviews.llvm.org/D18203

llvm-svn: 263837
2016-03-18 21:43:32 +00:00
Arpith Chacko Jacob 129fa9a048 Revert r263783 as buildbot failure is being investigated.
llvm-svn: 263784
2016-03-18 12:39:40 +00:00
Arpith Chacko Jacob ac563708ab [OpenMP] Base support for target directive codegen on NVPTX device.
Summary:
Reworked test case after buildbot failure on windows.

This patch adds base support for codegen of the target directive on the NVPTX device.

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D17877

llvm-svn: 263783
2016-03-18 11:47:43 +00:00
Alexey Bataev a839dddf92 [OPENMP 4.0] Use 'declare reduction' constructs in 'reduction' clauses.
OpenMP 4.0 allows to define custom reduction operations using '#pragma
omp declare reduction' construct. Patch allows to use this custom
defined reduction operations in 'reduction' clauses.

llvm-svn: 263701
2016-03-17 10:19:46 +00:00
Carlo Bertolli a03acfa359 [OPENMP] Support for codegen of private clause of target, host side
This patch adds support for codegen of private clause of target and a regression test for host code generation, when the host is used as target device. I believe that code generation for nvptx backend would not require anything additional or different to what is done for the host.

http://reviews.llvm.org/D18105

llvm-svn: 263654
2016-03-16 19:04:22 +00:00
Arpith Chacko Jacob 9cb61faa61 Revert commit http://reviews.llvm.org/D17877 to fix tests on x86.
llvm-svn: 263589
2016-03-15 21:26:34 +00:00
Arpith Chacko Jacob 5e1493b560 [OpenMP] Base support for target directive codegen on NVPTX device.
Summary:
This patch adds base support for codegen of the target directive on the NVPTX device.

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D17877

llvm-svn: 263587
2016-03-15 21:04:57 +00:00
Arpith Chacko Jacob fc46c25d74 Reverted http://reviews.llvm.org/D17877 to fix tests.
llvm-svn: 263555
2016-03-15 16:19:13 +00:00
Arpith Chacko Jacob c61744c26b [OpenMP] Base support for target directive codegen on NVPTX device.
Summary:
This patch adds base support for codegen of the target directive on the NVPTX device.

Reviewers: ABataev

Differential Revision: http://reviews.llvm.org/D17877

llvm-svn: 263552
2016-03-15 15:24:52 +00:00
John McCall c56a8b3284 Preserve ExtParameterInfos into CGFunctionInfo.
As part of this, make the function-arrangement interfaces
a little simpler and more semantic.

NFC.

llvm-svn: 263191
2016-03-11 04:30:31 +00:00
Carlo Bertolli fc35ad2bbc Reapply r262741 [OPENMP] Codegen for distribute directive
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.

http://reviews.llvm.org/D17170

llvm-svn: 262832
2016-03-07 16:04:49 +00:00
Samuel Antao bf4d18d3d2 Revert r262741 - [OPENMP] Codegen for distribute directive
Was causing a failure in one of the buildbot slaves.

llvm-svn: 262744
2016-03-04 21:02:14 +00:00
Carlo Bertolli 4a56e3831d [OPENMP] Codegen for distribute directive
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.

http://reviews.llvm.org/D17170

llvm-svn: 262741
2016-03-04 20:24:58 +00:00
Alexey Bataev c5b1d320b8 [OPENMP 4.0] Codegen for 'declare reduction' construct.
Emit function for 'combiner' part of 'declare reduction' construct and
'initialilzer' part, if any.

llvm-svn: 262699
2016-03-04 09:22:22 +00:00
Carlo Bertolli 430d8ecc55 Add code generation for teams directive inside target region
llvm-svn: 262652
2016-03-03 20:34:23 +00:00
Samuel Antao b68e2db8f9 [OpenMP] Code generation for teams - kernel launching
Summary:
This patch implements the launching of a target region in the presence of a nested teams region, i.e calls tgt_target_teams with the required arguments gathered from the enclosed teams directive.

The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch.

Reviewers: hfinkel, arpith-jacob, kkwli0, carlo.bertolli, ABataev

Subscribers: cfe-commits, caomhin, fraggamuffin

Differential Revision: http://reviews.llvm.org/D17019

llvm-svn: 262625
2016-03-03 16:20:23 +00:00
Alexey Bataev 50b3c95992 [OPENMP] Improved layout of CGOpenMPRuntime class, NFC.
llvm-svn: 261315
2016-02-19 10:38:26 +00:00
Richard Trieu cc3949d99a Remove use of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261271
2016-02-18 22:34:54 +00:00
Samuel Antao 2de62b0c89 [OpenMP] Rename the offload entry points.
Summary:
Unlike other outlined regions in OpenMP, offloading entry points have to have be visible (external linkage) for the device side. Using dots in the names of the entries can be therefore problematic for some toolchains, e.g. NVPTX.

Also the patch drops the column information in the unique name of the entry points. The parsing of directives ignore unknown tokens, preventing several target  regions to be implemented in the same line. Therefore, the line information is sufficient for the name to be unique. Also, the preprocessor printer does not preserve the column information, causing offloading-entry detection issues if the host uses an integrated preprocessor and the target doesn't (or vice versa).

Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev

Subscribers: cfe-commits, fraggamuffin, caomhin

Differential Revision: http://reviews.llvm.org/D17179

llvm-svn: 260837
2016-02-13 23:35:10 +00:00
Eugene Zelenko 0a4f3f4373 Fix some Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D17060

llvm-svn: 260414
2016-02-10 19:11:58 +00:00
Alexey Bataev 31300ed0a5 [OPENMP 4.0] Fixed support of array sections/array subscripts.
Codegen for array sections/array subscripts worked only for expressions with arrays as base. Patch fixes codegen for bases with pointer/reference types.

llvm-svn: 259776
2016-02-04 11:27:03 +00:00
Benjamin Kramer 8c30592e18 Move DebugInfoKind into its own header to cut the cyclic dependency edge from Driver to Frontend.
llvm-svn: 259489
2016-02-02 11:06:51 +00:00
Eugene Zelenko 1660a5d298 Fix Clang-tidy modernize-use-nullptr warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16567

llvm-svn: 258836
2016-01-26 19:01:06 +00:00
Alexey Bataev 1189bd0205 [OPENMP 4.5] Allow arrays in 'reduction' clause.
OpenMP 4.5, alogn with array sections, allows to use variables of array type in reductions.

llvm-svn: 258804
2016-01-26 12:20:39 +00:00
Alexey Bataev 3015bcc62a [OPENMP] Generalize codegen for 'sections'-based directive.
If 'sections' directive has only one sub-section, the code for 'single'-based directive was emitted. Removed this codegen, because it causes crashes in different cases.

llvm-svn: 258495
2016-01-22 08:56:50 +00:00
Alexey Bataev 8524d15954 [OPENMP] Fix crash on reduction for complex variables.
reworked codegen for reduction operation for complex types to avoid crash

llvm-svn: 258394
2016-01-21 12:35:58 +00:00
Alexey Bataev 9619f04c0e [OPENMP 4.0] Fix for codegen of 'cancel' directive within 'sections' directive.
Allow to emit code for 'cancel' directive within 'sections' directive with single sub-section.

llvm-svn: 258307
2016-01-20 12:29:47 +00:00
Hans Wennborg 45c7439d11 Don't store CGOpenMPRegionInfo::CodeGen as a reference (PR26078)
The referenced llvm::function_ref<void(CodeGenFunction &)> object can go
away before CodeGen is used, resulting in a crash.

llvm-svn: 257516
2016-01-12 20:54:36 +00:00
Samuel Antao ee8fb302f5 [OpenMP] Reapply rL256842: [OpenMP] Offloading descriptor registration and device codegen.
This patch attempts to fix the regressions identified when the patch was committed initially. 

Thanks to Michael Liao for identifying the fix in the offloading metadata generation 
related with side effects in evaluation of function arguments. 
 

llvm-svn: 256933
2016-01-06 13:42:12 +00:00
Samuel Antao 7d5de9a1ee [OpenMP] Revert rL256842: [OpenMP] Offloading descriptor registration and device codegen.
It was causing two regression, so I'm reverting until the cause is found.

llvm-svn: 256858
2016-01-05 19:16:13 +00:00
Samuel Antao 4d5f0bbea1 [OpenMP] Offloading descriptor registration and device codegen.
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.

This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.

About offloading descriptor:

The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
 void *addr;
 char *name;
 int64_t size;
};
```  
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.

The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.

The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.


About target codegen:

The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}

!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives. 
Entry 3) a unique ID of the file where the input source file that contain the target region lives. 
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.

Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )

This patch replaces http://reviews.llvm.org/D12306.

Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao

Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits

Differential Revision: http://reviews.llvm.org/D12614

llvm-svn: 256842
2016-01-05 16:23:04 +00:00
Alexey Bataev a636c7f9b9 [OPENMP 4.5] Parsing/sema for 'depend(sink:vec)' clause in 'ordered' directive.
OpenMP 4.5 adds 'depend(sink:vec)' in 'ordered' directive for doacross loop synchronization. Patch adds parsing and semantic analysis for this clause.

llvm-svn: 256330
2015-12-23 10:27:45 +00:00
Alexey Bataev 29c9209357 [OPENMP] Revert r256238 to fix the problem with tests on Linux.
llvm-svn: 256239
2015-12-22 12:44:46 +00:00
Alexey Bataev ef4c5584d5 [OPENMP 4.5] Parsing/sema for 'depend(sink:vec)' clause in 'ordered' directive.
OpenMP 4.5 adds 'depend(sink:vec)' in 'ordered' directive for doacross loop synchronization. Patch adds parsing and semantic analysis for this clause.

llvm-svn: 256238
2015-12-22 12:21:47 +00:00
Alexey Bataev 8ef3141127 [OPENMP] Fix for http://llvm.org/PR25878: Error compiling an OpenMP program
OpenMP codegen tried to emit the code for its constructs even if it was detected as a dead-code. Added checks to ensure that the code is emitted if the code is not dead.

llvm-svn: 255990
2015-12-18 07:58:25 +00:00
Alexey Bataev eb48235033 [OPENMP 4.5] Parsing/sema analysis for 'depend(source)' clause in 'ordered' directive.
OpenMP 4.5 adds 'depend(source)' clause for 'ordered' directive to support cross-iteration dependence. Patch adds parsing and semantic analysis for this construct.

llvm-svn: 255986
2015-12-18 05:05:56 +00:00
Alexey Bataev fc57d1601d [OPENMP 4.5] Codegen for 'hint' clause of 'critical' directive
OpenMP 4.5 defines 'hint' clause for 'critical' directive. Patch adds codegen for this clause.

llvm-svn: 255639
2015-12-15 10:55:09 +00:00
Samuel Antao 4af1b7b693 [OpenMP] Update target directive codegen to use 4.5 implicit data mappings.
Summary:
This patch implements the 4.5 specification for the implicit data maps. OpenMP 4.5 specification changes the default way data is captured into a target region. All the non-aggregate kinds are passed by value by default. This required activating the capturing by value during SEMA for the target region. All the non-aggregate values that can be encoded in the size of a pointer are properly casted and forwarded to the runtime library. On top of fixing the previous weird behavior for mapping pointers in nested data regions (an explicit map was always required), this also improves performance as the number of allocations/transactions to the device per non-aggregate map are reduced from two to only one - instead of passing a reference and the value, only the value passed.

Explicit maps will be added later on once firstprivate, private, and map clauses' SEMA and parsing are available.

Reviewers: hfinkel, rjmccall, ABataev

Subscribers: cfe-commits, carlo.bertolli

Differential Revision: http://reviews.llvm.org/D14940

llvm-svn: 254521
2015-12-02 17:44:43 +00:00
Alexey Bataev 40e36f1f64 [OPENMP] Fix crash on codegen for 'task' directive with no shared variables.
If 'task' region does not have shared variables codegen could crash on calculation of size of list of shared variables.

llvm-svn: 253977
2015-11-24 13:01:44 +00:00
Alexey Bataev 92e82f9cce [OPENMP] 'out' dependency for 'task' directives must be the same as 'inout'.
Runtime library requires, that codegen for 'depend' clause for 'out' dependency kind must be the same as codegen for 'depend' clause with 'inout' dependency.

llvm-svn: 253866
2015-11-23 13:33:42 +00:00
Akira Hatanaka 7791f1a4a9 [CodeGen] Call SetInternalFunctionAttributes to attach function
attributes to internal functions.

This patch fixes CodeGenModule::CreateGlobalInitOrDestructFunction to
use SetInternalFunctionAttributes instead of SetLLVMFunctionAttributes
to attach function attributes to internal functions.

Also, make sure the correct CGFunctionInfo is passed instead of always
passing what arrangeNullaryFunction returns.

rdar://problem/20828324

Differential Revision: http://reviews.llvm.org/D13610

llvm-svn: 251734
2015-10-31 01:28:07 +00:00
Benjamin Kramer e003ca2a03 Put global classes into the appropriate namespace.
Most of the cases belong into an anonymous namespace. No functionality
change intended.

llvm-svn: 251514
2015-10-28 13:54:16 +00:00
Akira Hatanaka 44a59f8976 [CodeGen] Attach function attributes to Objective-C and OpenMP
functions.

This commit fixes a bug in CGOpenMPRuntime.cpp and CGObjC.cpp where
some of the function attributes are not attached to newly created
functions.

rdar://problem/20828324

Differential Revision: http://reviews.llvm.org/D13928

llvm-svn: 251476
2015-10-28 02:30:47 +00:00
Alexey Bataev f24e7b1f60 [OPENMP 4.1] Codegen for array sections/subscripts in 'reduction' clause.
OpenMP 4.1 adds support for array sections/subscripts in 'reduction' clause. Patch adds codegen for this feature.

llvm-svn: 249672
2015-10-08 09:10:53 +00:00
Samuel Antao bed3c46632 [OpenMP] Target directive host codegen.
This patch implements the outlining for offloading functions for code 
annotated with the OpenMP target directive. It uses a temporary naming 
of the outlined functions that will have to be updated later on once 
target side codegen and registration of offloading libraries is 
implemented - the naming needs to be made unique in the produced 
library.

llvm-svn: 249148
2015-10-02 16:14:20 +00:00
Craig Topper 8674c5cf70 Remove 'const' from some ArrayRef arguments since they're passed by value anyway. NFC
llvm-svn: 248774
2015-09-29 04:30:07 +00:00
Alexey Bataev 5f600d6a49 [OPENMP 4.1] Codegen for ‘simd’ clause in ‘ordered’ directive.
Description.
If the simd clause is specified, the ordered regions encountered by any thread will use only a single SIMD lane to execute the ordered regions in the order of the loop iterations.
Restrictions.
An ordered construct with the simd clause is the only OpenMP construct that can appear in the simd region.

An ordered directive with ‘simd’ clause is generated as an outlined function and corresponding function call to prevent this part of code from vectorization later in backend.

llvm-svn: 248772
2015-09-29 03:48:57 +00:00
Alexey Bataev 87933c7ced [OPENMP 4.0] Add 'if' clause for 'cancel' directive.
Add parsing, sema analysis and codegen for 'if' clause in 'cancel' directive.

llvm-svn: 247976
2015-09-18 08:07:34 +00:00
Alexey Bataev 25e5b44654 [OPENMP] Emit __kmpc_cancel_barrier() and code for 'cancellation point' only if 'cancel' is found.
Patch improves codegen for OpenMP constructs. If the OpenMP region does not have internal 'cancel' construct, a call to 'void __kmpc_barrier()' runtime function is generated for all implicit/explicit barriers. If the region has inner 'cancel' directive, then
```
if (__kmpc_cancel_barrier())
  exit from outer construct;
```
code is generated.
Also, the code for 'canellation point' directive is not generated if parent directive does not have 'cancel' directive.

llvm-svn: 247681
2015-09-15 12:52:43 +00:00
Evgeniy Stepanov 6b2a61d3a5 Revert "Always_inline codegen rewrite" and 2 follow-ups.
Revert "Update cxx-irgen.cpp test to allow signext in alwaysinline functions."
Revert "[CodeGen] Remove wrapper-free always_inline functions from COMDATs"
Revert "Always_inline codegen rewrite."

Reason for revert: PR24793.

llvm-svn: 247620
2015-09-14 21:35:16 +00:00
Evgeniy Stepanov 93db40a147 Always_inline codegen rewrite.
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.

Libc++ relies on the compiler never emitting such undefined reference.

With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
  -- or, depending on the linkage --
2b. A declaration of F.

The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).

This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.

This patch is based on ideas by Chandler Carruth and Richard Smith.

llvm-svn: 247494
2015-09-12 01:07:37 +00:00
Evgeniy Stepanov 67037ee21e Revert "Specify target triple in alwaysinline tests."
Revert "Always_inline codegen rewrite."

Breaks gdb & lldb tests.
Breaks on Fedora 22 x86_64.

llvm-svn: 247491
2015-09-11 23:48:37 +00:00
Evgeniy Stepanov 072e83500e Always_inline codegen rewrite.
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.

Libc++ relies on the compiler never emitting such undefined reference.

With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
  -- or, depending on the linkage --
2b. A declaration of F.

The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).

This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.

This patch is based on ideas by Chandler Carruth and Richard Smith.

llvm-svn: 247465
2015-09-11 20:29:07 +00:00
Alexey Bataev c71a4099cf [OPENMP] Preserve alignment of the original variables for the captured references.
Patch makes codegen to preserve alignment of the shared variables captured and used in OpenMP regions.

llvm-svn: 247401
2015-09-11 10:29:41 +00:00
Hans Wennborg 7eb5464bc5 Re-commit r247218: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
This never broke the build; it was the LLVM side, r247216, that caused problems.

llvm-svn: 247302
2015-09-10 17:07:54 +00:00
Alexey Bataev 2377fe95c6 [OPENMP] Outlined function for parallel and other regions with list of captured variables.
Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. It may result in some poor performance because of too complex analysis later in optimization passes. Patch makes to emit outlined functions for parallel-based regions with a list of captured variables. It reduces code for 2*n GEPs, stores and loads at least.
Codegen for task-based regions remains unchanged because runtime requires that all captured variables are passed in captured record.

llvm-svn: 247251
2015-09-10 08:12:02 +00:00
Hans Wennborg e89c8c8033 Revert r247218: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
Seems it broke the Polly build.
From http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/11687/steps/compile/logs/stdio:

In file included from /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/lib/TableGen/Record.cpp:14:0:
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:369:3: error: looser throw specifier for 'virtual llvm::TypedInit::~TypedInit()'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:270:11: error:   overriding 'virtual llvm::Init::~Init() noexcept (true)'

llvm-svn: 247222
2015-09-10 00:37:18 +00:00
Hans Wennborg 60f3e1f466 Fix Clang-tidy misc-use-override warnings, other minor fixes
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D12741

llvm-svn: 247218
2015-09-10 00:24:40 +00:00
John McCall 7f416cc426 Compute and preserve alignment more faithfully in IR-generation.
Introduce an Address type to bundle a pointer value with an
alignment.  Introduce APIs on CGBuilderTy to work with Address
values.  Change core APIs on CGF/CGM to traffic in Address where
appropriate.  Require alignments to be non-zero.  Update a ton
of code to compute and propagate alignment information.

As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.

The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned.  Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay.  I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.

Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.

We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment.  In particular,
field access now uses alignmentAtOffset instead of min.

Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs.  For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint.  That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.

ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments.  In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments.  That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.

I partially punted on applying this work to CGBuiltin.  Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.

llvm-svn: 246985
2015-09-08 08:05:57 +00:00
Benjamin Kramer 9b81903607 [OpenMP] Make helper functoin static. NFC.
llvm-svn: 246657
2015-09-02 15:31:05 +00:00
Alexey Bataev d6fdc8b685 [OPENMP 4.0] Codegen for array sections.
Added codegen for array section in 'depend' clause of 'task' directive. It emits to pointers, one for the begin of array section and another for the end of array section. Size of the section is calculated as (end + 1 - start) * sizeof(basic_element_type).

llvm-svn: 246422
2015-08-31 07:32:19 +00:00
Daniel Jasper ad5b7962c9 Revert "[OPENMP 4.0] Codegen for array sections."
The test is currently failing on bots:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/12747/

llvm-svn: 246288
2015-08-28 08:42:22 +00:00
Alexey Bataev 117fb35cf7 [OPENMP 4.0] Codegen for array sections.
Added codegen for array section in 'depend' clause of 'task' directive. It emits to pointers, one for the begin of array section and another for the end of array section. Size of the section is calculated as (end + 1 - start) * sizeof(basic_element_type).

llvm-svn: 246278
2015-08-28 06:09:05 +00:00
David Blaikie 7e70d6803d Devirtualize EHScopeStack::Cleanup's dtor because it's never destroyed polymorphically
llvm-svn: 245378
2015-08-18 22:40:54 +00:00
Filipe Cabecinhas 7af183d841 Propagate SourceLocations through to get a Loc on float_cast_overflow
Summary:
float_cast_overflow is the only UBSan check without a source location attached.
This patch propagates SourceLocations where necessary to get them to the
EmitCheck() call.

Reviewers: rsmith, ABataev, rjmccall

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D11757

llvm-svn: 244568
2015-08-11 04:19:28 +00:00
Samuel Antao f8b5012dfb [OpenMP] Add TLS-based implementation for threadprivate directive.
llvm-svn: 242080
2015-07-13 22:54:53 +00:00
Alexey Bataev 7d5d33ea33 [OPENMP 4.0] Codegen for 'omp cancel' directive.
Add the next codegen for 'omp cancel' directive:
if (__kmpc_cancel()) {
  __kmpc_cancel_barrier();
  <exit construct>;
}

llvm-svn: 241429
2015-07-06 05:50:32 +00:00
Alexey Bataev 81c7ea0ec3 [OPENMP 4.0] Fixed codegen for 'cancellation point' construct.
Generate the next code for 'cancellation point':
if (__kmpc_cancellationpoint()) {
  __kmpc_cancel_barrier();
  <exit construct>;
}

llvm-svn: 241336
2015-07-03 09:56:58 +00:00
Alexey Bataev 0f34da12e4 [OPENMP 4.0] Codegen for 'cancellation point' directive.
The next code is generated for this construct:
```
if (__kmpc_cancellationpoint(ident_t *loc, kmp_int32 global_tid, kmp_int32 cncl_kind) != 0)
  <exit from outer innermost construct>;
```

llvm-svn: 241239
2015-07-02 04:17:07 +00:00
Alexey Bataev 1d2353d4f3 [OPENMP] Codegen for 'depend' clause (OpenMP 4.0).
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.

llvm-svn: 240532
2015-06-24 11:01:36 +00:00
Alexey Bataev d157d47062 Proper changing/restoring for CapturedStmtInfo, NFC.
Added special RAII class for proper values changing/restoring in CodeGenFunction::CapturedStmtInfo.

llvm-svn: 240517
2015-06-24 03:35:38 +00:00
Alexey Bataev 7f210c6dab [OPENMP] Codegen for 'proc_bind' clause (4.0).
Adds emission of the code for 'proc_bind(master|close|spread)' clause:
call void @__kmpc_push_proc_bind(<loc>, i32 thread_id, i32 4|3|2)

llvm-svn: 240018
2015-06-18 13:40:03 +00:00
Alexey Bataev c30dd2daf9 [OPENMP] Support for '#pragma omp taskgroup' directive.
Added parsing, sema analysis and codegen for '#pragma omp taskgroup' directive (OpenMP 4.0).
The code for directive is generated the following way:
#pragma omp taskgroup
<body>

void __kmpc_taskgroup(<loc>, thread_id);
<body>
void __kmpc_end_taskgroup(<loc>, thread_id);

llvm-svn: 240011
2015-06-18 12:14:09 +00:00
Alexey Bataev 89e7e8eb0e [OPENMP] Supported reduction clause in omp simd construct.
The following code is generated for reduction clause within 'omp simd' loop construct:
#pragma omp simd reduction(op:var)
for (...)
  <body>

alloca priv_var
priv_var = <initial reduction value>;
<loop_start>:
<body> // references to original 'var' are replaced by 'priv_var'
<loop_end>:
var op= priv_var;

llvm-svn: 239881
2015-06-17 06:21:39 +00:00
Alexey Bataev 3ae88e2124 [OPENMP] Prepare codegen for privates in tasks for non-capturing of privates in CapturedStmt.
Reworked codegen for privates in tasks:

call @kmpc_omp_task_alloc();
...
call @kmpc_omp_task(task_proxy);

void map_privates(.privates_rec. *privs, type1 ** priv1_ref, ..., typen **privn_ref) {
  *priv1_ref = &privs->private1;
  ...
  *privn_ref = &privs->privaten;
  ret void
}

i32 task_entry(i32 ThreadId, i32 PartId, void* privs, void (void*, ...) map_privates, shareds* captures) {
  type1 **priv1;
  ...
  typen **privn;
  call map_privates(privs, priv1, ..., privn);
  <Task body with priv1, .., privn instead of the captured variables>.
  ret i32
}

i32 task_proxy(i32 ThreadId, kmp_task_t_with_privates *tt) {
  call task_entry(ThreadId, tt->task_data.PartId, &tt->privates, map_privates, tt->task_data.shareds);
}

llvm-svn: 238010
2015-05-22 08:56:35 +00:00
Alexey Bataev 5129d3a4f5 [OPENMP] Fixed codegen for parameters privatization.
For parameters we shall take a derived type of parameters, not the original one.

llvm-svn: 237882
2015-05-21 09:47:46 +00:00
Alexey Bataev d7589ffe1d [OPENMP] Fix codegen for ordered loop directives.
loops with ordered clause must be generated the same way as dynamic loops, but with static scheduleing.

llvm-svn: 237788
2015-05-20 13:12:48 +00:00
Alexey Bataev 1d9c15cf18 [OPENMP] Fixed codegen for copying/initialization of array variables/parameters.
This modification generates proper copyin/initialization sequences for array variables/parameters. Before they were considered as pointers, not arrays.

llvm-svn: 237691
2015-05-19 12:31:28 +00:00
Alexey Bataev 8fc69dcf42 [OPENMP] Fix for '#pragma omp task' codegen.
Internal task structure must be generated like
typedef struct kmp_task {
  void *              shareds;
  kmp_routine_entry_t routine;
  kmp_int32           part_id;
  kmp_routine_entry_t destructors;
} kmp_task_t;

struct kmp_task_t_with_privates {
  kmp_task_t task_data;
  .kmp_private. privates;
};
to avoid possible additional alignment bytes in first fields (shareds, routine, part_id and destructors). Runtime library is not aware of such kind additional alignment bytes.

llvm-svn: 237561
2015-05-18 07:54:53 +00:00
Alexey Bataev 69a4779965 [OPENMP] Fixed codegen for 'reduction' clause.
Fixed codegen for reduction operations min, max, && and ||. Codegen for them is quite similar and I was confused by this similarity.
Also added a call to kmpc_end_reduce() in atomic part of reduction codegen (call to kmpc_end_reduce_nowait() is not required).
Differential Revision: http://reviews.llvm.org/D9513

llvm-svn: 236689
2015-05-07 03:54:03 +00:00
Alexey Bataev a744ff58f3 [OPENMP] Fixed incorrect work with cleanups, NFC.
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399

llvm-svn: 236491
2015-05-05 09:24:37 +00:00
Alexey Bataev ce348a49b4 Revert revision 236487: [OPENMP] Fixed incorrect work with cleanups, NFC.
llvm-svn: 236490
2015-05-05 08:48:39 +00:00
Alexey Bataev fc80e26fe6 [OPENMP] Fixed incorrect work with cleanups, NFC.
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399

llvm-svn: 236487
2015-05-05 08:38:22 +00:00
Alexey Bataev 1d526a613d Revert revision 236482: [OPENMP] Fixed incorrect work with cleanups, NFC.
Due to some incompatibilities with Windows.

llvm-svn: 236483
2015-05-05 06:32:45 +00:00
Alexey Bataev 70542831fc [OPENMP] Fixed incorrect work with cleanups, NFC.
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399

llvm-svn: 236482
2015-05-05 06:21:01 +00:00
Alexey Bataev f4497be0bb Revert revision 236480: [OPENMP] Fixed incorrect work with cleanups, NFC.
Due to some incompatibilities with Windows.

llvm-svn: 236481
2015-05-05 04:56:26 +00:00
Alexey Bataev 329731ea75 [OPENMP] Fixed incorrect work with cleanups, NFC.
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399

llvm-svn: 236480
2015-05-05 04:42:07 +00:00
Alexey Bataev 9e03404d8d [OPENMP] Codegen for 'firstprivate' clause in 'task' directive.
For tasks codegen for private/firstprivate variables are different rather than for other directives.

1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
  Ty1 var1;
  ...
  Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
  void *              shareds;
  kmp_routine_entry_t routine;
  kmp_int32           part_id;
  kmp_routine_entry_t destructors;
  .kmp_privates_t.    privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
  ~Destructor(&tt->privates.var1);
  ...
  ~Destructor(&tt->privates.varn);
  return 0;
}
4. Perform initialization of all firstprivate fields (by simple copying for POD data, copy constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);

CopyConstructor(new_task->privates.var1, *new_task->shareds.var1_ref);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
CopyConstructor(new_task->privates.varn, *new_task->shareds.varn_ref);
new_task->shareds.varn_ref = &new_task->privates.varn;

new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)
Differential Revision: http://reviews.llvm.org/D9370

llvm-svn: 236479
2015-05-05 04:05:12 +00:00
Benjamin Kramer 439ee9d7bc Make helper functions static. NFC.
llvm-svn: 236315
2015-05-01 13:59:53 +00:00
Alexey Bataev 36c1eb95e0 [OPENMP] Codegen for 'private' clause in 'task' directive.
For tasks codegen for private/firstprivate variables are different rather than for other directives.

1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
  Ty1 var1;
  ...
  Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
  void *              shareds;
  kmp_routine_entry_t routine;
  kmp_int32           part_id;
  kmp_routine_entry_t destructors;
  .kmp_privates_t.    privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
  ~Destructor(&tt->privates.var1);
  ...
  ~Destructor(&tt->privates.varn);
  return 0;
}
4. Perform default initialization of all private fields (no initialization for POD data, default constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);

DefaultConstructor(new_task->privates.var1);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
DefaultConstructor(new_task->privates.varn);
new_task->shareds.varn_ref = &new_task->privates.varn;

new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)


Differential Revision: http://reviews.llvm.org/D9322

llvm-svn: 236207
2015-04-30 06:51:57 +00:00
Alexey Bataev 66beaa9349 [OPENMP] Fixed codegen for 'copyprivate' clause.
Fixed initialization of 'single' region completion + changed type of the third argument of __kmpc_copyprivate() runtime function to size_t.

llvm-svn: 236198
2015-04-30 03:47:32 +00:00