Commit Graph

61 Commits

Author SHA1 Message Date
Artem Belevich 186091094a [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.
This is an artefact of split-mode CUDA compilation that we need to
mimic. HD functions are sometimes allowed to call H or D functions. Due
to split compilation mode device-side compilation will not see host-only
function and thus they will not be considered at all. For clang both H
and D variants will become function overloads visible to
compiler. Normally target attribute is considered only if C++ rules can
not determine which function is better. However in this case we need to
ignore functions that would not be present during current compilation
phase before we apply normal overload resolution rules.

Changes:
* introduced another level of call preference to better describe
  possible call combinations.
* removed WrongSide functions from consideration if the set contains
  SameSide function.
* disabled H->D, D->H and G->H calls. These combinations are
  not allowed by CUDA and we were reluctantly allowing them to work
  around device-side calls to math functions in std namespace.
  We no longer need it after r258880.

Differential Revision: http://reviews.llvm.org/D16870

llvm-svn: 260697
2016-02-12 18:29:18 +00:00
Artem Belevich 97c01c35f8 [CUDA] Do not allow dynamic initialization of global device side variables.
In general CUDA does not allow dynamic initialization of
global device-side variables. One exception is that CUDA allows
records with empty constructors as described in section E2.2.1 of
CUDA 7.5 Programming guide.

This patch applies initializer checks for all device-side variables.
Empty constructors are accepted, but no code is generated for them.

Differential Revision: http://reviews.llvm.org/D15305

llvm-svn: 259592
2016-02-02 22:29:48 +00:00
Justin Lebar c66a10652a [CUDA] Only allow __global__ on free functions and static member functions.
Summary:
Warn for NVCC compatibility if you declare a static member function or
inline function as __global__.

Reviewers: tra

Subscribers: jhen, echristo, cfe-commits

Differential Revision: http://reviews.llvm.org/D16261

llvm-svn: 258263
2016-01-20 00:26:57 +00:00
Artem Belevich 94a55e8169 [CUDA] Allow function overloads in CUDA based on host/device attributes.
The patch makes it possible to parse CUDA files that contain host/device
functions with identical signatures, but different attributes without
having to physically split source into host-only and device-only parts.

This change is needed in order to parse CUDA header files that have
a lot of name clashes with standard include files.

Gory details are in design doc here: https://goo.gl/EXnymm
Feel free to leave comments there or in this review thread.

This feature is controlled with CC1 option -fcuda-target-overloads
and is disabled by default.

Differential Revision: http://reviews.llvm.org/D12453

llvm-svn: 248295
2015-09-22 17:22:59 +00:00
Eli Bendersky 4bdc50eccb Create a frontend flag to disable CUDA cross-target call checks
For CUDA source, Sema checks that the targets of call expressions make sense
(e.g. a host function can't call a device function).

Adding a flag that lets us skip this check. Motivation: for source-to-source
translation tools that have to accept code that's not strictly kosher CUDA but
is still accepted by nvcc. The source-to-source translation tool can then fix
the code and leave calls that are semantically valid for the actual compilation
stage.

Differential Revision: http://reviews.llvm.org/D9036

llvm-svn: 235049
2015-04-15 22:27:06 +00:00
Jacques Pienaar a50178c23e CUDA: Add option to allow host device functions to call host functions
Commiting code from review http://reviews.llvm.org/D7841

llvm-svn: 230385
2015-02-24 21:45:33 +00:00
Jacques Pienaar 5bdd67778f Consider calls from implict host device functions as valid in SemaCUDA.
In SemaCUDA all implicit functions were considered host device, this led to
errors such as the following code snippet failing to compile:

struct Copyable {
  const Copyable& operator=(const Copyable& x) { return *this; }
};

struct Simple {
  Copyable b;
};

void foo() {
  Simple a, b;

  a = b;
}

Above the implicit copy assignment operator was inferred as host device but
there was only a host assignment copy defined which is an error in device
compilation mode.

Differential Revision: http://reviews.llvm.org/D6565

llvm-svn: 224358
2014-12-16 20:12:38 +00:00
Reid Kleckner bbc0178518 CUDA host device code with two code paths
Summary:
Allow CUDA host device functions with two code paths using __CUDA_ARCH__
to differentiate between code path being compiled.

For example:
  __host__ __device__ void host_device_function(void) {
  #ifdef __CUDA_ARCH__
    device_only_function();
  #else
    host_only_function();
  #endif
  }

Patch by Jacques Pienaar.

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D6457

llvm-svn: 223271
2014-12-03 21:53:36 +00:00
Eli Bendersky f2787a0dc0 CUDA: mark the target of implicit intrinsics properly
r218624 implemented target inference for implicit special members. However,
other entities can be implicit - for example intrinsics. These can not have
inference running on them, so they should be marked host device as before. This
is the safest and most flexible setting, since by construction these functions
don't invoke anything, and we'd like them to be invokable from both host and
device code. LLVM's intrinsics definitions (where these intrinsics come from in
the case of CUDA/NVPTX) have no notion of target, so both host and device
intrinsics can be supported this way.

llvm-svn: 218688
2014-09-30 17:38:34 +00:00
Eli Bendersky 9a220fca4a CUDA: Fix incorrect target inference for implicit members.
As PR20495 demonstrates, Clang currenlty infers the CUDA target (host/device,
etc) for implicit members (constructors, etc.) incorrectly. This causes errors
and even assertions in Clang when compiling code (assertions in C++11 mode where
implicit move constructors are added into the mix).

Fix the problem by inferring the target from the methods the implicit member
should call (depending on its base classes and fields).

llvm-svn: 218624
2014-09-29 20:38:29 +00:00
Eli Bendersky 7325e56c10 Split off CUDA-specific Sema parts to a new file
In line with SemaOpenMP.cpp, etc. CUDA-specific semantic analysis code goes into
a separate file. This is in anticipation of adding extra functionality here in
the near future.

No change in functionality.

Review: http://reviews.llvm.org/D5160
llvm-svn: 217043
2014-09-03 15:27:03 +00:00