Commit Graph

67 Commits

Author SHA1 Message Date
Justin Lebar 5fd18d17e5 [CUDA] Add test checking our ability to take a function pointer to a __global__ function on the host side.
Summary: This functionality is used by Thrust.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24581

llvm-svn: 281543
2016-09-14 21:50:11 +00:00
Artem Belevich bed18e9cc4 [CUDA] Do not merge CUDA target attributes.
CUDA target attributes are used for function overloading and must not be merged.

This fixes a bug where attributes were inherited during function template
specialization in CUDA and made it impossible for specialized function
to provide its own target attributes.

Differential Revision: https://reviews.llvm.org/D24522

llvm-svn: 281406
2016-09-13 22:16:30 +00:00
Justin Lebar 26bb31123a [CUDA] Fix "declared here" note on deferred wrong-side errors.
Previously we weren't deferring these "declared here" notes, which is
obviously wrong.

llvm-svn: 278767
2016-08-16 00:48:21 +00:00
Justin Lebar 18e2d82297 [CUDA] Raise an error if a wrong-side call is codegen'ed.
Summary:
Some function calls in CUDA are allowed to appear in
semantically-correct programs but are an error if they're ever
codegen'ed.  Specifically, a host+device function may call a host
function, but it's an error if such a function is ever codegen'ed in
device mode (and vice versa).

Previously, clang made no attempt to catch these errors.  For the most
part, they would be caught by ptxas, and reported as "call to unknown
function 'foo'".

Now we catch these errors and report them the same as we report other
illegal calls (e.g. a call from a host function to a device function).

This has a small change in error-message behavior for calls that were
previously disallowed (e.g. calls from a host to a device function).
Previously, we'd catch disallowed calls fairly early, before doing
additional semantic checking e.g. of the call's arguments.  Now we catch
these illegal calls at the very end of our semantic checks, so we'll
only emit a "illegal CUDA call" error if the call is otherwise
well-formed.

Reviewers: tra, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23242

llvm-svn: 278759
2016-08-15 23:00:49 +00:00
Justin Lebar c989c3e784 [CUDA] Reject calls to __device__ functions from host variable global initializers.
Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23335

llvm-svn: 278196
2016-08-10 01:09:21 +00:00
Justin Lebar 7d078bddbd [CUDA] Print a "previous-decl" note when calling an illegal member fn.
Summary:
When we emit err_ref_bad_target, we should emit a "'method' declared
here" note.  We already do so in most places, just not in
BuildCallToMemberFunction.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23240

llvm-svn: 278195
2016-08-10 01:09:18 +00:00
Justin Lebar 4c2c6fd1c4 [CUDA] Add additional testcases for EraseUnwantedCUDAMatches.
Summary:
Specifically, this patch adds testcases for all three calls to
EraseUnwantedCUDAMatches.  The addr-of-overloaded-fn test I accidentally
neutered in r264207, which moved much of
CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu.
The coverage from overloaded-delete test is new.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D21913

llvm-svn: 275232
2016-07-12 23:23:12 +00:00
Justin Lebar d35f706cc2 [CUDA] Don't assume that destructors can't be overloaded.
Summary:
You can overload a destructor in CUDA, and SemaOverload needs to be
tweaked not to crash when it sees an explicit call to an overloaded
destructor.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: http://reviews.llvm.org/D21912

llvm-svn: 275231
2016-07-12 23:23:01 +00:00
Justin Bogner 2d5de7e568 NVPTX: Use the nvvm builtins to read SRegs rather than the legacy ptx ones
The ptx spellings were removed from LLVM in r274769.

llvm-svn: 274770
2016-07-07 16:41:08 +00:00
Artem Belevich bcec9dac14 [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.
Fixes clang crash reported in PR27778.

Differential Revision: http://reviews.llvm.org/D20985

llvm-svn: 271951
2016-06-06 22:54:57 +00:00
Reid Kleckner a769fd50ba Avoid depending on test inputes that aren't in Inputs
Some people have weird CI systems that run each test subdirectory
independently without access to other parallel trees.

Unfortunately, this means we have to suffer some duplication until Art
can sort out how to share these types.

llvm-svn: 270164
2016-05-20 00:38:25 +00:00
Artem Belevich 3650bbeebc [CUDA] Do not allow non-empty destructors for global device-side variables.
According to Cuda Programming guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared__ variables defined in namespace
> scope, that are of class type, cannot have a non-empty constructor or a
> non-empty destructor.

Clang already deals with device-side constructors (see D15305).
This patch enforces similar rules for destructors.

Differential Revision: http://reviews.llvm.org/D20140

llvm-svn: 270108
2016-05-19 20:13:53 +00:00
Artem Belevich 85b6f63f42 [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.
Codegen tests for device-side variable initialization are subset of test
cases used to verify Sema's part of the job.
Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to
keep both sides in sync.

Differential Revision: http://reviews.llvm.org/D20139

llvm-svn: 270107
2016-05-19 20:13:39 +00:00
Richard Smith 12e7931d0b Add support for derived class special members hiding functions brought in from
a base class via a using-declaration. If a class has a using-declaration
declaring either a constructor or an assignment operator, eagerly declare its
special members in case they need to displace a shadow declaration from a
using-declaration.

llvm-svn: 269398
2016-05-13 06:47:56 +00:00
Justin Lebar ba122ab42f [CUDA] Make unattributed constexpr functions implicitly host+device.
With this patch, by a constexpr function is implicitly host+device
unless:

 a) it's a variadic function (variadic functions are not allowed on the
    device side), or
 b) it's preceeded by a __device__ overload in a system header.

The restriction on overloading __host__ __device__ functions on the
basis of their CUDA attributes remains in place, but we use (b) to allow
us to define __device__ overloads for constexpr functions in cmath,
which would otherwise be __host__ __device__ and thus not overloadable.

You can disable this behavior with -fno-cuda-host-device-constexpr.

Reviewers: tra, rnk, rsmith

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18380

llvm-svn: 264964
2016-03-30 23:30:21 +00:00
Justin Lebar 25c4a81e79 [CUDA] Remove three obsolete CUDA cc1 flags.
Summary:
* -fcuda-target-overloads

  Previously unconditionally set to true by the driver.  Necessary for
  correct functioning of the compiler -- our CUDA headers wrapper won't
  compile without this.

* -fcuda-disable-target-call-checks

  Previously unconditionally set to true by the driver.  Necessary to
  compile almost any external CUDA code -- almost all libraries assume
  that host+device code can call host or device functions.

* -fcuda-allow-host-calls-from-host-device

  No effect when target overloading is enabled.

Reviewers: tra

Subscribers: rsmith, cfe-commits

Differential Revision: http://reviews.llvm.org/D18416

llvm-svn: 264739
2016-03-29 16:24:16 +00:00
Justin Lebar e5eed04d52 [CUDA] Merge most of CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu.
Summary:
Previously we were using the codegen test to ensure that we choose the
right overload.  But we can do this within sema, with a bit of
cleverness.

I left the constructor/destructor checks in CodeGen, because these
overloads (particularly on the destructors) are hard to check in Sema.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18386

llvm-svn: 264207
2016-03-23 22:42:30 +00:00
Justin Lebar e82caa3055 [CUDA] Simplify SemaCUDA/function-overload.cu test.
Summary:
Principally, don't hardcode the line numbers of various notes.  This
lets us make changes to the test without recomputing linenos everywhere.

Instead, just tell -verify that we may get 0 or more notes pointing to
the relevant function definitions.  Checking that we get exactly the
right note isn't so important (and anyway is checked elsewhere).

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18385

llvm-svn: 264206
2016-03-23 22:42:28 +00:00
Justin Lebar d33adadb0e [CUDA] Don't allow templated variadic functions.
Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18373

llvm-svn: 264106
2016-03-22 22:06:19 +00:00
Artem Belevich 1ef9b59284 [CUDA] do not allow attribute-based overloading for __global__ functions.
__global__ functions are present on both host and device side,
so providing __host__ or __device__ overloads is not going to
do anything useful.

llvm-svn: 261778
2016-02-24 21:54:45 +00:00
Justin Lebar 048a4c1ca9 [CUDA] Don't specify exact line numbers in cuda-builtin-vars.cu.
This makes the test less fragile to changes to cuda_builtin_vars.h.

Test-only change.

llvm-svn: 261775
2016-02-24 21:49:30 +00:00
Artem Belevich 186091094a [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.
This is an artefact of split-mode CUDA compilation that we need to
mimic. HD functions are sometimes allowed to call H or D functions. Due
to split compilation mode device-side compilation will not see host-only
function and thus they will not be considered at all. For clang both H
and D variants will become function overloads visible to
compiler. Normally target attribute is considered only if C++ rules can
not determine which function is better. However in this case we need to
ignore functions that would not be present during current compilation
phase before we apply normal overload resolution rules.

Changes:
* introduced another level of call preference to better describe
  possible call combinations.
* removed WrongSide functions from consideration if the set contains
  SameSide function.
* disabled H->D, D->H and G->H calls. These combinations are
  not allowed by CUDA and we were reluctantly allowing them to work
  around device-side calls to math functions in std namespace.
  We no longer need it after r258880.

Differential Revision: http://reviews.llvm.org/D16870

llvm-svn: 260697
2016-02-12 18:29:18 +00:00
Justin Lebar 1eac5948db [CUDA] Add -fcuda-allow-variadic-functions.
Summary:
Turns out the variadic function checking added in r258643 was too strict
for some existing users; give them an escape valve.  When
-fcuda-allow-variadic-functions is passed, the front-end makes no
attempt to disallow C-style variadic functions.  Calls to va_arg are
still not allowed.

Reviewers: tra

Subscribers: cfe-commits, jhen, echristo, bkramer

Differential Revision: http://reviews.llvm.org/D16559

llvm-svn: 258822
2016-01-26 17:47:20 +00:00
Justin Lebar e48cd6c530 [CUDA] Disallow variadic functions other than printf in device code.
Reviewers: tra

Subscribers: cfe-commits, echristo, jhen

Differential Revision: http://reviews.llvm.org/D16484

llvm-svn: 258643
2016-01-23 21:28:17 +00:00
Justin Lebar a8f0254bc1 [CUDA] Reject the alias attribute in CUDA device code.
Summary: CUDA (well, strictly speaking, NVPTX) doesn't support aliases.

Reviewers: echristo

Subscribers: cfe-commits, jhen, tra

Differential Revision: http://reviews.llvm.org/D16502

llvm-svn: 258641
2016-01-23 21:28:10 +00:00
Justin Lebar 6644e366b0 [CUDA] Bail, rather than crash, on va_arg in device code.
Reviewers: tra

Subscribers: echristo, jhen, cfe-commits

Differential Revision: http://reviews.llvm.org/D16331

llvm-svn: 258264
2016-01-20 00:27:00 +00:00
Justin Lebar c66a10652a [CUDA] Only allow __global__ on free functions and static member functions.
Summary:
Warn for NVCC compatibility if you declare a static member function or
inline function as __global__.

Reviewers: tra

Subscribers: jhen, echristo, cfe-commits

Differential Revision: http://reviews.llvm.org/D16261

llvm-svn: 258263
2016-01-20 00:26:57 +00:00
Justin Lebar f8bdacbadc [CUDA] Warn undeclared identifiers in CUDA kernel calls
Value, type, and instantiation dependence were not being handled
correctly for CUDAKernelCallExpr AST nodes. As a result, if an
undeclared identifier was used in the triple-angle-bracket kernel call
configuration, there would be no error during parsing, and there would
be a crash during code gen. This patch makes sure that an error will be
issued during parsing in this case, just as there would be for any other
use of an undeclared identifier in C++.

Patch by Jason Henline.

Reviewers: jlebar, rsmith

Differential Revision: http://reviews.llvm.org/D15858

llvm-svn: 257839
2016-01-14 23:31:30 +00:00
Justin Lebar 3eaaf86397 [CUDA] Report an error if code tries to mix incompatible CUDA attributes.
Summary: Thanks to jhen for helping me figure this out.

Reviewers: tra, echristo

Subscribers: jhen

Differential Revision: http://reviews.llvm.org/D16129

llvm-svn: 257554
2016-01-13 01:07:35 +00:00
Akira Hatanaka 8c26ea663d Produce a better diagnostic for global register variables.
Currently, when there is a global register variable in a program that
is bound to an invalid register, clang/llvm prints an error message that
is not very user-friendly.

This commit improves the diagnostic and moves the check that used to be
in the backend to Sema. In addition, it makes changes to error out if
the size of the register doesn't match the declared variable size.

e.g., volatile register int B asm ("rbp");

rdar://problem/23084219

Differential Revision: http://reviews.llvm.org/D13834

llvm-svn: 253405
2015-11-18 00:15:28 +00:00
Artem Belevich 5e2a3ecd48 [CUDA] use -aux-triple to pass target triple of opposite side of compilation
Clang needs to know target triple for both sides of compilation so that
preprocessor macros and target builtins from both sides are available.

This change augments Compilation class to carry information about
toolchains used during different CUDA compilation passes and refactors
BuildActions to use it when it constructs CUDA jobs.

Removed DeviceTriple from CudaHostAction/CudaDeviceAction as it's no
longer needed.

Differential Revision: http://reviews.llvm.org/D13144

llvm-svn: 253385
2015-11-17 22:28:40 +00:00
Artem Belevich b5bc923af4 [CUDA] Allow parsing of host and device code simultaneously.
* adds -aux-triple option to specify target triple
 * propagates aux target info to AST context and Preprocessor
 * pulls in target specific preprocessor macros.
 * pulls in target-specific builtins from aux target.
 * sets appropriate host or device attribute on builtins.

Differential Revision: http://reviews.llvm.org/D12917

llvm-svn: 248299
2015-09-22 17:23:22 +00:00
Artem Belevich 9674a64cd9 [CUDA] Add appropriate host/device attribute to builtins.
The changes are part of attribute-based CUDA function overloading (D12453)
and as such are only enabled when it's in effect (-fcuda-target-overloads).

Differential Revision: http://reviews.llvm.org/D12122

llvm-svn: 248296
2015-09-22 17:23:05 +00:00
Artem Belevich 94a55e8169 [CUDA] Allow function overloads in CUDA based on host/device attributes.
The patch makes it possible to parse CUDA files that contain host/device
functions with identical signatures, but different attributes without
having to physically split source into host-only and device-only parts.

This change is needed in order to parse CUDA header files that have
a lot of name clashes with standard include files.

Gory details are in design doc here: https://goo.gl/EXnymm
Feel free to leave comments there or in this review thread.

This feature is controlled with CC1 option -fcuda-target-overloads
and is disabled by default.

Differential Revision: http://reviews.llvm.org/D12453

llvm-svn: 248295
2015-09-22 17:22:59 +00:00
Artem Belevich 5ef02c2db7 [CUDA] Check register names on appropriate side of cuda compilation only.
Differential Revision: http://reviews.llvm.org/D11950

llvm-svn: 246193
2015-08-27 19:54:21 +00:00
Artem Belevich 7230a22d5e Revert r245496 "[CUDA] Add appropriate host/device attribute to builtins."
It's breaking internal test.

llvm-svn: 245592
2015-08-20 18:28:56 +00:00
Artem Belevich 39259ffc65 [CUDA] Add appropriate host/device attribute to builtins.
Differential Revision: http://reviews.llvm.org/D12122

llvm-svn: 245496
2015-08-19 20:48:20 +00:00
Artem Belevich 194ba60fe2 [CUDA] Added stubs for new attributes used by CUDA headers.
The main purpose is to avoid errors and warnings while parsing CUDA
header files. The attributes are currently unused otherwise.

Differential version: http://reviews.llvm.org/D11690

llvm-svn: 244497
2015-08-10 20:33:56 +00:00
Artem Belevich a0473a5479 [cuda] Preserve TLS storage class of host variable even if it's a
device-side compilation.

llvm-svn: 236029
2015-04-28 20:31:49 +00:00
Artem Belevich fa62ad4087 [cuda] Ignore "TLS unsupported by target" errors for host variables during device compilation.
During device-side CUDA compilation clang currently complains about
all TLS variables, regardless of whether they are __host__ or
__device__.

This patch suppresses "TLS unsupported" errors for host variables
during device compilation and for device variables during host
compilation.

Differential Revision: http://reviews.llvm.org/D9269

llvm-svn: 235907
2015-04-27 19:37:53 +00:00
Artem Belevich 7093e40641 [cuda] Allow using integral non-type template parameters as launch_bounds attribute arguments.
- Changed CUDALaunchBounds arguments from integers to Expr* so they can
   be saved in AST for instantiation.
 - Added support for template instantiation of launch_bounds attrubute.
 - Moved evaluation of launch_bounds arguments to NVPTXTargetCodeGenInfo::
   SetTargetAttributes() where it can be done after template instantiation.
 - Added a warning on negative launch_bounds arguments.
 - Amended test cases.

Differential Revision: http://reviews.llvm.org/D8985

llvm-svn: 235452
2015-04-21 22:55:54 +00:00
Artem Belevich 4e192df778 [cuda] Added support for CUDA built-in variables.
Added cuda_builtin_vars.h which implements built-in CUDA variables
using __declattr(property).

Fields of built-in variables (except for warpSize) are implemented
using __declattr(property) which replaces read/write of a member field
with a call to a getter/setter member function, in this case with
appropriate NVPTX builtin.

Added a test case to check diagnostics on attempt to construct or
improperly access a built-in variable.

Differential Revision: http://reviews.llvm.org/D9064

llvm-svn: 235448
2015-04-21 22:14:13 +00:00
Artem Belevich a050112bba Revert r235398 "[cuda] Added support for CUDA built-in variables."
r235398 was causing buildbot break due to missing Makefile changes.

llvm-svn: 235401
2015-04-21 18:36:42 +00:00
Artem Belevich d0a2ae054f [cuda] Added support for CUDA built-in variables.
Added cuda_builtin_vars.h which implements built-in CUDA variables
using __declattr(property).

Fields of built-in variables (except for warpSize) are implemented
using __declattr(property) which replaces read/write of a member field
with a call to a getter/setter member function, in this case with
appropriate NVPTX builtin.

Added a test case to check diagnostics on attempt to construct or
improperly access a built-in variable.

Differential Revision: http://reviews.llvm.org/D9064

llvm-svn: 235398
2015-04-21 17:39:06 +00:00
Eli Bendersky 4bdc50eccb Create a frontend flag to disable CUDA cross-target call checks
For CUDA source, Sema checks that the targets of call expressions make sense
(e.g. a host function can't call a device function).

Adding a flag that lets us skip this check. Motivation: for source-to-source
translation tools that have to accept code that's not strictly kosher CUDA but
is still accepted by nvcc. The source-to-source translation tool can then fix
the code and leave calls that are semantically valid for the actual compilation
stage.

Differential Revision: http://reviews.llvm.org/D9036

llvm-svn: 235049
2015-04-15 22:27:06 +00:00
Artem Belevich 5196fe7c19 Ignore device-side asm constraint errors while compiling CUDA code for host and vice versa.
Differential Revision: http://reviews.llvm.org/D8392

llvm-svn: 232747
2015-03-19 18:40:25 +00:00
Jacques Pienaar a50178c23e CUDA: Add option to allow host device functions to call host functions
Commiting code from review http://reviews.llvm.org/D7841

llvm-svn: 230385
2015-02-24 21:45:33 +00:00
Jacques Pienaar 5bdd67778f Consider calls from implict host device functions as valid in SemaCUDA.
In SemaCUDA all implicit functions were considered host device, this led to
errors such as the following code snippet failing to compile:

struct Copyable {
  const Copyable& operator=(const Copyable& x) { return *this; }
};

struct Simple {
  Copyable b;
};

void foo() {
  Simple a, b;

  a = b;
}

Above the implicit copy assignment operator was inferred as host device but
there was only a host assignment copy defined which is an error in device
compilation mode.

Differential Revision: http://reviews.llvm.org/D6565

llvm-svn: 224358
2014-12-16 20:12:38 +00:00
Matt Arsenault b9e9dc5e89 Workaround attribute ordering issue with kernel only attributes
Placing the attribute after the kernel keyword would incorrectly
reject the attribute, so use the smae workaround that other
kernel only attributes use.

Also add a FIXME because there are two different phrasings now
for the same error, althoug amdgpu_num_[sv]gpr uses a consistent one.

llvm-svn: 223490
2014-12-05 18:03:58 +00:00
Matt Arsenault 43fae6c855 Add attributes for AMDGPU register limits.
This is a performance hint that can be applied to kernels
to attempt to limit the number of used registers.

llvm-svn: 223384
2014-12-04 20:38:18 +00:00