llvm-project

Author	SHA1	Message	Date
Yaxun Liu	aa24601f98	[CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes There are HIP applications e.g. Tensorflow 1.3 using amdgpu kernel attributes, however currently they are only allowed on OpenCL kernel functions. This patch will allow amdgpu kernel attributes to be applied to CUDA/HIP __global__ functions. Differential Revision: https://reviews.llvm.org/D47958 llvm-svn: 334561	2018-06-12 23:58:59 +00:00
Artem Belevich	e9fa53a09b	[CUDA] Check initializers of instantiated template variables. We were already performing checks on non-template variables, but the checks on templated ones were missing. Differential Revision: https://reviews.llvm.org/D45231 llvm-svn: 334143	2018-06-06 22:37:25 +00:00
Justin Lebar	5489f85fda	[CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces. Summary: Previously this triggered a -Wundefined-internal warning. But it's not an undefined variable -- any variable of this form is a pointer to the base of GPU core's shared memory. Reviewers: tra Subscribers: sanjoy, rsmith Differential Revision: https://reviews.llvm.org/D46782 llvm-svn: 332621	2018-05-17 16:15:07 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Artem Belevich	67d22c8a84	Revert "[CUDA] Check initializers of instantiated template variables." This (temporarily) reverts commit r329127 due to the problems it exposed in TensorFlow. llvm-svn: 329229	2018-04-04 20:48:42 +00:00
Artem Belevich	d9189d1e76	[CUDA] Check initializers of instantiated template variables. We were already performing checks on non-template variables, but the checks on templated ones were missing. Differential Revision: https://reviews.llvm.org/D45231 llvm-svn: 329127	2018-04-03 22:41:06 +00:00
Richard Trieu	b402580616	Fix some handling of AST nodes with diagnostics. The diagnostic system for Clang can already handle many AST nodes. Instead of converting them to strings first, just hand the AST node directly to the diagnostic system and let it handle the output. Minor changes in some diagnostic output. llvm-svn: 328688	2018-03-28 04:16:13 +00:00
Artem Belevich	e2ae8b5510	[CUDA] Fixed false error reporting in case of calling H->G->HD->D. Launching a kernel from the host code does not generate code for the kernel itself. This fixes an issue with clang erroneously reporting an error for a HD->D call from within the kernel. Differential Revision: https://reviews.llvm.org/D44837 llvm-svn: 328362	2018-03-23 19:49:03 +00:00
Jonas Hahnfeld	ee47d8cb96	[CUDA] Allow external variables in separate compilation According to the CUDA Programming Guide this is prohibited in whole program compilation mode. This makes sense because external references cannot be satisfied in that mode anyway. However, such variables are allowed in separate compilation mode which is a valid use case. Differential Revision: https://reviews.llvm.org/D42923 llvm-svn: 325136	2018-02-14 16:04:03 +00:00
Artem Belevich	50e6e54587	[CUDA] Report "unsupported VLA" errors only on device side. This fixes erroneously reported CUDA compilation errors in host-side code during device-side compilation. I've also restricted OpenMP-specific checks to trigger only if we're compiling with OpenMP enabled. Differential Revision: https://reviews.llvm.org/D40275 llvm-svn: 319201	2017-11-28 18:51:42 +00:00
Aaron Ballman	adf66b6174	Determine the attribute subject for diagnostics based on declarative information in DeclNodes.td. This greatly reduces the number of enumerated values used for more complex diagnostics; these are now only required when the "attribute only applies to" diagnostic needs to be generated manually as part of semantic processing. This also clarifies some terminology used by the diagnostic (methods -> Objective-C methods, fields -> non-static data members, etc). Many of the tests needed to be updated in multiple places for the diagnostic wording tweaks. The first instance of the diagnostic for that attribute is fully specified and subsequent instances cut off the complete list (to make it easier if additional subjects are added in the future for the attribute). llvm-svn: 319002	2017-11-26 20:01:12 +00:00
Justin Lebar	78137ec868	[CUDA] When compilation fails, print the compilation mode. Summary: That is, instead of "1 error generated", we now say "1 error generated when compiling for sm_35". This (partially) solves a usability footgun wherein e.g. users call a function that's only defined on sm_60 when compiling for sm_35, and they get an unhelpful error message. Reviewers: tra Subscribers: sanjoy, cfe-commits Differential Revision: https://reviews.llvm.org/D37548 llvm-svn: 312736	2017-09-07 18:37:16 +00:00
Richard Smith	6c716116df	PR34163: Don't cache an incorrect key function for a class if queried between the class becoming complete and its inline methods being parsed. This replaces the hack of using the "late parsed template" flag to track member functions with bodies we've not parsed yet; instead we now use the "will have body" flag, which carries the desired implication that the function declaration is a definition, and that we've just not parsed its body yet. llvm-svn: 310776	2017-08-12 01:46:03 +00:00
Justin Lebar	86c4e63ff9	[CUDA] Let NVPTX inherit the host's calling conventions. Summary: When compiling device code, we may still see host code with explicit calling conventions. NVPTX needs to claim that it supports these CCs, so that (a) we don't raise noisy warnings, and (b) we don't break existing code which relies on the existence of these CCs when specializing templates. (If a CC doesn't exist, clang ignores it, so two template specializations which are different only insofar as one specifies a CC are considered identical and therefore are an error if that CC is not supported.) Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28323 llvm-svn: 291136	2017-01-05 16:53:38 +00:00
Justin Lebar	0203f2c26e	[CUDA] Add __declspec spellings for CUDA attributes. Summary: CUDA attributes are spelled __declspec(__foo__) on Windows. Reviewers: tra Subscribers: cfe-commits, rnk Differential Revision: https://reviews.llvm.org/D28321 llvm-svn: 291134	2017-01-05 16:53:04 +00:00
Artem Belevich	64135c35f7	[CUDA] Ignore implicit target attributes during function template instantiation. Some functions and templates are treated as __host__ __device__ even when they don't have explicitly specified target attributes. What's worse, this treatment may change depending on command line options (-fno-cuda-host-device-constexpr) or #pragma clang force_cuda_host_device. Combined with strict checking for matching function target that comes with D25809(r288962), it makes it hard to write code which would explicitly instantiate or specialize some functions regardless of pragmas or command line options in effect. This patch changes the way we match target attributes of base template vs attributes used in explicit instantiation or specialization so that only explicitly specified attributes are considered. This makes base template selection behave consistently regardless of pragmas or command line options that may affect CUDA target. Differential Revision: https://reviews.llvm.org/D25845 llvm-svn: 289091	2016-12-08 19:38:13 +00:00
Artem Belevich	13e9b4d768	[CUDA] Improve target attribute checking for function templates. * __host__ __device__ functions are no longer considered to be redeclarations of __host__ or __device__ functions. This prevents unintentional merging of target attributes across them. * Function target attributes are now considered (and must match) during explicit instantiation and specialization of function templates. Differential Revision: https://reviews.llvm.org/D25809 llvm-svn: 288962	2016-12-07 19:27:16 +00:00
Justin Lebar	2d56c26504	[CUDA] Use only the GVALinkage on function definitions. Summary: Previously we'd look at the GVALinkage of whatever FunctionDecl you happened to be calling. This is not right. In the absence of the gnu_inline attribute, to be handled separately, the function definition determines the function's linkage. So we need to wait until we get a def before we can know whether something is known-emitted. Reviewers: tra Subscribers: cfe-commits, rsmith Differential Revision: https://reviews.llvm.org/D26268 llvm-svn: 286313	2016-11-08 23:45:51 +00:00
Justin Lebar	2b42ccc78b	[CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on functions without bodies. Summary: In CUDA compilation, we call isInlineDefinitionExternallyVisible (via getGVALinkageForFunction) on functions while parsing their definitions. At the point in time when we call getGVALinkageForFunction, we haven't yet added the body to the function, so we trip this assert. But as far as I can tell, this is harmless. To work around this, we add a new flag to FunctionDecl, "WillHaveBody". There was other code that was working around the existing assert with a really awful hack -- this change lets us get rid of that hack. Reviewers: rsmith, tra Subscribers: aemerson, cfe-commits Differential Revision: https://reviews.llvm.org/D25640 llvm-svn: 285410	2016-10-28 16:26:26 +00:00
Justin Lebar	4d38a5cf74	[CUDA] Simplify some repeated diagnostic expectations in CUDA tests. Instead of repeating the diagnostic, use "expected-note N". Test-only change. llvm-svn: 284882	2016-10-21 20:50:47 +00:00
Artem Belevich	07db5cf6c8	Declare H and H new/delete. llvm-svn: 284879	2016-10-21 20:34:05 +00:00
Justin Lebar	6c86e9160d	[CUDA] When we emit an error that might have been deferred, also print a callstack. Summary: Previously, when you did something not allowed in a host+device function and then caused it to be codegen'ed, we would print out an error telling you that you did something bad, but we wouldn't tell you how we decided that the function needed to be codegen'ed. This change causes us to print out a callstack when emitting deferred errors. This is immensely helpful when debugging highly-templated code, where it's often unclear how a function became known-emitted. We only print the callstack once per function, after we print all the deferred errors. This patch also switches all of our hashtables to using canonical FunctionDecls instead of regular FunctionDecls. This prevents a number of bugs, some of which are caught by tests added here, in which we assume that two FDs for the same function have the same pointer value. Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D25704 llvm-svn: 284647	2016-10-19 21:15:01 +00:00
Justin Lebar	9730ae943f	[CUDA] Emit errors for wrong-side calls made on the same line as non-wrong-side calls. Summary: This fixes two related bugs: 1) Previously, if you had a non-wrong side call at some source code location L, we wouldn't emit errors for wrong-side calls that appeared at L. 2) We'd only emit one wrong-side error per source code location, when we actually want to emit it twice if we hit this line more than once due to e.g. template instantiation. Reviewers: tra Subscribers: rnk, cfe-commits Differential Revision: https://reviews.llvm.org/D25702 llvm-svn: 284643	2016-10-19 21:03:38 +00:00
Justin Lebar	d3fd70dedd	[CUDA] Rework tests now that we emit deferred diagnostics during sema. Test-only change. Summary: Previously we had to split out a lot of our tests into a test that checked only immediate errors and a test that checked only deferred errors. This was because, if you emitted any immediate errors, we wouldn't run codegen, where the deferred errors were emitted. We've fixed this, and now emit deferred errors during sema. This lets us merge a bunch of tests, and lets us convert some other tests to -fsyntax-only. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25755 llvm-svn: 284553	2016-10-19 00:06:49 +00:00
Justin Lebar	d692dfb65e	[CUDA] Fix false-positive in known-emitted handling. Previously: When compiling for host, our constructed call graph went through kernel calls. This meant that if we had host calls kernel calls HD we would incorrectly mark the HD function as known-emitted on the host side, and thus perform host-side checks on it. Fixing this exposed another issue, wherein when marking a function as known-emitted, we also need to traverse the callgraph of its template, because non-dependent calls are attached to a function's template, not its instantiation. llvm-svn: 284355	2016-10-17 02:25:55 +00:00
Justin Lebar	23d954241b	[CUDA] Emit deferred diagnostics during Sema rather than during codegen. Summary: Emitting deferred diagnostics during codegen was a hack. It did work, but usability was poor, both for us as compiler devs and for users. We don't codegen if there are any sema errors, so for users this meant that they wouldn't see deferred errors if there were any non-deferred errors. For devs, this meant that we had to carefully split up our tests so that when we tested deferred errors, we didn't emit any non-deferred errors. This change moves checking for deferred errors into Sema. See the big comment in SemaCUDA.cpp for an overview of the idea. This checking adds overhead to compilation, because we have to maintain a partial call graph. As a result, this change makes deferred errors a CUDA-only concept (whereas before they were a general concept). If anyone else wants to use this framework for something other than CUDA, we can generalize at that time. This patch makes the minimal set of test changes -- after this lands, I'll go back through and do a cleanup of the tests that we no longer have to split up. Reviewers: rnk Subscribers: cfe-commits, rsmith, tra Differential Revision: https://reviews.llvm.org/D25541 llvm-svn: 284158	2016-10-13 20:52:12 +00:00
Justin Lebar	44f547aa3f	[CUDA] Allow static variables in __host__ __device__ functions, so long as they're never codegen'ed for device. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25150 llvm-svn: 284145	2016-10-13 18:45:17 +00:00
Justin Lebar	aa370bd0d7	[CUDA] Disallow __shared__ variables in host functions. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25143 llvm-svn: 284144	2016-10-13 18:45:13 +00:00
Justin Lebar	179bdce72a	[CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIf{Device,Host}Code(). Summary: Together these let you easily create diagnostics that - are never emitted for host code - are always emitted for __device__ and __global__ functions, and - are emitted for __host__ __device__ functions iff these functions are codegen'ed. At the moment there are only three diagnostics that need this treatment, but I have more to add, and it's not sustainable to write code for emitting every such diagnostic twice, and from a special wrapper in SemaCUDA.cpp. While we're at it, don't emit the function name in err_cuda_device_exceptions: It's not necessary to print it, and making this work in the new framework in the face of a null value for dyn_cast<FunctionDecl>(CurContext) isn't worth the effort. Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D25139 llvm-svn: 284143	2016-10-13 18:45:08 +00:00
Artem Belevich	89997ecd8f	Added REQUIRED triples to the test that fails on some ARM buildbots. llvm-svn: 283964	2016-10-12 02:08:08 +00:00
Justin Lebar	0254c46300	[CUDA] Make touching a kernel from a __host__ __device__ function a deferred error. Previously, this was an immediate, don't pass go, don't collect $200 error. But this precludes us from writing code like __host__ __device__ void launch_kernel() { kernel<<<...>>>(); } Such code isn't wrong, following our notions of right and wrong in CUDA, unless it's codegen'ed. llvm-svn: 283963	2016-10-12 01:30:08 +00:00
Richard Smith	f75dcbef20	Aligned allocation versus CUDA: make deallocation function preference order match other CUDA preference orders, per discussion with jlebar. We now model this in an attempt to match overload resolution as closely as possible: - First, we throw out all non-callable (due to CUDA host/device mismatch) operator delete functions. - Then we apply sizedness / alignedness preferences based on whether the type is overaligned and whether the deallocation function is a member. - Finally, we use the CUDA callability preference as a tiebreaker. llvm-svn: 283830	2016-10-11 00:21:10 +00:00
Justin Lebar	2dfbe9a3b4	[CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h. Summary: This matches the idiom we use for our other CUDA wrapper headers. Reviewers: tra Subscribers: beanz, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D24978 llvm-svn: 283679	2016-10-08 22:16:08 +00:00
Justin Lebar	9fdb46e71c	[CUDA] Do a better job at detecting wrong-side calls. Summary: Move CheckCUDACall from ActOnCallExpr and BuildDeclRefExpr to DiagnoseUseOfDecl. This lets us catch some edge cases we were missing, specifically around class operators. This necessitates a few other changes: - Avoid emitting duplicate deferred diags in CheckCUDACall. Previously we'd carefully placed our call to CheckCUDACall such that it would only ever run once for a particular callsite. But now this isn't the case. - Emit deferred diagnostics from a template specialization/instantiation's primary template, in addition to from the specialization/instantiation itself. DiagnoseUseOfDecl ends up putting the deferred diagnostics on the template, rather than the specialization, so we need to check both. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D24573 llvm-svn: 283637	2016-10-08 01:07:11 +00:00
Justin Lebar	e060feb7b1	[CUDA] Disallow overloading destructors. Summary: We'd attempted to allow this, but turns out we were doing a very bad job. :) Making this work properly would be a giant change in clang. For example, we'd need to make CXXRecordDecl::getDestructor() context-sensitive, because the destructor you end up with depends on where you're calling it from. For now (and hopefully for ever), just disallow overloading of destructors in CUDA. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D24571 llvm-svn: 283120	2016-10-03 16:48:23 +00:00
Justin Lebar	281ce2af17	[CUDA] Allow extern __shared__ on empty-length arrays. "extern __shared__ int x[]" is OK. llvm-svn: 283068	2016-10-02 15:24:50 +00:00
Justin Lebar	e71b2fa4c9	[CUDA] Disallow __constant__ local variables. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25129 llvm-svn: 282986	2016-09-30 23:57:34 +00:00
Justin Lebar	1041101953	[CUDA] Disallow 'extern __shared__' variables. Also add a test that we disallow __constant__ __shared__ int x; because it's possible to break this without breaking __shared__ __constant__ int x; Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D25125 llvm-svn: 282985	2016-09-30 23:57:30 +00:00
Justin Lebar	20614e0d9c	[CUDA] Fix implicit-device-lambda.cu after r282911. This commit added a warning that we're (correctly) hitting in this test. Just ignore it. llvm-svn: 282927	2016-09-30 20:17:37 +00:00
Justin Lebar	7ca116cacf	[CUDA] Make lambdas inherit __host__ and __device__ attributes from the scope in which they're created. Summary: NVCC compat. Fixes bug 30567. Reviewers: tra Subscribers: cfe-commits, rnk Differential Revision: https://reviews.llvm.org/D25105 llvm-svn: 282880	2016-09-30 17:14:53 +00:00
Justin Lebar	b17840de33	[CUDA] Disallow variable-length arrays in CUDA device code. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: https://reviews.llvm.org/D25050 llvm-svn: 282647	2016-09-28 22:45:58 +00:00
Justin Lebar	2a8db34044	[CUDA] Disallow exceptions in device code. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: https://reviews.llvm.org/D25036 llvm-svn: 282646	2016-09-28 22:45:54 +00:00
Konstantin Zhuravlyov	5b48d725a0	[AMDGPU] Expose flat work group size, register and wave control attributes __attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size __attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit Differential Revision: https://reviews.llvm.org/D24513 llvm-svn: 282371	2016-09-26 01:02:57 +00:00
Justin Lebar	5fd18d17e5	[CUDA] Add test checking our ability to take a function pointer to a __global__ function on the host side. Summary: This functionality is used by Thrust. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24581 llvm-svn: 281543	2016-09-14 21:50:11 +00:00
Artem Belevich	bed18e9cc4	[CUDA] Do not merge CUDA target attributes. CUDA target attributes are used for function overloading and must not be merged. This fixes a bug where attributes were inherited during function template specialization in CUDA and made it impossible for specialized function to provide its own target attributes. Differential Revision: https://reviews.llvm.org/D24522 llvm-svn: 281406	2016-09-13 22:16:30 +00:00
Justin Lebar	26bb31123a	[CUDA] Fix "declared here" note on deferred wrong-side errors. Previously we weren't deferring these "declared here" notes, which is obviously wrong. llvm-svn: 278767	2016-08-16 00:48:21 +00:00
Justin Lebar	18e2d82297	[CUDA] Raise an error if a wrong-side call is codegen'ed. Summary: Some function calls in CUDA are allowed to appear in semantically-correct programs but are an error if they're ever codegen'ed. Specifically, a host+device function may call a host function, but it's an error if such a function is ever codegen'ed in device mode (and vice versa). Previously, clang made no attempt to catch these errors. For the most part, they would be caught by ptxas, and reported as "call to unknown function 'foo'". Now we catch these errors and report them the same as we report other illegal calls (e.g. a call from a host function to a device function). This has a small change in error-message behavior for calls that were previously disallowed (e.g. calls from a host to a device function). Previously, we'd catch disallowed calls fairly early, before doing additional semantic checking e.g. of the call's arguments. Now we catch these illegal calls at the very end of our semantic checks, so we'll only emit an "illegal CUDA call" error if the call is otherwise well-formed. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23242 llvm-svn: 278759	2016-08-15 23:00:49 +00:00
Justin Lebar	c989c3e784	[CUDA] Reject calls to __device__ functions from host variable global initializers. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23335 llvm-svn: 278196	2016-08-10 01:09:21 +00:00
Justin Lebar	7d078bddbd	[CUDA] Print a "previous-decl" note when calling an illegal member fn. Summary: When we emit err_ref_bad_target, we should emit a "'method' declared here" note. We already do so in most places, just not in BuildCallToMemberFunction. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23240 llvm-svn: 278195	2016-08-10 01:09:18 +00:00
Justin Lebar	4c2c6fd1c4	[CUDA] Add additional testcases for EraseUnwantedCUDAMatches. Summary: Specifically, this patch adds testcases for all three calls to EraseUnwantedCUDAMatches. The addr-of-overloaded-fn test I accidentally neutered in r264207, which moved much of CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu. The coverage from overloaded-delete test is new. Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D21913 llvm-svn: 275232	2016-07-12 23:23:12 +00:00
Justin Lebar	d35f706cc2	[CUDA] Don't assume that destructors can't be overloaded. Summary: You can overload a destructor in CUDA, and SemaOverload needs to be tweaked not to crash when it sees an explicit call to an overloaded destructor. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: http://reviews.llvm.org/D21912 llvm-svn: 275231	2016-07-12 23:23:01 +00:00
Justin Bogner	2d5de7e568	NVPTX: Use the nvvm builtins to read SRegs rather than the legacy ptx ones The ptx spellings were removed from LLVM in r274769. llvm-svn: 274770	2016-07-07 16:41:08 +00:00
Artem Belevich	bcec9dac14	[CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue. Fixes clang crash reported in PR27778. Differential Revision: http://reviews.llvm.org/D20985 llvm-svn: 271951	2016-06-06 22:54:57 +00:00
Reid Kleckner	a769fd50ba	Avoid depending on test inputs that aren't in Inputs Some people have weird CI systems that run each test subdirectory independently without access to other parallel trees. Unfortunately, this means we have to suffer some duplication until Art can sort out how to share these types. llvm-svn: 270164	2016-05-20 00:38:25 +00:00
Artem Belevich	3650bbeebc	[CUDA] Do not allow non-empty destructors for global device-side variables. According to Cuda Programming guide (v7.5, E2.3.1): > __device__, __constant__ and __shared__ variables defined in namespace > scope, that are of class type, cannot have a non-empty constructor or a > non-empty destructor. Clang already deals with device-side constructors (see D15305). This patch enforces similar rules for destructors. Differential Revision: http://reviews.llvm.org/D20140 llvm-svn: 270108	2016-05-19 20:13:53 +00:00
Artem Belevich	85b6f63f42	[CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts. Codegen tests for device-side variable initialization are a subset of the test cases used to verify Sema's part of the job. Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to keep both sides in sync. Differential Revision: http://reviews.llvm.org/D20139 llvm-svn: 270107	2016-05-19 20:13:39 +00:00
Richard Smith	12e7931d0b	Add support for derived class special members hiding functions brought in from a base class via a using-declaration. If a class has a using-declaration declaring either a constructor or an assignment operator, eagerly declare its special members in case they need to displace a shadow declaration from a using-declaration. llvm-svn: 269398	2016-05-13 06:47:56 +00:00
Justin Lebar	ba122ab42f	[CUDA] Make unattributed constexpr functions implicitly host+device. With this patch, a constexpr function is implicitly host+device unless: a) it's a variadic function (variadic functions are not allowed on the device side), or b) it's preceded by a __device__ overload in a system header. The restriction on overloading __host__ __device__ functions on the basis of their CUDA attributes remains in place, but we use (b) to allow us to define __device__ overloads for constexpr functions in cmath, which would otherwise be __host__ __device__ and thus not overloadable. You can disable this behavior with -fno-cuda-host-device-constexpr. Reviewers: tra, rnk, rsmith Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18380 llvm-svn: 264964	2016-03-30 23:30:21 +00:00
Justin Lebar	25c4a81e79	[CUDA] Remove three obsolete CUDA cc1 flags. Summary: * -fcuda-target-overloads Previously unconditionally set to true by the driver. Necessary for correct functioning of the compiler -- our CUDA headers wrapper won't compile without this. * -fcuda-disable-target-call-checks Previously unconditionally set to true by the driver. Necessary to compile almost any external CUDA code -- almost all libraries assume that host+device code can call host or device functions. * -fcuda-allow-host-calls-from-host-device No effect when target overloading is enabled. Reviewers: tra Subscribers: rsmith, cfe-commits Differential Revision: http://reviews.llvm.org/D18416 llvm-svn: 264739	2016-03-29 16:24:16 +00:00
Justin Lebar	e5eed04d52	[CUDA] Merge most of CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu. Summary: Previously we were using the codegen test to ensure that we choose the right overload. But we can do this within sema, with a bit of cleverness. I left the constructor/destructor checks in CodeGen, because these overloads (particularly on the destructors) are hard to check in Sema. Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18386 llvm-svn: 264207	2016-03-23 22:42:30 +00:00
Justin Lebar	e82caa3055	[CUDA] Simplify SemaCUDA/function-overload.cu test. Summary: Principally, don't hardcode the line numbers of various notes. This lets us make changes to the test without recomputing linenos everywhere. Instead, just tell -verify that we may get 0 or more notes pointing to the relevant function definitions. Checking that we get exactly the right note isn't so important (and anyway is checked elsewhere). Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18385 llvm-svn: 264206	2016-03-23 22:42:28 +00:00
Justin Lebar	d33adadb0e	[CUDA] Don't allow templated variadic functions. Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18373 llvm-svn: 264106	2016-03-22 22:06:19 +00:00
Artem Belevich	1ef9b59284	[CUDA] do not allow attribute-based overloading for __global__ functions. __global__ functions are present on both host and device side, so providing __host__ or __device__ overloads is not going to do anything useful. llvm-svn: 261778	2016-02-24 21:54:45 +00:00
Justin Lebar	048a4c1ca9	[CUDA] Don't specify exact line numbers in cuda-builtin-vars.cu. This makes the test less fragile to changes to cuda_builtin_vars.h. Test-only change. llvm-svn: 261775	2016-02-24 21:49:30 +00:00
Artem Belevich	186091094a	[CUDA] Tweak attribute-based overload resolution to match nvcc behavior. This is an artefact of split-mode CUDA compilation that we need to mimic. HD functions are sometimes allowed to call H or D functions. Due to split compilation mode, device-side compilation will not see host-only functions and thus they will not be considered at all. For clang both H and D variants will become function overloads visible to the compiler. Normally the target attribute is considered only if C++ rules can not determine which function is better. However in this case we need to ignore functions that would not be present during the current compilation phase before we apply normal overload resolution rules. Changes: * introduced another level of call preference to better describe possible call combinations. * removed WrongSide functions from consideration if the set contains SameSide function. * disabled H->D, D->H and G->H calls. These combinations are not allowed by CUDA and we were reluctantly allowing them to work around device-side calls to math functions in std namespace. We no longer need it after r258880. Differential Revision: http://reviews.llvm.org/D16870 llvm-svn: 260697	2016-02-12 18:29:18 +00:00
Justin Lebar	1eac5948db	[CUDA] Add -fcuda-allow-variadic-functions. Summary: Turns out the variadic function checking added in r258643 was too strict for some existing users; give them an escape valve. When -fcuda-allow-variadic-functions is passed, the front-end makes no attempt to disallow C-style variadic functions. Calls to va_arg are still not allowed. Reviewers: tra Subscribers: cfe-commits, jhen, echristo, bkramer Differential Revision: http://reviews.llvm.org/D16559 llvm-svn: 258822	2016-01-26 17:47:20 +00:00
Justin Lebar	e48cd6c530	[CUDA] Disallow variadic functions other than printf in device code. Reviewers: tra Subscribers: cfe-commits, echristo, jhen Differential Revision: http://reviews.llvm.org/D16484 llvm-svn: 258643	2016-01-23 21:28:17 +00:00
Justin Lebar	a8f0254bc1	[CUDA] Reject the alias attribute in CUDA device code. Summary: CUDA (well, strictly speaking, NVPTX) doesn't support aliases. Reviewers: echristo Subscribers: cfe-commits, jhen, tra Differential Revision: http://reviews.llvm.org/D16502 llvm-svn: 258641	2016-01-23 21:28:10 +00:00
Justin Lebar	6644e366b0	[CUDA] Bail, rather than crash, on va_arg in device code. Reviewers: tra Subscribers: echristo, jhen, cfe-commits Differential Revision: http://reviews.llvm.org/D16331 llvm-svn: 258264	2016-01-20 00:27:00 +00:00
Justin Lebar	c66a10652a	[CUDA] Only allow __global__ on free functions and static member functions. Summary: Warn for NVCC compatibility if you declare a static member function or inline function as __global__. Reviewers: tra Subscribers: jhen, echristo, cfe-commits Differential Revision: http://reviews.llvm.org/D16261 llvm-svn: 258263	2016-01-20 00:26:57 +00:00
Justin Lebar	f8bdacbadc	[CUDA] Warn undeclared identifiers in CUDA kernel calls Value, type, and instantiation dependence were not being handled correctly for CUDAKernelCallExpr AST nodes. As a result, if an undeclared identifier was used in the triple-angle-bracket kernel call configuration, there would be no error during parsing, and there would be a crash during code gen. This patch makes sure that an error will be issued during parsing in this case, just as there would be for any other use of an undeclared identifier in C++. Patch by Jason Henline. Reviewers: jlebar, rsmith Differential Revision: http://reviews.llvm.org/D15858 llvm-svn: 257839	2016-01-14 23:31:30 +00:00
Justin Lebar	3eaaf86397	[CUDA] Report an error if code tries to mix incompatible CUDA attributes. Summary: Thanks to jhen for helping me figure this out. Reviewers: tra, echristo Subscribers: jhen Differential Revision: http://reviews.llvm.org/D16129 llvm-svn: 257554	2016-01-13 01:07:35 +00:00
Akira Hatanaka	8c26ea663d	Produce a better diagnostic for global register variables. Currently, when there is a global register variable in a program that is bound to an invalid register, clang/llvm prints an error message that is not very user-friendly. This commit improves the diagnostic and moves the check that used to be in the backend to Sema. In addition, it makes changes to error out if the size of the register doesn't match the declared variable size. e.g., volatile register int B asm ("rbp"); rdar://problem/23084219 Differential Revision: http://reviews.llvm.org/D13834 llvm-svn: 253405	2015-11-18 00:15:28 +00:00
Artem Belevich	5e2a3ecd48	[CUDA] use -aux-triple to pass target triple of opposite side of compilation Clang needs to know target triple for both sides of compilation so that preprocessor macros and target builtins from both sides are available. This change augments Compilation class to carry information about toolchains used during different CUDA compilation passes and refactors BuildActions to use it when it constructs CUDA jobs. Removed DeviceTriple from CudaHostAction/CudaDeviceAction as it's no longer needed. Differential Revision: http://reviews.llvm.org/D13144 llvm-svn: 253385	2015-11-17 22:28:40 +00:00
Artem Belevich	b5bc923af4	[CUDA] Allow parsing of host and device code simultaneously. * adds -aux-triple option to specify target triple * propagates aux target info to AST context and Preprocessor * pulls in target specific preprocessor macros. * pulls in target-specific builtins from aux target. * sets appropriate host or device attribute on builtins. Differential Revision: http://reviews.llvm.org/D12917 llvm-svn: 248299	2015-09-22 17:23:22 +00:00
Artem Belevich	9674a64cd9	[CUDA] Add appropriate host/device attribute to builtins. The changes are part of attribute-based CUDA function overloading (D12453) and as such are only enabled when it's in effect (-fcuda-target-overloads). Differential Revision: http://reviews.llvm.org/D12122 llvm-svn: 248296	2015-09-22 17:23:05 +00:00
Artem Belevich	94a55e8169	[CUDA] Allow function overloads in CUDA based on host/device attributes. The patch makes it possible to parse CUDA files that contain host/device functions with identical signatures, but different attributes without having to physically split source into host-only and device-only parts. This change is needed in order to parse CUDA header files that have a lot of name clashes with standard include files. Gory details are in design doc here: https://goo.gl/EXnymm Feel free to leave comments there or in this review thread. This feature is controlled with CC1 option -fcuda-target-overloads and is disabled by default. Differential Revision: http://reviews.llvm.org/D12453 llvm-svn: 248295	2015-09-22 17:22:59 +00:00
Artem Belevich	5ef02c2db7	[CUDA] Check register names on appropriate side of cuda compilation only. Differential Revision: http://reviews.llvm.org/D11950 llvm-svn: 246193	2015-08-27 19:54:21 +00:00
Artem Belevich	7230a22d5e	Revert r245496 "[CUDA] Add appropriate host/device attribute to builtins." It's breaking internal test. llvm-svn: 245592	2015-08-20 18:28:56 +00:00
Artem Belevich	39259ffc65	[CUDA] Add appropriate host/device attribute to builtins. Differential Revision: http://reviews.llvm.org/D12122 llvm-svn: 245496	2015-08-19 20:48:20 +00:00
Artem Belevich	194ba60fe2	[CUDA] Added stubs for new attributes used by CUDA headers. The main purpose is to avoid errors and warnings while parsing CUDA header files. The attributes are currently unused otherwise. Differential Revision: http://reviews.llvm.org/D11690 llvm-svn: 244497	2015-08-10 20:33:56 +00:00
Artem Belevich	a0473a5479	[cuda] Preserve TLS storage class of host variable even if it's a device-side compilation. llvm-svn: 236029	2015-04-28 20:31:49 +00:00
Artem Belevich	fa62ad4087	[cuda] Ignore "TLS unsupported by target" errors for host variables during device compilation. During device-side CUDA compilation clang currently complains about all TLS variables, regardless of whether they are __host__ or __device__. This patch suppresses "TLS unsupported" errors for host variables during device compilation and for device variables during host compilation. Differential Revision: http://reviews.llvm.org/D9269 llvm-svn: 235907	2015-04-27 19:37:53 +00:00
Artem Belevich	7093e40641	[cuda] Allow using integral non-type template parameters as launch_bounds attribute arguments. - Changed CUDALaunchBounds arguments from integers to Expr* so they can be saved in AST for instantiation. - Added support for template instantiation of launch_bounds attribute. - Moved evaluation of launch_bounds arguments to NVPTXTargetCodeGenInfo::SetTargetAttributes() where it can be done after template instantiation. - Added a warning on negative launch_bounds arguments. - Amended test cases. Differential Revision: http://reviews.llvm.org/D8985 llvm-svn: 235452	2015-04-21 22:55:54 +00:00
Artem Belevich	4e192df778	[cuda] Added support for CUDA built-in variables. Added cuda_builtin_vars.h which implements built-in CUDA variables using __declspec(property). Fields of built-in variables (except for warpSize) are implemented using __declspec(property) which replaces read/write of a member field with a call to a getter/setter member function, in this case with appropriate NVPTX builtin. Added a test case to check diagnostics on attempt to construct or improperly access a built-in variable. Differential Revision: http://reviews.llvm.org/D9064 llvm-svn: 235448	2015-04-21 22:14:13 +00:00
Artem Belevich	a050112bba	Revert r235398 "[cuda] Added support for CUDA built-in variables." r235398 was causing buildbot break due to missing Makefile changes. llvm-svn: 235401	2015-04-21 18:36:42 +00:00
Artem Belevich	d0a2ae054f	[cuda] Added support for CUDA built-in variables. Added cuda_builtin_vars.h which implements built-in CUDA variables using __declspec(property). Fields of built-in variables (except for warpSize) are implemented using __declspec(property) which replaces read/write of a member field with a call to a getter/setter member function, in this case with appropriate NVPTX builtin. Added a test case to check diagnostics on attempt to construct or improperly access a built-in variable. Differential Revision: http://reviews.llvm.org/D9064 llvm-svn: 235398	2015-04-21 17:39:06 +00:00
Eli Bendersky	4bdc50eccb	Create a frontend flag to disable CUDA cross-target call checks For CUDA source, Sema checks that the targets of call expressions make sense (e.g. a host function can't call a device function). Adding a flag that lets us skip this check. Motivation: for source-to-source translation tools that have to accept code that's not strictly kosher CUDA but is still accepted by nvcc. The source-to-source translation tool can then fix the code and leave calls that are semantically valid for the actual compilation stage. Differential Revision: http://reviews.llvm.org/D9036 llvm-svn: 235049	2015-04-15 22:27:06 +00:00
Artem Belevich	5196fe7c19	Ignore device-side asm constraint errors while compiling CUDA code for host and vice versa. Differential Revision: http://reviews.llvm.org/D8392 llvm-svn: 232747	2015-03-19 18:40:25 +00:00
Jacques Pienaar	a50178c23e	CUDA: Add option to allow host device functions to call host functions Committing code from review http://reviews.llvm.org/D7841 llvm-svn: 230385	2015-02-24 21:45:33 +00:00
Jacques Pienaar	5bdd67778f	Consider calls from implicit host device functions as valid in SemaCUDA. In SemaCUDA all implicit functions were considered host device, this led to errors such as the following code snippet failing to compile: struct Copyable { const Copyable& operator=(const Copyable& x) { return *this; } }; struct Simple { Copyable b; }; void foo() { Simple a, b; a = b; } Above the implicit copy assignment operator was inferred as host device but there was only a host assignment copy defined which is an error in device compilation mode. Differential Revision: http://reviews.llvm.org/D6565 llvm-svn: 224358	2014-12-16 20:12:38 +00:00
Matt Arsenault	b9e9dc5e89	Workaround attribute ordering issue with kernel only attributes Placing the attribute after the kernel keyword would incorrectly reject the attribute, so use the same workaround that other kernel only attributes use. Also add a FIXME because there are two different phrasings now for the same error, although amdgpu_num_[sv]gpr uses a consistent one. llvm-svn: 223490	2014-12-05 18:03:58 +00:00
Matt Arsenault	43fae6c855	Add attributes for AMDGPU register limits. This is a performance hint that can be applied to kernels to attempt to limit the number of used registers. llvm-svn: 223384	2014-12-04 20:38:18 +00:00
Reid Kleckner	bbc0178518	CUDA host device code with two code paths Summary: Allow CUDA host device functions with two code paths using __CUDA_ARCH__ to differentiate between code path being compiled. For example: __host__ __device__ void host_device_function(void) { #ifdef __CUDA_ARCH__ device_only_function(); #else host_only_function(); #endif } Patch by Jacques Pienaar. Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D6457 llvm-svn: 223271	2014-12-03 21:53:36 +00:00
Eli Bendersky	f2787a0dc0	CUDA: mark the target of implicit intrinsics properly r218624 implemented target inference for implicit special members. However, other entities can be implicit - for example intrinsics. These can not have inference running on them, so they should be marked host device as before. This is the safest and most flexible setting, since by construction these functions don't invoke anything, and we'd like them to be invokable from both host and device code. LLVM's intrinsics definitions (where these intrinsics come from in the case of CUDA/NVPTX) have no notion of target, so both host and device intrinsics can be supported this way. llvm-svn: 218688	2014-09-30 17:38:34 +00:00
Eli Bendersky	9a220fca4a	CUDA: Fix incorrect target inference for implicit members. As PR20495 demonstrates, Clang currently infers the CUDA target (host/device, etc) for implicit members (constructors, etc.) incorrectly. This causes errors and even assertions in Clang when compiling code (assertions in C++11 mode where implicit move constructors are added into the mix). Fix the problem by inferring the target from the methods the implicit member should call (depending on its base classes and fields). llvm-svn: 218624	2014-09-29 20:38:29 +00:00
Eli Bendersky	291a57e2c2	Fix PR20886 - enforce CUDA target match in method calls http://reviews.llvm.org/D5298 llvm-svn: 218482	2014-09-25 23:59:08 +00:00
Aaron Ballman	8ed8dbd96a	Automate attribute argument count semantic checking when there are variadic or optional arguments present. With this, the only time you should have to manually check attribute argument counts is when HasCustomParsing is set to true, or when you have variadic arguments that aren't really variadic (like ownership_holds and friends). Updating the diagnostics in the launch_bounds test since they have been improved in that case. Adding a test for nonnull since it has little test coverage, but has truly variadic arguments. llvm-svn: 214407	2014-07-31 16:37:04 +00:00
Eli Bendersky	3468d9d929	Move all CUDA testing inputs to Inputs/ subdirectory inside the tests. llvm-svn: 207453	2014-04-28 22:21:28 +00:00
Alp Toker	07508405f4	Disallow driver use in more Sema tests There are now only a handful of Sema tests remaining that use %clang in SemaCXX, SemaObjC and SemaTemplate. llvm-svn: 206688	2014-04-19 19:07:31 +00:00
Aaron Ballman	05e420abad	Updated the wording of two attribute-related diagnostics so that they print the offending attribute name. Also updates the associated test cases. llvm-svn: 198355	2014-01-02 21:26:14 +00:00
Aaron Ballman	66039937e8	Added a comment about the launch_bounds attribute's AST node being required. Since there were no test cases for the attribute, some have been added. This promptly demonstrated a bug with the semantic handling, which is also fixed. llvm-svn: 197637	2013-12-19 00:41:31 +00:00
Peter Collingbourne	7277fe8aed	CUDA: diagnose invalid calls across targets llvm-svn: 140978	2011-10-02 23:49:40 +00:00
Peter Collingbourne	619a8c7df3	CUDA: add separate diagnostics for too few/many exec config args llvm-svn: 140977	2011-10-02 23:49:29 +00:00
Peter Collingbourne	34a20b081e	CUDA: diagnose unconfigured calls to global functions llvm-svn: 140975	2011-10-02 23:49:15 +00:00
Peter Collingbourne	4b66c47a16	Sema: diagnose kernel calls to non-global functions llvm-svn: 126292	2011-02-23 01:53:29 +00:00
Peter Collingbourne	5eec5f0422	Parse: add support for parsing CUDA kernel calls llvm-svn: 125219	2011-02-09 21:12:02 +00:00
Peter Collingbourne	9e2c81f00a	AST, Sema, Serialization: keep track of cudaConfigureCall llvm-svn: 125216	2011-02-09 21:04:32 +00:00
Peter Collingbourne	e8cfaf4258	Sema: diagnose kernel functions with non-void return type llvm-svn: 121653	2010-12-12 23:02:57 +00:00
Peter Collingbourne	6ab610ce57	Basic, Sema: add support for CUDA location attributes llvm-svn: 120545	2010-12-01 03:15:31 +00:00


210 Commits