llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	622eaa4a4c	[HIP] Support __managed__ attribute This patch implements codegen for __managed__ variable attribute for HIP. Diagnostics will be added later. Differential Revision: https://reviews.llvm.org/D94814	2021-01-22 11:43:58 -05:00
Yaxun (Sam) Liu	acb6f80d96	[CUDA][HIP] Fix overloading resolution This patch implements correct hostness based overloading resolution in isBetterOverloadCandidate. Based on hostness, if one candidate is emittable whereas the other candidate is not emittable, the emittable candidate is better. If both candidates are emittable, or neither is emittable based on hostness, then other rules should be used to determine which is better. This is because hostness based overloading resolution is mostly for determining viability of a function. If two functions are both viable, other factors should take precedence in preference. If other rules cannot determine which is better, CUDA preference will be used again to determine which is better. However, correct hostness based overloading resolution requires overloading resolution diagnostics to be deferred, which is not on by default. The rationale is that deferring overloading resolution diagnostics may hide overloading resolution issues in header files. An option -fgpu-exclude-wrong-side-overloads is added, which is off by default. When -fgpu-exclude-wrong-side-overloads is off, keep the original behavior, that is, exclude wrong side overloads only if there are same side overloads. This may result in incorrect overloading resolution when there are no same side candidates, but is sufficient for most CUDA/HIP applications. When -fgpu-exclude-wrong-side-overloads is on, enable deferring overloading resolution diagnostics and enable correct hostness based overloading resolution, i.e., always exclude wrong side overloads. Differential Revision: https://reviews.llvm.org/D80450	2020-12-02 16:33:33 -05:00
Yaxun (Sam) Liu	9275e14379	recommit `4fc752b30b` [CUDA][HIP] Always defer diagnostics for wrong-sided reference Fixed regression in test builtin-amdgcn-atomic-inc-dec-failure.cpp.	2020-07-17 09:14:39 -04:00
Yaxun (Sam) Liu	a46ef7d42d	Revert "[CUDA][HIP] Always defer diagnostics for wrong-sided reference" This reverts commit `4fc752b30b`.	2020-07-17 08:10:56 -04:00
Yaxun (Sam) Liu	4fc752b30b	[CUDA][HIP] Always defer diagnostics for wrong-sided reference When a device function calls a host function or vice versa, this is a wrong-sided reference. Currently clang immediately diagnoses it. This is different from nvcc behavior, where it is diagnosed only if the function is really emitted. Current clang behavior causes false alarms for valid use cases. This patch lets clang always defer diagnostics for wrong-sided reference. Differential Revision: https://reviews.llvm.org/D83893	2020-07-17 07:51:55 -04:00
Fangrui Song	dfc0d94755	Revert D80450 "[CUDA][HIP] Fix implicit HD function resolution" This reverts commit `263390d4f5`. This can still cause bogus errors: eigen3/Eigen/src/Core/CoreEvaluators.h:94:38: error: call to implicitly-deleted copy constructor of 'unary_evaluator<Eigen::Inverse<Eigen::Matrix<double, 4, 4, 0, 4, 4>>>' thrust/system/detail/generic/for_each.h:49:3: error: implicit instantiation of undefined template 'thrust::detail::STATIC_ASSERTION_FAILURE<false>'	2020-06-10 17:42:28 -07:00
Yaxun (Sam) Liu	263390d4f5	[CUDA][HIP] Fix implicit HD function resolution recommit `e03394c6a6` with fix. When implicit HD function calls a function in device compilation, if one candidate is an implicit HD function, current resolution rule is: D wins over HD and H; HD and H are equal. This caused regression when there is an otherwise worse D candidate. This patch changes that to: D, HD and H are all equal. The rationale is that we already know for host compilation there is already a valid candidate in HD and H candidates that will not cause error. Allowing HD and H gives us a fall back candidate that will not cause error. If D wins, that means D has to be a better match otherwise, therefore D should also be a valid candidate that will not cause error. In this way, we can guarantee no regression. Differential Revision: https://reviews.llvm.org/D80450	2020-06-04 16:54:52 -04:00
Artem Belevich	ef649e8fd5	Revert "[CUDA][HIP] Workaround for resolving host device function against wrong-sided function" Still breaks CUDA compilation. This reverts commit `e03394c6a6`.	2020-05-18 12:22:55 -07:00
Yaxun (Sam) Liu	e03394c6a6	[CUDA][HIP] Workaround for resolving host device function against wrong-sided function recommit `c77a4078e0` with fix https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit host device functions. For now, it seems the most feasible workaround is to treat implicit host device function and explicit host device function differently. Basically in device compilation for implicit host device functions, keep the old behavior, i.e. give host device candidates and wrong-sided candidates equal preference. For explicit host device functions, favor host device candidates against wrong-sided candidates. The rationale is that explicit host device functions are blessed by the user to be valid host device functions, that is, they should not cause diagnostics in both host and device compilation. If diagnostics occur, user is able to fix them. However, there is no guarantee that implicit host device function can be compiled in device compilation, therefore we need to preserve its overloading resolution in device compilation. Differential Revision: https://reviews.llvm.org/D79526	2020-05-12 08:27:50 -04:00
Artem Belevich	bf6a26b066	Revert D77954 -- it breaks Eigen & Tensorflow. This reverts commit `55bcb96f31`.	2020-05-05 14:07:31 -07:00
Yaxun (Sam) Liu	55bcb96f31	recommit `c77a4078e0` with fix https://reviews.llvm.org/D77954 caused a regression about ambiguity of new operator in file scope. This patch recovered the previous behavior for comparison without a caller. This is a workaround. For real fix we need D71227 https://reviews.llvm.org/D78970	2020-04-28 09:14:13 -04:00
Dmitri Gribenko	8c8aae852b	Revert "recommit c77a4078e01033aa2206c31a579d217c8a07569b" This reverts commit `b46b1a916d`. It broke overload resolution for operator 'new' -- see reproducer in https://reviews.llvm.org/D77954.	2020-04-27 16:41:35 +02:00
Yaxun (Sam) Liu	b46b1a916d	recommit `c77a4078e0`	2020-04-24 16:53:18 -04:00
Yaxun (Sam) Liu	7eae00477f	Revert "[CUDA][HIP] Fix host/device based overload resolution" This reverts commit `c77a4078e0`.	2020-04-24 14:57:10 -04:00
Yaxun (Sam) Liu	c77a4078e0	[CUDA][HIP] Fix host/device based overload resolution Currently clang fails to compile the following CUDA program in device compilation: `__host__ int foo(int x) { return 1; } template<class T> __device__ __host__ int foo(T x) { return 2; } __device__ __host__ int bar() { return foo(1); } __global__ void test(int a) { a = bar(); }` This is because foo is resolved to the __host__ foo instead of the __device__ __host__ foo. This seems to be a bug since __device__ __host__ foo is a viable callee for foo whereas clang is unable to choose it. This patch fixes that. Differential Revision: https://reviews.llvm.org/D77954	2020-04-24 14:55:18 -04:00
Michael Liao	b8fc6a9116	[CUDA][HIP] Re-apply part of r372318. - r372318 causes violation of `use-of-uninitialized-value` detected by MemorySanitizer. Once `Viable` field is set to false, `FailureKind` needs setting as well as it will be checked during destruction if `Viable` is not true. - Revert the part trying to skip `std::vector` erasing. llvm-svn: 372356	2019-09-19 21:26:18 +00:00
Mitch Phillips	08f938bd1a	Revert "[CUDA][HIP] Fix typo in `BestViableFunction`" Broke the msan buildbots (see comments on rL372318 for more details). This reverts commit `eb231d1582`. llvm-svn: 372353	2019-09-19 21:11:28 +00:00
Michael Liao	eb231d1582	[CUDA][HIP] Fix typo in `BestViableFunction` Summary: - Should consider viable ones only when checking SameSide candidates. - Replace erasing with clearing viable flag to reduce data moving/copying. - Add one and revise another one as the diagnostic message are more relevant compared to previous one. Reviewers: tra Subscribers: cfe-commits, yaxunl Tags: #clang Differential Revision: https://reviews.llvm.org/D67730 llvm-svn: 372318	2019-09-19 13:14:03 +00:00
Richard Trieu	b402580616	Fix some handling of AST nodes with diagnostics. The diagnostic system for Clang can already handle many AST nodes. Instead of converting them to strings first, just hand the AST node directly to the diagnostic system and let it handle the output. Minor changes in some diagnostic output. llvm-svn: 328688	2018-03-28 04:16:13 +00:00
Richard Smith	6c716116df	PR34163: Don't cache an incorrect key function for a class if queried between the class becoming complete and its inline methods being parsed. This replaces the hack of using the "late parsed template" flag to track member functions with bodies we've not parsed yet; instead we now use the "will have body" flag, which carries the desired implication that the function declaration is a definition, and that we've just not parsed its body yet. llvm-svn: 310776	2017-08-12 01:46:03 +00:00
Artem Belevich	13e9b4d768	[CUDA] Improve target attribute checking for function templates. * __host__ __device__ functions are no longer considered to be redeclarations of __host__ or __device__ functions. This prevents unintentional merging of target attributes across them. * Function target attributes are not considered (and must match) during explicit instantiation and specialization of function templates. Differential Revision: https://reviews.llvm.org/D25809 llvm-svn: 288962	2016-12-07 19:27:16 +00:00
Justin Lebar	d3fd70dedd	[CUDA] Rework tests now that we emit deferred diagnostics during sema. Test-only change. Summary: Previously we had to split out a lot of our tests into a test that checked only immediate errors and a test that checked only deferred errors. This was because, if you emitted any immediate errors, we wouldn't run codegen, where the deferred errors were emitted. We've fixed this, and now emit deferred errors during sema. This lets us merge a bunch of tests, and lets us convert some other tests to -fsyntax-only. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25755 llvm-svn: 284553	2016-10-19 00:06:49 +00:00
Justin Lebar	23d954241b	[CUDA] Emit deferred diagnostics during Sema rather than during codegen. Summary: Emitting deferred diagnostics during codegen was a hack. It did work, but usability was poor, both for us as compiler devs and for users. We don't codegen if there are any sema errors, so for users this meant that they wouldn't see deferred errors if there were any non-deferred errors. For devs, this meant that we had to carefully split up our tests so that when we tested deferred errors, we didn't emit any non-deferred errors. This change moves checking for deferred errors into Sema. See the big comment in SemaCUDA.cpp for an overview of the idea. This checking adds overhead to compilation, because we have to maintain a partial call graph. As a result, this change makes deferred errors a CUDA-only concept (whereas before they were a general concept). If anyone else wants to use this framework for something other than CUDA, we can generalize at that time. This patch makes the minimal set of test changes -- after this lands, I'll go back through and do a cleanup of the tests that we no longer have to split up. Reviewers: rnk Subscribers: cfe-commits, rsmith, tra Differential Revision: https://reviews.llvm.org/D25541 llvm-svn: 284158	2016-10-13 20:52:12 +00:00
Justin Lebar	0254c46300	[CUDA] Make touching a kernel from a __host__ __device__ function a deferred error. Previously, this was an immediate, don't pass go, don't collect $200 error. But this precludes us from writing code like __host__ __device__ void launch_kernel() { kernel<<<...>>>(); } Such code isn't wrong, following our notions of right and wrong in CUDA, unless it's codegen'ed. llvm-svn: 283963	2016-10-12 01:30:08 +00:00
Justin Lebar	e060feb7b1	[CUDA] Disallow overloading destructors. Summary: We'd attempted to allow this, but turns out we were doing a very bad job. :) Making this work properly would be a giant change in clang. For example, we'd need to make CXXRecordDecl::getDestructor() context-sensitive, because the destructor you end up with depends on where you're calling it from. For now (and hopefully for ever), just disallow overloading of destructors in CUDA. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D24571 llvm-svn: 283120	2016-10-03 16:48:23 +00:00
Artem Belevich	bed18e9cc4	[CUDA] Do not merge CUDA target attributes. CUDA target attributes are used for function overloading and must not be merged. This fixes a bug where attributes were inherited during function template specialization in CUDA and made it impossible for specialized function to provide its own target attributes. Differential Revision: https://reviews.llvm.org/D24522 llvm-svn: 281406	2016-09-13 22:16:30 +00:00
Justin Lebar	25c4a81e79	[CUDA] Remove three obsolete CUDA cc1 flags. Summary: * -fcuda-target-overloads Previously unconditionally set to true by the driver. Necessary for correct functioning of the compiler -- our CUDA headers wrapper won't compile without this. * -fcuda-disable-target-call-checks Previously unconditionally set to true by the driver. Necessary to compile almost any external CUDA code -- almost all libraries assume that host+device code can call host or device functions. * -fcuda-allow-host-calls-from-host-device No effect when target overloading is enabled. Reviewers: tra Subscribers: rsmith, cfe-commits Differential Revision: http://reviews.llvm.org/D18416 llvm-svn: 264739	2016-03-29 16:24:16 +00:00
Justin Lebar	e5eed04d52	[CUDA] Merge most of CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu. Summary: Previously we were using the codegen test to ensure that we choose the right overload. But we can do this within sema, with a bit of cleverness. I left the constructor/destructor checks in CodeGen, because these overloads (particularly on the destructors) are hard to check in Sema. Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18386 llvm-svn: 264207	2016-03-23 22:42:30 +00:00
Justin Lebar	e82caa3055	[CUDA] Simplify SemaCUDA/function-overload.cu test. Summary: Principally, don't hardcode the line numbers of various notes. This lets us make changes to the test without recomputing linenos everywhere. Instead, just tell -verify that we may get 0 or more notes pointing to the relevant function definitions. Checking that we get exactly the right note isn't so important (and anyway is checked elsewhere). Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18385 llvm-svn: 264206	2016-03-23 22:42:28 +00:00
Artem Belevich	1ef9b59284	[CUDA] do not allow attribute-based overloading for __global__ functions. __global__ functions are present on both host and device side, so providing __host__ or __device__ overloads is not going to do anything useful. llvm-svn: 261778	2016-02-24 21:54:45 +00:00
Artem Belevich	186091094a	[CUDA] Tweak attribute-based overload resolution to match nvcc behavior. This is an artefact of split-mode CUDA compilation that we need to mimic. HD functions are sometimes allowed to call H or D functions. Due to split compilation mode device-side compilation will not see host-only function and thus they will not be considered at all. For clang both H and D variants will become function overloads visible to compiler. Normally target attribute is considered only if C++ rules can not determine which function is better. However in this case we need to ignore functions that would not be present during current compilation phase before we apply normal overload resolution rules. Changes: * introduced another level of call preference to better describe possible call combinations. * removed WrongSide functions from consideration if the set contains SameSide function. * disabled H->D, D->H and G->H calls. These combinations are not allowed by CUDA and we were reluctantly allowing them to work around device-side calls to math functions in std namespace. We no longer need it after r258880. Differential Revision: http://reviews.llvm.org/D16870 llvm-svn: 260697	2016-02-12 18:29:18 +00:00
Artem Belevich	94a55e8169	[CUDA] Allow function overloads in CUDA based on host/device attributes. The patch makes it possible to parse CUDA files that contain host/device functions with identical signatures, but different attributes without having to physically split source into host-only and device-only parts. This change is needed in order to parse CUDA header files that have a lot of name clashes with standard include files. Gory details are in design doc here: https://goo.gl/EXnymm Feel free to leave comments there or in this review thread. This feature is controlled with CC1 option -fcuda-target-overloads and is disabled by default. Differential Revision: http://reviews.llvm.org/D12453 llvm-svn: 248295	2015-09-22 17:22:59 +00:00

32 Commits