Commit Graph

32 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu 622eaa4a4c [HIP] Support __managed__ attribute
This patch implements codegen for the __managed__ variable attribute for HIP.

Diagnostics will be added later.
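
A minimal usage sketch of what this enables (the variable and kernel names are illustrative, not from the patch):

#include <hip/hip_runtime.h>

// A __managed__ variable is accessible from both host and device code; the
// runtime keeps the underlying memory coherent between them.
__managed__ int Counter = 0;

__global__ void bump() { ++Counter; }

int main() {
  bump<<<1, 1>>>();
  (void)hipDeviceSynchronize();  // wait for the kernel before reading on the host
  return Counter;                // host reads the value written on the device
}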

Differential Revision: https://reviews.llvm.org/D94814
2021-01-22 11:43:58 -05:00
Yaxun (Sam) Liu acb6f80d96 [CUDA][HIP] Fix overloading resolution
This patch implements correct hostness-based overload resolution
in isBetterOverloadCandidate.

Based on hostness, if one candidate is emittable whereas the other
candidate is not emittable, the emittable candidate is better.

If both candidates are emittable, or neither is emittable based on hostness, then
the other rules should be used to determine which is better. This is because
hostness-based overload resolution is mostly for determining the
viability of a function; if two functions are both viable, other factors
should take precedence.

If the other rules cannot determine which is better, CUDA preference is used
again as the tiebreaker.

However, correct hostness-based overload resolution
requires overload resolution diagnostics to be deferred,
which is not on by default. The rationale is that deferring
overload resolution diagnostics may hide overload resolution
issues in header files.

An option -fgpu-exclude-wrong-side-overloads is added, which is off by
default.

When -fgpu-exclude-wrong-side-overloads is off, the original behavior is kept,
that is, wrong-side overloads are excluded only if there are same-side overloads.
This may result in incorrect overload resolution when there are no
same-side candidates, but it is sufficient for most CUDA/HIP applications.

When -fgpu-exclude-wrong-side-overloads is on, deferred overload resolution
diagnostics are enabled together with correct hostness-based overload
resolution, i.e., wrong-side overloads are always excluded.
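
A rough sketch of the two situations described above (the overload set is hypothetical, not from the patch):

__host__   int pick(int x)  { return 1; }   // wrong-sided in device compilation
__device__ int pick(long x) { return 2; }   // same-sided in device compilation

__host__ __device__ int has_same_side() {
  // A same-side candidate exists, so the wrong-side candidate is excluded
  // with or without -fgpu-exclude-wrong-side-overloads: device compilation
  // picks pick(long) even though pick(int) is the closer C++ match.
  return pick(1);
}

__host__ int host_only(int x) { return 3; }

__host__ __device__ int no_same_side() {
  // Only a wrong-side candidate exists. By default it is kept, which is
  // sufficient for most applications; with -fgpu-exclude-wrong-side-overloads
  // it is excluded, and any resulting resolution failure is reported through
  // the deferred overload resolution diagnostics mentioned above.
  return host_only(1);
}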

Differential Revision: https://reviews.llvm.org/D80450
2020-12-02 16:33:33 -05:00
Yaxun (Sam) Liu 9275e14379 recommit 4fc752b30b [CUDA][HIP] Always defer diagnostics for wrong-sided reference
Fixed regression in test builtin-amdgcn-atomic-inc-dec-failure.cpp.
2020-07-17 09:14:39 -04:00
Yaxun (Sam) Liu a46ef7d42d Revert "[CUDA][HIP] Always defer diagnostics for wrong-sided reference"
This reverts commit 4fc752b30b.
2020-07-17 08:10:56 -04:00
Yaxun (Sam) Liu 4fc752b30b [CUDA][HIP] Always defer diagnostics for wrong-sided reference
When a device function calls a host function or vice versa, this is a wrong-sided
reference. Currently clang diagnoses it immediately. This differs from nvcc's
behavior, where the reference is diagnosed only if the function is actually emitted.

The current clang behavior causes false alarms for valid use cases.

This patch makes clang always defer diagnostics for wrong-sided
references.
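
A minimal sketch of a wrong-sided reference (the function names are illustrative):

__host__ int host_helper() { return 0; }

template <typename T>
__device__ T wrong_sided(T x) {
  // A __device__ function referencing a __host__ function. After this change
  // the diagnostic is deferred, so it is only reported if wrong_sided is
  // actually instantiated and emitted for the device.
  return host_helper() + x;
}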

Differential Revision: https://reviews.llvm.org/D83893
2020-07-17 07:51:55 -04:00
Fangrui Song dfc0d94755 Revert D80450 "[CUDA][HIP] Fix implicit HD function resolution"
This reverts commit 263390d4f5.

This can still cause bogus errors:

eigen3/Eigen/src/Core/CoreEvaluators.h:94:38: error: call to implicitly-deleted copy constructor of 'unary_evaluator<Eigen::Inverse<Eigen::Matrix<double, 4, 4, 0, 4, 4>>>'

thrust/system/detail/generic/for_each.h:49:3: error: implicit instantiation of undefined template
'thrust::detail::STATIC_ASSERTION_FAILURE<false>'
2020-06-10 17:42:28 -07:00
Yaxun (Sam) Liu 263390d4f5 [CUDA][HIP] Fix implicit HD function resolution
recommit e03394c6a6 with fix

When an implicit HD function calls a function in device compilation,
if one candidate is an implicit HD function, the current resolution rule is:

* D wins over HD and H
* HD and H are equal

This caused regressions when there was an otherwise worse D candidate.

This patch changes the rule to:

* D, HD and H are all equal

The rationale is that we already know host compilation has a valid candidate
among the HD and H candidates that will not cause an error, so allowing
HD and H gives us a fallback candidate that will not cause an error. If D wins,
that means D has to be a better match by the other rules, and therefore D should
also be a valid candidate that will not cause an error. In this way, we can
guarantee no regression.
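
A rough sketch of the kind of case this changes, assuming a constexpr function is treated as an implicit HD function (clang's default under -fcuda-host-device-constexpr); the names are illustrative:

__device__ int val(double x) { return 1; }   // D candidate, worse C++ match for an int
int val(int x) { return 2; }                 // H candidate, better C++ match

constexpr int implicit_hd_caller() {         // implicitly __host__ __device__
  // Old rule in device compilation: the D candidate wins even though val(int)
  // is the better C++ match. New rule: D, HD and H are ranked equally, so
  // ordinary overload resolution selects val(int), as in host compilation.
  return val(1);
}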

Differential Revision: https://reviews.llvm.org/D80450
2020-06-04 16:54:52 -04:00
Artem Belevich ef649e8fd5 Revert "[CUDA][HIP] Workaround for resolving host device function against wrong-sided function"
Still breaks CUDA compilation.

This reverts commit e03394c6a6.
2020-05-18 12:22:55 -07:00
Yaxun (Sam) Liu e03394c6a6 [CUDA][HIP] Workaround for resolving host device function against wrong-sided function
recommit c77a4078e0 with fix

https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit
host device functions.

For now, the most feasible workaround seems to be to treat implicit host device
functions and explicit host device functions differently. Basically, in device
compilation for implicit host device functions, keep the old behavior, i.e. give
host device candidates and wrong-sided candidates equal preference. For explicit
host device functions, favor host device candidates over wrong-sided candidates.

The rationale is that explicit host device functions are blessed by the user to be
valid host device functions, that is, they should not cause diagnostics in either
host or device compilation, and if diagnostics do occur, the user is able to fix
them. However, there is no guarantee that an implicit host device function can be
compiled in device compilation, therefore we need to preserve its overload
resolution behavior in device compilation.

Differential Revision: https://reviews.llvm.org/D79526
2020-05-12 08:27:50 -04:00
Artem Belevich bf6a26b066 Revert D77954 -- it breaks Eigen & Tensorflow.
This reverts commit 55bcb96f31.
2020-05-05 14:07:31 -07:00
Yaxun (Sam) Liu 55bcb96f31 recommit c77a4078e0 with fix
https://reviews.llvm.org/D77954 caused a regression involving ambiguity of operator
new at file scope.

This patch restores the previous behavior for comparisons without a caller.

This is a workaround. The real fix requires D71227.

https://reviews.llvm.org/D78970
2020-04-28 09:14:13 -04:00
Dmitri Gribenko 8c8aae852b Revert "recommit c77a4078e01033aa2206c31a579d217c8a07569b"
This reverts commit b46b1a916d. It broke
overload resolution for operator 'new' -- see reproducer in
https://reviews.llvm.org/D77954.
2020-04-27 16:41:35 +02:00
Yaxun (Sam) Liu b46b1a916d recommit c77a4078e0 2020-04-24 16:53:18 -04:00
Yaxun (Sam) Liu 7eae00477f Revert "[CUDA][HIP] Fix host/device based overload resolution"
This reverts commit c77a4078e0.
2020-04-24 14:57:10 -04:00
Yaxun (Sam) Liu c77a4078e0 [CUDA][HIP] Fix host/device based overload resolution
Currently clang fails to compile the following CUDA program in device compilation:

__host__ int foo(int x) {
     return 1;
}

template<class T>
__device__ __host__ int foo(T x) {
    return 2;
}

__device__ __host__ int bar() {
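    // In device compilation this call should pick the __device__ __host__
    // template above; before this fix clang resolved it to the __host__ foo.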
    return foo(1);
}

__global__ void test(int *a) {
    *a = bar();
}

This is because foo is resolved to the __host__ foo instead of the __device__ __host__ foo.
This seems to be a bug, since the __device__ __host__ foo is a viable callee whereas
clang is unable to choose it.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D77954
2020-04-24 14:55:18 -04:00
Michael Liao b8fc6a9116 [CUDA][HIP] Re-apply part of r372318.
- r372318 causes a `use-of-uninitialized-value` violation detected by
  MemorySanitizer. Once the `Viable` field is set to false, `FailureKind`
  needs to be set as well, as it will be checked during destruction if
  `Viable` is not true.
- Revert the part trying to skip `std::vector` erasing.

llvm-svn: 372356
2019-09-19 21:26:18 +00:00
Mitch Phillips 08f938bd1a Revert "[CUDA][HIP] Fix typo in `BestViableFunction`"
Broke the msan buildbots (see comments on rL372318 for more details).

This reverts commit eb231d1582.

llvm-svn: 372353
2019-09-19 21:11:28 +00:00
Michael Liao eb231d1582 [CUDA][HIP] Fix typo in `BestViableFunction`
Summary:
- Should consider only viable candidates when checking SameSide candidates.
- Replace erasing with clearing the viable flag to reduce data
  moving/copying.
- Add one diagnostic and revise another, as the new messages are more
  relevant than the previous one.

Reviewers: tra

Subscribers: cfe-commits, yaxunl

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67730

llvm-svn: 372318
2019-09-19 13:14:03 +00:00
Richard Trieu b402580616 Fix some handling of AST nodes with diagnostics.
The diagnostic system for Clang can already handle many AST nodes.  Instead
of converting them to strings first, just hand the AST node directly to
the diagnostic system and let it handle the output.  Minor changes in some
diagnostic output.

llvm-svn: 328688
2018-03-28 04:16:13 +00:00
Richard Smith 6c716116df PR34163: Don't cache an incorrect key function for a class if queried between
the class becoming complete and its inline methods being parsed.

This replaces the hack of using the "late parsed template" flag to track member
functions with bodies we've not parsed yet; instead we now use the "will have
body" flag, which carries the desired implication that the function declaration
*is* a definition, and that we've just not parsed its body yet.

llvm-svn: 310776
2017-08-12 01:46:03 +00:00
Artem Belevich 13e9b4d768 [CUDA] Improve target attribute checking for function templates.
* __host__ __device__ functions are no longer considered to be
  redeclarations of __host__ or __device__ functions. This prevents
  unintentional merging of target attributes across them.
* Function target attributes are not considered (and must match) during
  explicit instantiation and specialization of function templates.
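
A rough sketch of the second point (the template and its specialization are hypothetical):

template <typename T> __host__ __device__ T twice(T x) { return x + x; }

// Target attributes are not used to decide which template is being
// specialized, but the specialization's attributes must match the template's;
// a specialization with mismatching target attributes is diagnosed.
template <> __host__ __device__ int twice<int>(int x) { return 2 * x; }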

Differential Revision: https://reviews.llvm.org/D25809

llvm-svn: 288962
2016-12-07 19:27:16 +00:00
Justin Lebar d3fd70dedd [CUDA] Rework tests now that we emit deferred diagnostics during sema. Test-only change.
Summary:
Previously we had to split out a lot of our tests into a test that
checked only immediate errors and a test that checked only deferred
errors.  This was because, if you emitted any immediate errors, we
wouldn't run codegen, where the deferred errors were emitted.

We've fixed this, and now emit deferred errors during sema.  This lets
us merge a bunch of tests, and lets us convert some other tests to
-fsyntax-only.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25755

llvm-svn: 284553
2016-10-19 00:06:49 +00:00
Justin Lebar 23d954241b [CUDA] Emit deferred diagnostics during Sema rather than during codegen.
Summary:
Emitting deferred diagnostics during codegen was a hack.  It did work,
but usability was poor, both for us as compiler devs and for users.  We
don't codegen if there are any sema errors, so for users this meant that
they wouldn't see deferred errors if there were any non-deferred errors.
For devs, this meant that we had to carefully split up our tests so that
when we tested deferred errors, we didn't emit any non-deferred errors.

This change moves checking for deferred errors into Sema.  See the big
comment in SemaCUDA.cpp for an overview of the idea.

This checking adds overhead to compilation, because we have to maintain
a partial call graph.  As a result, this change makes deferred errors a
CUDA-only concept (whereas before they were a general concept).  If
anyone else wants to use this framework for something other than CUDA,
we can generalize at that time.

This patch makes the minimal set of test changes -- after this lands,
I'll go back through and do a cleanup of the tests that we no longer
have to split up.

Reviewers: rnk

Subscribers: cfe-commits, rsmith, tra

Differential Revision: https://reviews.llvm.org/D25541

llvm-svn: 284158
2016-10-13 20:52:12 +00:00
Justin Lebar 0254c46300 [CUDA] Make touching a kernel from a __host__ __device__ function a deferred error.
Previously, this was an immediate, don't pass go, don't collect $200
error.  But this precludes us from writing code like

  __host__ __device__ void launch_kernel() {
    kernel<<<...>>>();
  }

Such code isn't wrong, following our notions of right and wrong in CUDA,
unless it's codegen'ed.

llvm-svn: 283963
2016-10-12 01:30:08 +00:00
Justin Lebar e060feb7b1 [CUDA] Disallow overloading destructors.
Summary:
We'd attempted to allow this, but it turns out we were doing a very bad
job.  :)

Making this work properly would be a giant change in clang.  For
example, we'd need to make CXXRecordDecl::getDestructor()
context-sensitive, because the destructor you end up with depends on
where you're calling it from.

For now (and hopefully for ever), just disallow overloading of
destructors in CUDA.
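
A minimal sketch of what is now rejected (the class is hypothetical):

struct S {
  __host__ ~S() {}
  __device__ ~S() {}   // overloading the destructor on target attributes is
                       // no longer allowed in CUDA
};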

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D24571

llvm-svn: 283120
2016-10-03 16:48:23 +00:00
Artem Belevich bed18e9cc4 [CUDA] Do not merge CUDA target attributes.
CUDA target attributes are used for function overloading and must not be merged.

This fixes a bug where attributes were inherited during function template
specialization in CUDA and made it impossible for specialized function
to provide its own target attributes.

Differential Revision: https://reviews.llvm.org/D24522

llvm-svn: 281406
2016-09-13 22:16:30 +00:00
Justin Lebar 25c4a81e79 [CUDA] Remove three obsolete CUDA cc1 flags.
Summary:
* -fcuda-target-overloads

  Previously unconditionally set to true by the driver.  Necessary for
  correct functioning of the compiler -- our CUDA headers wrapper won't
  compile without this.

* -fcuda-disable-target-call-checks

  Previously unconditionally set to true by the driver.  Necessary to
  compile almost any external CUDA code -- almost all libraries assume
  that host+device code can call host or device functions.

* -fcuda-allow-host-calls-from-host-device

  No effect when target overloading is enabled.

Reviewers: tra

Subscribers: rsmith, cfe-commits

Differential Revision: http://reviews.llvm.org/D18416

llvm-svn: 264739
2016-03-29 16:24:16 +00:00
Justin Lebar e5eed04d52 [CUDA] Merge most of CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu.
Summary:
Previously we were using the codegen test to ensure that we choose the
right overload.  But we can do this within sema, with a bit of
cleverness.

I left the constructor/destructor checks in CodeGen, because these
overloads (particularly on the destructors) are hard to check in Sema.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18386

llvm-svn: 264207
2016-03-23 22:42:30 +00:00
Justin Lebar e82caa3055 [CUDA] Simplify SemaCUDA/function-overload.cu test.
Summary:
Principally, don't hardcode the line numbers of various notes.  This
lets us make changes to the test without recomputing linenos everywhere.

Instead, just tell -verify that we may get 0 or more notes pointing to
the relevant function definitions.  Checking that we get exactly the
right note isn't so important (and anyway is checked elsewhere).

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18385

llvm-svn: 264206
2016-03-23 22:42:28 +00:00
Artem Belevich 1ef9b59284 [CUDA] do not allow attribute-based overloading for __global__ functions.
__global__ functions are present on both host and device side,
so providing __host__ or __device__ overloads is not going to
do anything useful.
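
A minimal sketch of the overloads this disallows (names hypothetical):

__global__ void step(int *p) { *p += 1; }

// A kernel is already callable from the host and runs on the device, so a
// __host__ or __device__ overload of it adds nothing and is now rejected.
__host__ void step(int *p);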

llvm-svn: 261778
2016-02-24 21:54:45 +00:00
Artem Belevich 186091094a [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.
This is an artefact of split-mode CUDA compilation that we need to
mimic. HD functions are sometimes allowed to call H or D functions. Due
to the split compilation mode, device-side compilation will not see host-only
functions and thus they will not be considered at all. For clang, both H
and D variants become function overloads visible to the
compiler. Normally the target attribute is considered only if the C++ rules
cannot determine which function is better. However, in this case we need to
ignore functions that would not be present during the current compilation
phase before we apply the normal overload resolution rules.

Changes:
* introduced another level of call preference to better describe
  possible call combinations.
* removed WrongSide functions from consideration if the set contains
  SameSide function.
* disabled H->D, D->H and G->H calls. These combinations are
  not allowed by CUDA, and we were reluctantly allowing them to work
  around device-side calls to math functions in the std namespace.
  We no longer need this after r258880.
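
A minimal sketch of the now-disabled call directions (names hypothetical):

__host__   int on_host()   { return 0; }
__device__ int on_device() { return 1; }

__device__ int d_calls_h() { return on_host(); }    // D->H: no longer allowed
__host__   int h_calls_d() { return on_device(); }  // H->D: no longer allowed
__global__ void g_calls_h() { on_host(); }          // G->H: no longer allowed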

Differential Revision: http://reviews.llvm.org/D16870

llvm-svn: 260697
2016-02-12 18:29:18 +00:00
Artem Belevich 94a55e8169 [CUDA] Allow function overloads in CUDA based on host/device attributes.
The patch makes it possible to parse CUDA files that contain host/device
functions with identical signatures but different attributes, without
having to physically split the source into host-only and device-only parts.
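
A minimal sketch of what becomes parseable (names hypothetical):

__host__   int answer() { return 42; }   // host variant
__device__ int answer() { return 7; }    // device variant, same signature

__host__   int host_use()   { return answer(); }   // resolves to the __host__ variant
__device__ int device_use() { return answer(); }   // resolves to the __device__ variant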

This change is needed in order to parse CUDA header files that have
a lot of name clashes with standard include files.

Gory details are in design doc here: https://goo.gl/EXnymm
Feel free to leave comments there or in this review thread.

This feature is controlled with CC1 option -fcuda-target-overloads
and is disabled by default.

Differential Revision: http://reviews.llvm.org/D12453

llvm-svn: 248295
2015-09-22 17:22:59 +00:00