Commit Graph

91 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu e355110040 [CUDA][HIP] Fix checking dependent initalizer
Defer constant checking of dependent initializer to template instantiation
since it cannot be done for dependent values.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D95840
2021-02-04 18:04:54 -05:00
Artem Belevich 127091bfd5 [CUDA] Normalize handling of defauled dtor.
Defaulted destructor was treated inconsistently, compared to other
compiler-generated functions.

When Sema::IdentifyCUDATarget() got called on just-created dtor which didn't
have implicit __host__ __device__ attributes applied yet, it would treat it as a
host function.  That happened to (sometimes) hide the error when dtor referred
to a host-only functions.

Even when we had identified defaulted dtor as a HD function, we still treated it
inconsistently during selection of usual deallocators, where we did not allow
referring to wrong-side functions, while it is allowed for other HD functions.

This change brings handling of defaulted dtors in line with other HD functions.

Differential Revision: https://reviews.llvm.org/D94732
2021-01-21 10:48:07 -08:00
Yaxun (Sam) Liu 5c8911d0ba [CUDA][HIP] Diagnose reference of host variable
This patch diagnoses invalid references of global host variables in device,
global, or host device functions.

Differential Revision: https://reviews.llvm.org/D91281
2020-12-02 10:15:56 -05:00
Artem Belevich be86b6773b [CUDA] Allow local static variables with target attributes.
While CUDA documentation claims that such variables are not allowed[1], NVCC has
been accepting them since CUDA-10.0[2] and some headers in CUDA-11 rely on this
working.

1. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#static-variables-function
2. https://godbolt.org/z/zsodzc

Differential Revision: https://reviews.llvm.org/D88345
2020-11-03 10:30:38 -08:00
Yaxun (Sam) Liu 52bcd691cb Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This recommits 7f1f89ec8d and
40df06cdaf with bug fixes for
memory sanitizer failure and Tensile build failure.
2020-10-19 17:48:04 -04:00
Reid Kleckner 3453b6928d Revert "Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions""
This reverts commit e39da8ab6a.

This depends on a change that needs additional design review and needs
to be reverted.
2020-09-24 11:16:54 -07:00
Yaxun (Sam) Liu e39da8ab6a Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This recommits 7f1f89ec8d and
40df06cdaf after fixing memory
sanitizer failure.
2020-09-24 08:44:37 -04:00
Yaxun (Sam) Liu 772bd8a7d9 Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This reverts commit 7f1f89ec8d.

This reverts commit 40df06cdaf.
2020-09-17 13:55:31 -04:00
Yaxun (Sam) Liu 40df06cdaf [CUDA][HIP] Defer overloading resolution diagnostics for host device functions
In CUDA/HIP a function may become implicit host device function by
pragma or constexpr. A host device function is checked in both
host and device compilation. However it may be emitted only
on host or device side, therefore the diagnostics should be
deferred until it is known to be emitted.

Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.

This patch lets clang defer all overloading resolution diagnostics for host device functions.

An option -fgpu-defer-diag is added to control this behavior. By default
it is off.

It is NFC for other languages.

Differential Revision: https://reviews.llvm.org/D84364
2020-09-17 11:30:42 -04:00
Yaxun (Sam) Liu 9275e14379 recommit 4fc752b30b [CUDA][HIP] Always defer diagnostics for wrong-sided reference
Fixed regression in test builtin-amdgcn-atomic-inc-dec-failure.cpp.
2020-07-17 09:14:39 -04:00
Yaxun (Sam) Liu a46ef7d42d Revert "[CUDA][HIP] Always defer diagnostics for wrong-sided reference"
This reverts commit 4fc752b30b.
2020-07-17 08:10:56 -04:00
Yaxun (Sam) Liu 4fc752b30b [CUDA][HIP] Always defer diagnostics for wrong-sided reference
When a device function calls a host function or vice versa, this is wrong-sided
reference. Currently clang immediately diagnose it. This is different from nvcc
behavior, where it is diagnosed only if the function is really emitted.

Current clang behavior causes false alarms for valid use cases.

This patch let clang always defer diagnostics for wrong-sided
reference.

Differential Revision: https://reviews.llvm.org/D83893
2020-07-17 07:51:55 -04:00
Yaxun (Sam) Liu 1eaad01046 [CUDA][HIP] Let lambda be host device by default
This patch let lambda be host device by default and adds diagnostics for
capturing host variable by reference in device lambda.

Differential Revision: https://reviews.llvm.org/D78655
2020-07-08 13:10:26 -04:00
Fangrui Song b3d10920e1 Restore part of D80450 [CUDA][HIP] Fix implicit HD function resolution
The "if (S.getLangOpts().CUDA && Cand1.Function && Cand2.Function) {"
part is known to be problematic but the root cause isn't clear yet.
2020-06-10 22:33:33 -07:00
Fangrui Song dfc0d94755 Revert D80450 "[CUDA][HIP] Fix implicit HD function resolution"
This reverts commit 263390d4f5.

This can still cause bogus errors:

eigen3/Eigen/src/Core/CoreEvaluators.h:94:38: error: call to implicitly-deleted copy constructor of 'unary_evaluator<Eigen::Inverse<Eigen::Matrix<double, 4, 4, 0, 4, 4>>>'

thrust/system/detail/generic/for_each.h:49:3: error: implicit instantiation of undefined template
'thrust::detail::STATIC_ASSERTION_FAILURE<false>'
2020-06-10 17:42:28 -07:00
Yaxun (Sam) Liu 263390d4f5 [CUDA][HIP] Fix implicit HD function resolution
recommit e03394c6a6 with fix

When implicit HD function calls a function in device compilation,
if one candidate is an implicit HD function, current resolution rule is:

D wins over HD and H
HD and H are equal

this caused regression when there is an otherwise worse D candidate

This patch changes that to

D, HD and H are all equal

The rationale is that we already know for host compilation there is already
a valid candidate in HD and H candidates that will not cause error. Allowing
HD and H gives us a fall back candidate that will not cause error. If D wins,
that means D has to be a better match otherwise, therefore D should also
be a valid candidate that will not cause error. In this way, we can guarantee
no regression.

Differential Revision: https://reviews.llvm.org/D80450
2020-06-04 16:54:52 -04:00
Yaxun (Sam) Liu 049d860707 [CUDA][HIP] Fix constexpr variables for C++17
constexpr variables are compile time constants and implicitly const, therefore
they are safe to emit on both device and host side. Besides, in many cases
they are intended for both device and host, therefore it makes sense
to emit them on both device and host sides if necessary.

In most cases constexpr variables are used as rvalue and the variables
themselves do not need to be emitted. However if their address is taken,
then they need to be emitted.

For C++14, clang is able to handle that since clang emits them with
available_externally linkage together with the initializer.

However for C++17, the constexpr static data member of a class or template class
become inline variables implicitly. Therefore they become definitions with
linkonce_odr or weak_odr linkages. As such, they can not have available_externally
linkage.

This patch fixes that by adding implicit constant attribute to
file scope constexpr variables and constexpr static data members
in device compilation.

Differential Revision: https://reviews.llvm.org/D79237
2020-06-03 21:56:52 -04:00
Artem Belevich ef649e8fd5 Revert "[CUDA][HIP] Workaround for resolving host device function against wrong-sided function"
Still breaks CUDA compilation.

This reverts commit e03394c6a6.
2020-05-18 12:22:55 -07:00
Yaxun (Sam) Liu e03394c6a6 [CUDA][HIP] Workaround for resolving host device function against wrong-sided function
recommit c77a4078e0 with fix

https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit
host device functions.

For now, it seems the most feasible workaround is to treat implicit host device function and explicit host
device function differently. Basically in device compilation for implicit host device functions, keep the
old behavior, i.e. give host device candidates and wrong-sided candidates equal preference. For explicit
host device functions, favor host device candidates against wrong-sided candidates.

The rationale is that explicit host device functions are blessed by the user to be valid host device functions,
that is, they should not cause diagnostics in both host and device compilation. If diagnostics occur, user is
able to fix them. However, there is no guarantee that implicit host device function can be compiled in
device compilation, therefore we need to preserve its overloading resolution in device compilation.

Differential Revision: https://reviews.llvm.org/D79526
2020-05-12 08:27:50 -04:00
Yaxun (Sam) Liu d75a6e93ae [CUDA][HIP] Fix empty ctor/dtor check for union
union ctor does not call ctors of its data members. union dtor does not call dtors of its data members.
Also union does not have base class.

Currently when clang checks whether union has an empty ctor/dtor, it checks the ctors/dtors of its
data members. This causes incorrectly diagnose device side global variables and shared variables as
having non-empty ctors/dtors.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D79367
2020-05-04 21:52:04 -04:00
Yaxun (Sam) Liu b670ab7b6b recommit 1b978ddba0 [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese
Differential Revision: https://reviews.llvm.org/D70172
2020-03-23 12:09:07 -04:00
Reid Kleckner d7c5037e6b Prune TargetInfo.h include from ParsedAttr.h, NFC
Saves ~400 includes of related headers:

$ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \
    | grep '^[-+] ' | sort | uniq -c | sort -nr
    468 -    llvm-project/clang/include/clang/Basic/TargetInfo.h
    468 -    llvm-project/clang/include/clang/Basic/TargetCXXABI.h
    368 -    llvm-project/llvm/include/llvm/Support/CodeGen.h
    368 -    llvm-project/clang/include/clang/Basic/XRayInstr.h
    368 -    llvm-project/clang/include/clang/Basic/CodeGenOptions.h
    368 -    llvm-project/clang/include/clang/Basic/CodeGenOptions.def
    367 -    llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h
    367 -    llvm-project/clang/include/clang/Basic/DebugInfoOptions.h
2020-03-11 20:47:11 -07:00
Yaxun (Sam) Liu bcadb1f2e6 Revert "[CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese"
This reverts commit 1b978ddba0.
2020-02-18 14:45:34 -05:00
Yaxun (Sam) Liu 1b978ddba0 [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese
This patch removes the explicit call graph for CUDA/HIP/OpenMP deferred
diagnostics generated during parsing since it is error prone due to
incomplete information about function declarations during parsing. In stead,
this patch does a post-parsing AST traverse and emits deferred diagnostics
based on the use graph implicitly generated during the traverse.

Differential Revision: https://reviews.llvm.org/D70172
2020-02-16 22:44:33 -05:00
Yaxun (Sam) Liu 68f5ca4e19 [HIP] Add option -fgpu-allow-device-init
Add this option to allow device side class type global variables
with non-trivial ctor/dtor. device side init/fini functions will
be emitted, which will be executed by HIP runtime when
the fat binary is loaded/unloaded.

This feature is to facilitate implementation of device side
sanitizer which requires global vars with non-trival ctors.

By default this option is disabled.

Differential Revision: https://reviews.llvm.org/D69268
2019-10-22 16:06:20 -04:00
Yaxun Liu 229c78d3a5 [CUDA][HIP] Fix host/device check with -fopenmp
CUDA/HIP program may be compiled with -fopenmp. In this case, -fopenmp is only passed to host compilation
to take advantages of multi-threads computation.

CUDA/HIP and OpenMP both use Sema::DeviceCallGraph to store functions to be analyzed and remove them
once they decide the function is sure to be emitted. CUDA/HIP and OpenMP have different functions to determine
if a function is sure to be emitted.

To check host/device correctly for CUDA/HIP when -fopenmp is enabled, there needs a unified logic to determine
whether a function is to be emitted. The logic needs to be aware of both CUDA and OpenMP logic.

Differential Revision: https://reviews.llvm.org/D67837

llvm-svn: 374263
2019-10-09 23:54:10 +00:00
Yaxun Liu 1282889347 [HIP] Support new kernel launching API
Differential Revision: https://reviews.llvm.org/D67947

llvm-svn: 372773
2019-09-24 19:16:40 +00:00
Yaxun Liu 27a8039171 Revert assertion added by r372394
The assertion added by r372394 causes CUDA test in test-suite to assert.

The assertion was not there originally, so revert it.

llvm-svn: 372452
2019-09-21 02:51:44 +00:00
Yaxun Liu e5d17c511f [CUDA][HIP] Fix hostness of defaulted constructor
Clang does not respect the explicit device host attributes of defaulted special members.
Also clang does not respect the hostness of special members determined by their
first declarations.
Clang also adds duplicate implicit device or host attributes in certain cases.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D67509

llvm-svn: 372394
2019-09-20 14:28:09 +00:00
Richard Smith 255b85f03c Split ActOnCallExpr into an ActOnCallExpr to be called by the parser,
and a BuildCallExpr to be called internally within Sema to build /
rebuild calls.

llvm-svn: 360217
2019-05-08 01:36:36 +00:00
Yaxun Liu c5be267003 [CUDA][HIP][Sema] Fix template kernel with function as template parameter
If a kernel template has a function as its template parameter, a device function should be
allowed as template argument since a kernel can call a device function. However,
currently if the kernel template is instantiated in a host function, clang will emit an error
message saying the device function is an invalid candidate for the template parameter.

This happens because clang checks the reference to the device function during parsing
the template arguments. At this point, the template is not instantiated yet. Clang incorrectly
assumes the device function is called by the host function and emits the error message.

This patch fixes the issue by disabling checking of device function during parsing template
arguments and deferring the check to the instantion of the template. At that point, the
template decl is already available, therefore the check can be done against the instantiated
function template decl.

Differential Revision: https://reviews.llvm.org/D56411

llvm-svn: 355421
2019-03-05 18:19:35 +00:00
Alexey Bataev 8972133989 [SEMA]Generalize deferred diagnostic interface, NFC.
Summary:
Deferred diagnostic interface is going to be used for OpenMP device
compilation. Generalized previously existed deferred diagnostic
interface for CUDA to be used with OpenMP and, possibly, other models.

Reviewers: rjmccall, tra

Subscribers: caomhin, cfe-commits, kkwli0

Tags: #clang

Differential Revision: https://reviews.llvm.org/D57908

llvm-svn: 353456
2019-02-07 19:46:42 +00:00
Artem Belevich c62214da3d [CUDA] add support for the new kernel launch API in CUDA-9.2+.
Instead of calling CUDA runtime to arrange function arguments,
the new API constructs arguments in a local array and the kernels
are launched with __cudaLaunchKernel().

The old API has been deprecated and is expected to go away
in the next CUDA release.

Differential Revision: https://reviews.llvm.org/D57488

llvm-svn: 352799
2019-01-31 21:34:03 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Bruno Ricci 5fc4db7579 [AST][NFC] Pass the AST context to one of the ctor of DeclRefExpr.
All of the other constructors already take a reference to the AST context.
This avoids calling Decl::getASTContext in most cases. Additionally move
the definition of the constructor from Expr.h to Expr.cpp since it is calling
DeclRefExpr::computeDependence. NFC.

llvm-svn: 349901
2018-12-21 14:10:18 +00:00
Erich Keane e891aa971a [NFC] Rename clang::AttributeList to clang::ParsedAttr
Since The type no longer contains the 'next' item anymore, it isn't a list,
so rename it to ParsedAttr to be more accurate.

llvm-svn: 337005
2018-07-13 15:07:47 +00:00
Erich Keane c480f30580 AttributeList de-listifying:
Basically, "AttributeList" loses all list-like mechanisms, ParsedAttributes is
switched to use a TinyPtrVector (and a ParsedAttributesView is created to
have a non-allocating attributes list). DeclaratorChunk gets the later kind,
Declarator/DeclSpec keep ParsedAttributes.

Iterators are added to the ParsedAttribute types so that for-loops work.

llvm-svn: 336945
2018-07-12 21:09:05 +00:00
Artem Belevich e9fa53a09b [CUDA] Check initializers of instantiated template variables.
We were already performing checks on non-template variables,
but the checks on templated ones were missing.

Differential Revision: https://reviews.llvm.org/D45231

llvm-svn: 334143
2018-06-06 22:37:25 +00:00
Adrian Prantl 9fc8faf9e6 Remove \brief commands from doxygen comments.
This is similar to the LLVM change https://reviews.llvm.org/D46290.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46320

llvm-svn: 331834
2018-05-09 01:00:01 +00:00
Yaxun Liu 887c569bcb [HIP] Add hip input kind and codegen for kernel launching
HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md ).
The language syntax is very similar, which allows a hip program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a set of vendor neutral host API which can be implemented on different platforms. Currently there is open source
implementation of HIP runtime on amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).

This patch adds support of input kind and language standard hip.

When hip file is compiled, both LangOpts.CUDA and LangOpts.HIP is turned on. This allows compilation of hip program as CUDA
in most cases and only special handling of hip program is needed LangOpts.HIP is checked.

This patch also adds support of kernel launching of HIP program using HIP host API.

When -x hip is not specified, there is no behaviour change for CUDA.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44984

llvm-svn: 330790
2018-04-25 01:10:37 +00:00
Artem Belevich 67d22c8a84 Revert "[CUDA] Check initializers of instantiated template variables."
This (temporarily) reverts commit r329127 due to the problems
it exposed in TensorFlow.

llvm-svn: 329229
2018-04-04 20:48:42 +00:00
Artem Belevich d9189d1e76 [CUDA] Check initializers of instantiated template variables.
We were already performing checks on non-template variables,
but the checks on templated ones were missing.

Differential Revision: https://reviews.llvm.org/D45231

llvm-svn: 329127
2018-04-03 22:41:06 +00:00
Richard Trieu b402580616 Fix some handling of AST nodes with diagnostics.
The diagnostic system for Clang can already handle many AST nodes.  Instead
of converting them to strings first, just hand the AST node directly to
the diagnostic system and let it handle the output.  Minor changes in some
diagnostic output.

llvm-svn: 328688
2018-03-28 04:16:13 +00:00
Artem Belevich e2ae8b5510 [CUDA] Fixed false error reporting in case of calling H->G->HD->D.
Launching a kernel from the host code does not generate code for the
kernel itself. This fixes an issue with clang erroneously reporting
an error for a HD->D call from within the kernel.

Differential Revision: https://reviews.llvm.org/D44837

llvm-svn: 328362
2018-03-23 19:49:03 +00:00
Serge Pavlov 25dbe1a16e Function with unparsed body is a definition
While a function body is being parsed, the function declaration is not considered
as a definition because it does not have a body yet. In some cases it leads to
incorrect interpretation, the case is presented in
https://bugs.llvm.org/show_bug.cgi?id=14785:
```
    template<typename T> struct Somewhat {
      void internal() const {}
      friend void operator+(int const &, Somewhat<T> const &) {}
    };
void operator+(int const &, Somewhat<char> const &x) { x.internal(); }
```
When statement `x.internal()` in the body of global `operator+` is parsed, the type
of `x` must be completed, so the instantiation of `Somewhat<char>` is started. It
instantiates the declaration of `operator+` defined inline, and makes a check for
redefinition. The check does not detect another definition because the declaration
of `operator+` is still not defining as does not have a body yet.

To solves this problem the function `isThisDeclarationADefinition` considers
a function declaration as a definition if it has flag `WillHaveBody` set.

This change fixes PR14785.

Differential Revision: https://reviews.llvm.org/D30375

This is a recommit of 305379, reverted in 305381, with small changes.

llvm-svn: 305903
2017-06-21 12:46:57 +00:00
Serge Pavlov a4ab1b1c59 Reverted 305379 (Function with unparsed body is a definition)
It broke clang-x86_64-linux-selfhost-modules-2 and some other buildbots.

llvm-svn: 305381
2017-06-14 10:57:56 +00:00
Serge Pavlov c73c81be5c Function with unparsed body is a definition
While a function body is being parsed, the function declaration is not considered
as a definition because it does not have a body yet. In some cases it leads to
incorrect interpretation, the case is presented in
https://bugs.llvm.org/show_bug.cgi?id=14785:
```
    template<typename T> struct Somewhat {
      void internal() const {}
      friend void operator+(int const &, Somewhat<T> const &) {}
    };
void operator+(int const &, Somewhat<char> const &x) { x.internal(); }
```
When statement `x.internal()` in the body of global `operator+` is parsed, the type
of `x` must be completed, so the instantiation of `Somewhat<char>` is started. It
instantiates the declaration of `operator+` defined inline, and makes a check for
redefinition. The check does not detect another definition because the declaration
of `operator+` is still not defining as does not have a body yet.

To solves this problem the function `isThisDeclarationADefinition` considers
a function declaration as a definition if it has flag `WillHaveBody` set.

This change fixes PR14785.

Differential Revision: https://reviews.llvm.org/D30375

llvm-svn: 305379
2017-06-14 10:07:02 +00:00
Richard Smith 8bae1be0fa Factor out some common code between SpecialMemberExceptionSpecInfo and SpecialMemberDeletionInfo.
To simplify this, convert SpecialMemberOverloadResult to a value type.

llvm-svn: 296073
2017-02-24 02:07:20 +00:00
George Burgess IV 8684b0352a [Sema] Replace remove_if+erase with erase_if. NFC.
llvm-svn: 290991
2017-01-04 19:16:29 +00:00
Artem Belevich 64135c35f7 [CUDA] Ignore implicit target attributes during function template instantiation.
Some functions and templates are treated as __host__ __device__ even
when they don't have explicitly specified target attributes.
What's worse, this treatment may change depending on command line
options (-fno-cuda-host-device-constexpr) or
#pragma clang force_cuda_host_device.

Combined with strict checking for matching function target that comes
with D25809(r288962), it makes it hard to write code which would
explicitly instantiate or specialize some functions regardless of
pragmas or command line options in effect.

This patch changes the way we match target attributes of base template
vs attributes used in explicit instantiation or specialization so that
only explicitly specified attributes are considered. This makes base
template selection behave consistently regardless of pragma of command
line options that may affect CUDA target.

Differential Revision: https://reviews.llvm.org/D25845

llvm-svn: 289091
2016-12-08 19:38:13 +00:00