Defer constant checking of a dependent initializer to template instantiation,
since it cannot be done for dependent values.
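For illustration, a minimal hypothetical sketch (not from the patch's tests): the initializer below depends on a template parameter, so whether it is a valid constant initializer can only be decided once the template is instantiated.
```
// 'N' is a template parameter, so the initializer of 'table_size' is a
// dependent expression; its constant-ness is checked at instantiation.
template <int N>
__device__ int table_size = N * N;   // dependent initializer

// Instantiating the variable template makes the initializer concrete;
// the deferred check happens here.
__global__ void kernel(int *out) { *out = table_size<8>; }
```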
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D95840
Defaulted destructor was treated inconsistently, compared to other
compiler-generated functions.
When Sema::IdentifyCUDATarget() got called on a just-created dtor that didn't
have the implicit __host__ __device__ attributes applied yet, it would treat it as a
host function. That happened to (sometimes) hide the error when the dtor referred
to host-only functions.
Even when we had identified a defaulted dtor as an HD function, we still treated it
inconsistently during selection of usual deallocators: we did not allow it to refer
to wrong-side functions, even though that is allowed for other HD functions.
This change brings handling of defaulted dtors in line with other HD functions.
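For illustration, a rough hypothetical sketch of the kind of code affected: the defaulted destructor below is implicitly __host__ __device__, and its deleting variant must select the class's usual deallocation function, which here is host-only. With this change, that wrong-side reference is handled like any other HD reference, i.e. diagnosed only if the destructor is actually emitted for the device.
```
struct W {
  void operator delete(void *p);   // host-only usual deallocator
  virtual ~W() = default;          // implicitly __host__ __device__
};
```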
Differential Revision: https://reviews.llvm.org/D94732
This patch diagnoses invalid references to global host variables in device,
global, or host device functions.
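For illustration (hypothetical code, not from the patch's tests):
```
int host_counter;                 // global host variable

__device__ int read_counter() {
  return host_counter;            // now diagnosed: device code referencing a host global
}

__global__ void kernel(int *out) {
  *out = host_counter;            // likewise diagnosed in a __global__ function
}
```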
Differential Revision: https://reviews.llvm.org/D91281
In CUDA/HIP a function may become an implicit host device function through a
pragma or constexpr. A host device function is checked in both
host and device compilation. However, it may be emitted only
on the host or the device side, therefore the diagnostics should be
deferred until it is known to be emitted.
Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.
This patch lets clang defer all overloading resolution diagnostics for host device functions.
An option -fgpu-defer-diag is added to control this behavior. By default
it is off.
It is NFC for other languages.
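For illustration, a hedged hypothetical example of the kind of false alarm this addresses:
```
void host_only_log(const char *msg);   // host function with no device counterpart

// Implicitly __host__ __device__ because it is constexpr.
constexpr int checked_size(int n) {
  // In device compilation this call cannot be satisfied, but that is harmless
  // unless 'checked_size' is actually emitted for the device; with
  // -fgpu-defer-diag the diagnostic is deferred until that is known.
  if (n < 0)
    host_only_log("negative size");
  return n < 0 ? 0 : n;
}
```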
Differential Revision: https://reviews.llvm.org/D84364
When a device function calls a host function or vice versa, this is a wrong-sided
reference. Currently clang diagnoses it immediately. This is different from nvcc's
behavior, where it is diagnosed only if the calling function is actually emitted.
The current clang behavior causes false alarms for valid use cases.
This patch lets clang always defer diagnostics for wrong-sided
references.
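For illustration (a made-up helper, not from the patch):
```
__host__ void on_host();
__device__ void on_device();

__host__ __device__ void helper(bool device_side) {
  // 'on_device()' is a wrong-sided reference in host compilation and
  // 'on_host()' is a wrong-sided reference in device compilation.  Each is
  // now diagnosed only if 'helper' is actually emitted for that side.
  if (device_side)
    on_device();
  else
    on_host();
}
```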
Differential Revision: https://reviews.llvm.org/D83893
This patch lets lambdas be host device by default and adds diagnostics for
capturing a host variable by reference in a device lambda.
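For illustration (hypothetical code):
```
template <typename F>
__global__ void run(F f) { f(); }

void host_fn() {
  int host_value = 42;
  // Lambdas are now implicitly host device.  Using a by-reference capture of a
  // host variable in device code is diagnosed, since the referenced host stack
  // slot is not accessible from the device; capture by value is fine.
  run<<<1, 1>>>([&] { (void)host_value; });   // diagnosed
  run<<<1, 1>>>([=] { (void)host_value; });   // OK
}
```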
Differential Revision: https://reviews.llvm.org/D78655
This reverts commit 263390d4f5.
This can still cause bogus errors:
eigen3/Eigen/src/Core/CoreEvaluators.h:94:38: error: call to implicitly-deleted copy constructor of 'unary_evaluator<Eigen::Inverse<Eigen::Matrix<double, 4, 4, 0, 4, 4>>>'
thrust/system/detail/generic/for_each.h:49:3: error: implicit instantiation of undefined template
'thrust::detail::STATIC_ASSERTION_FAILURE<false>'
recommit e03394c6a6 with fix
When an implicit HD function calls a function in device compilation, and one
candidate is an implicit HD function, the current resolution rule is:
  D wins over HD and H
  HD and H are equal
This caused regressions when there is an otherwise worse D candidate.
This patch changes that so that
  D, HD and H are all equal
The rationale is that we already know from host compilation that there is
a valid candidate among the HD and H candidates that will not cause an error.
Allowing HD and H gives us a fallback candidate that will not cause an error. If D
wins, D has to be a better match in other respects, therefore D should also
be a valid candidate that will not cause an error. In this way, we can guarantee
no regression.
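For illustration, a made-up overload set showing the old regression:
```
__device__ double norm(double x);                          // D candidate, worse match for float
constexpr float norm(float x) { return x < 0 ? -x : x; }   // implicit HD candidate, exact match

// The caller is itself an implicit HD function (constexpr).
constexpr float caller(float v) {
  // Old rule in device compilation: the D candidate beats the HD one even
  // though it is an otherwise worse match.  New rule: D, HD and H rank
  // equally, so the exact-match HD candidate is chosen, as in host
  // compilation.
  return norm(v);
}
```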
Differential Revision: https://reviews.llvm.org/D80450
constexpr variables are compile-time constants and implicitly const, therefore
they are safe to emit on both the device and the host side. Besides, in many cases
they are intended for both device and host, therefore it makes sense
to emit them on both device and host sides if necessary.
In most cases constexpr variables are used as rvalues and the variables
themselves do not need to be emitted. However, if their address is taken,
then they need to be emitted.
For C++14, clang is able to handle that, since clang emits them with
available_externally linkage together with the initializer.
However, for C++17, constexpr static data members of a class or class template
implicitly become inline variables. Therefore they become definitions with
linkonce_odr or weak_odr linkage. As such, they cannot have available_externally
linkage.
This patch fixes that by adding an implicit constant attribute to
file-scope constexpr variables and constexpr static data members
in device compilation.
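For illustration (hypothetical code):
```
struct Config {
  // In C++17 this constexpr static data member is implicitly an inline
  // variable, i.e. a definition with linkonce_odr/weak_odr linkage, so it
  // cannot be emitted as available_externally for the device.
  static constexpr int kTileSize = 32;
};

__global__ void kernel(const int **out) {
  // Taking the address forces the variable itself to be emitted; the implicit
  // constant attribute makes that work in device compilation.
  *out = &Config::kTileSize;
}
```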
Differential Revision: https://reviews.llvm.org/D79237
recommit c77a4078e0 with fix
https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit
host device functions.
For now, it seems the most feasible workaround is to treat implicit and explicit host
device functions differently. Basically, in device compilation for implicit host device functions, keep the
old behavior, i.e. give host device candidates and wrong-sided candidates equal preference. For explicit
host device functions, favor host device candidates over wrong-sided candidates.
The rationale is that explicit host device functions are blessed by the user as valid host device functions,
that is, they should not cause diagnostics in either host or device compilation. If diagnostics occur, the user is
able to fix them. However, there is no guarantee that an implicit host device function can be compiled in
device compilation, therefore we need to preserve its overloading resolution in device compilation.
Differential Revision: https://reviews.llvm.org/D79526
A union ctor does not call the ctors of its data members, and a union dtor does not call the dtors of its data members.
A union also does not have base classes.
Currently, when clang checks whether a union has an empty ctor/dtor, it checks the ctors/dtors of its
data members. This causes device-side global variables and shared variables to be incorrectly diagnosed as
having non-empty ctors/dtors.
This patch fixes that.
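For illustration (hypothetical code):
```
struct NonTrivial {
  NonTrivial() {}    // non-trivial default ctor
  ~NonTrivial() {}   // non-trivial dtor
  int v;
};

union U {
  U() {}             // empty: constructs no variant member
  ~U() {}            // empty: destroys no variant member
  NonTrivial s;      // its ctor/dtor are never called by the union's
  int i;
};

// Previously these were incorrectly diagnosed as having non-empty ctors/dtors
// because the members' ctors/dtors were inspected.
__device__ U device_global;

__global__ void kernel() {
  __shared__ U shared_var;
  (void)shared_var;
}
```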
Differential Revision: https://reviews.llvm.org/D79367
This patch removes the explicit call graph for CUDA/HIP/OpenMP deferred
diagnostics generated during parsing since it is error prone due to
incomplete information about function declarations during parsing. Instead,
this patch does a post-parsing AST traversal and emits deferred diagnostics
based on the use graph implicitly generated during the traversal.
Differential Revision: https://reviews.llvm.org/D70172
Add this option to allow device-side class-type global variables
with non-trivial ctors/dtors. Device-side init/fini functions will
be emitted, and will be executed by the HIP runtime when
the fat binary is loaded/unloaded.
This feature is intended to facilitate implementation of a device-side
sanitizer, which requires global vars with non-trivial ctors.
By default this option is disabled.
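For illustration, a sketch of what the option permits (class and variable names are made up):
```
struct ShadowState {
  __device__ ShadowState() : base(nullptr), size(0) {}    // non-trivial ctor
  __device__ ~ShadowState() { base = nullptr; }            // non-trivial dtor
  char *base;
  unsigned long size;
};

// Allowed only when the option is enabled; device-side init/fini functions
// are emitted for it and run when the fat binary is loaded/unloaded.
__device__ ShadowState g_shadow;
```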
Differential Revision: https://reviews.llvm.org/D69268
A CUDA/HIP program may be compiled with -fopenmp. In this case, -fopenmp is only passed to the host compilation
to take advantage of multi-threaded computation.
CUDA/HIP and OpenMP both use Sema::DeviceCallGraph to store functions to be analyzed and remove them
once they decide a function is sure to be emitted. CUDA/HIP and OpenMP have different functions to determine
whether a function is sure to be emitted.
To check host/device correctness for CUDA/HIP when -fopenmp is enabled, there needs to be unified logic to determine
whether a function is to be emitted. The logic needs to be aware of both the CUDA and the OpenMP rules.
Differential Revision: https://reviews.llvm.org/D67837
llvm-svn: 374263
Clang does not respect the explicit device/host attributes of defaulted special members.
Also, clang does not respect the hostness of special members determined by their
first declarations.
Clang also adds duplicate implicit device or host attributes in certain cases.
This patch fixes these issues.
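For illustration, a hedged sketch of the kinds of declarations involved (the class is invented):
```
struct A {
  __device__ A() = default;   // explicit attribute on a defaulted ctor: must be respected
  A(const A &);               // hostness of the copy ctor is fixed by this first declaration
};

// The defaulted definition must keep the hostness decided above, and no
// duplicate implicit __host__/__device__ attributes should be added.
A::A(const A &) = default;
```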
Differential Revision: https://reviews.llvm.org/D67509
llvm-svn: 372394
If a kernel template has a function as its template parameter, a device function should be
allowed as the template argument, since a kernel can call a device function. However,
currently, if the kernel template is instantiated in a host function, clang will emit an error
message saying the device function is an invalid candidate for the template parameter.
This happens because clang checks the reference to the device function while parsing
the template arguments. At this point, the template is not instantiated yet. Clang incorrectly
assumes the device function is called by the host function and emits the error message.
This patch fixes the issue by disabling checking of the device function while parsing template
arguments and deferring the check to the instantiation of the template. At that point, the
template decl is already available, therefore the check can be done against the instantiated
function template decl.
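For illustration (hypothetical code):
```
__device__ int square(int x) { return x * x; }

template <int (*F)(int)>
__global__ void apply(int *out, int v) { *out = F(v); }

void host_caller(int *out) {
  // The reference to the __device__ function 'square' appears while the
  // template arguments are parsed in host code; the check is now deferred
  // until 'apply<square>' is instantiated, where the use is valid.
  apply<square><<<1, 1>>>(out, 3);
}
```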
Differential Revision: https://reviews.llvm.org/D56411
llvm-svn: 355421
Summary:
The deferred diagnostic interface is going to be used for OpenMP device
compilation. Generalized the previously existing deferred diagnostic
interface for CUDA to be used with OpenMP and, possibly, other models.
Reviewers: rjmccall, tra
Subscribers: caomhin, cfe-commits, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57908
llvm-svn: 353456
Instead of calling the CUDA runtime to arrange function arguments,
the new API constructs the arguments in a local array and the kernels
are launched with __cudaLaunchKernel().
The old API has been deprecated and is expected to go away
in the next CUDA release.
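For illustration, a rough hand-written sketch of the lowering (not clang's exact generated stub, shown here with the public cudaLaunchKernel entry point):
```
#include <cuda_runtime.h>

__global__ void axpy(float a, float *x, float *y);

void launch_axpy(float a, float *x, float *y, dim3 grid, dim3 block) {
  // Arguments are gathered into a local array of pointers and handed to the
  // runtime in a single launch call, instead of being pushed one by one
  // through the old per-argument setup calls.
  void *args[] = {&a, &x, &y};
  cudaLaunchKernel((const void *)axpy, grid, block, args,
                   /*sharedMem=*/0, /*stream=*/nullptr);
}
```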
Differential Revision: https://reviews.llvm.org/D57488
llvm-svn: 352799
Update the file headers to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
All of the other constructors already take a reference to the AST context.
This avoids calling Decl::getASTContext in most cases. Additionally, move
the definition of the constructor from Expr.h to Expr.cpp since it calls
DeclRefExpr::computeDependence. NFC.
llvm-svn: 349901
Basically, "AttributeList" loses all list-like mechanisms, ParsedAttributes is
switched to use a TinyPtrVector (and a ParsedAttributesView is created to
have a non-allocating attributes list). DeclaratorChunk gets the latter kind,
Declarator/DeclSpec keep ParsedAttributes.
Iterators are added to the ParsedAttribute types so that for-loops work.
llvm-svn: 336945
We were already performing checks on non-template variables,
but the checks on templated ones were missing.
Differential Revision: https://reviews.llvm.org/D45231
llvm-svn: 334143
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers in our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
```
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
```
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md).
The language syntax is very similar, which allows a HIP program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a set of vendor-neutral host APIs which can be implemented on different platforms. Currently there is an open source
implementation of the HIP runtime for the amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).
This patch adds support for the input kind and language standard hip.
When a hip file is compiled, both LangOpts.CUDA and LangOpts.HIP are turned on. This allows compilation of a HIP program as CUDA
in most cases; special handling of HIP programs is needed only where LangOpts.HIP is checked.
This patch also adds support for kernel launching in HIP programs using the HIP host API.
When -x hip is not specified, there is no behavior change for CUDA.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44984
llvm-svn: 330790
We were already performing checks on non-template variables,
but the checks on templated ones were missing.
Differential Revision: https://reviews.llvm.org/D45231
llvm-svn: 329127
The diagnostic system for Clang can already handle many AST nodes. Instead
of converting them to strings first, just hand the AST node directly to
the diagnostic system and let it handle the output. Minor changes in some
diagnostic output.
llvm-svn: 328688
Launching a kernel from host code does not generate code for the
kernel itself. This fixes an issue where clang erroneously reported
an error for an HD->D call from within the kernel.
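For illustration, a hedged sketch of the scenario (names invented):
```
__device__ int device_only(int x) { return x + 1; }

__host__ __device__ int helper(int x) {
  return device_only(x);   // HD->D call, legitimate when used in device code
}

__global__ void kernel(int *out) { *out = helper(1); }

void host_launch(int *out) {
  // The launch does not emit 'kernel' (or 'helper') for the host, so it must
  // not trigger an error for the HD->D call above.
  kernel<<<1, 1>>>(out);
}
```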
Differential Revision: https://reviews.llvm.org/D44837
llvm-svn: 328362
While a function body is being parsed, the function declaration is not considered
a definition because it does not have a body yet. In some cases this leads to
incorrect interpretation; such a case is presented in
https://bugs.llvm.org/show_bug.cgi?id=14785:
```
template<typename T> struct Somewhat {
  void internal() const {}
  friend void operator+(int const &, Somewhat<T> const &) {}
};
void operator+(int const &, Somewhat<char> const &x) { x.internal(); }
```
When statement `x.internal()` in the body of global `operator+` is parsed, the type
of `x` must be completed, so the instantiation of `Somewhat<char>` is started. It
instantiates the declaration of `operator+` defined inline, and makes a check for
redefinition. The check does not detect another definition because the declaration
of `operator+` is still not a definition, as it does not have a body yet.
To solve this problem, the function `isThisDeclarationADefinition` considers
a function declaration a definition if it has the flag `WillHaveBody` set.
This change fixes PR14785.
Differential Revision: https://reviews.llvm.org/D30375
This is a recommit of 305379, reverted in 305381, with small changes.
llvm-svn: 305903
While a function body is being parsed, the function declaration is not considered
a definition because it does not have a body yet. In some cases this leads to
incorrect interpretation; such a case is presented in
https://bugs.llvm.org/show_bug.cgi?id=14785:
```
template<typename T> struct Somewhat {
  void internal() const {}
  friend void operator+(int const &, Somewhat<T> const &) {}
};
void operator+(int const &, Somewhat<char> const &x) { x.internal(); }
```
When statement `x.internal()` in the body of global `operator+` is parsed, the type
of `x` must be completed, so the instantiation of `Somewhat<char>` is started. It
instantiates the declaration of `operator+` defined inline, and makes a check for
redefinition. The check does not detect another definition because the declaration
of `operator+` is still not a definition, as it does not have a body yet.
To solve this problem, the function `isThisDeclarationADefinition` considers
a function declaration a definition if it has the flag `WillHaveBody` set.
This change fixes PR14785.
Differential Revision: https://reviews.llvm.org/D30375
llvm-svn: 305379
Some functions and templates are treated as __host__ __device__ even
when they don't have explicitly specified target attributes.
What's worse, this treatment may change depending on command line
options (-fno-cuda-host-device-constexpr) or
#pragma clang force_cuda_host_device.
Combined with the strict checking for matching function targets that comes
with D25809 (r288962), this makes it hard to write code that explicitly
instantiates or specializes some functions regardless of the
pragmas or command line options in effect.
This patch changes the way we match target attributes of base template
vs attributes used in explicit instantiation or specialization so that
only explicitly specified attributes are considered. This makes base
template selection behave consistently regardless of pragmas or command
line options that may affect the CUDA target.
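For illustration, a hypothetical function template whose implicit __host__ __device__ treatment may depend on -fno-cuda-host-device-constexpr or #pragma clang force_cuda_host_device:
```
template <typename T>
constexpr T twice(T x) { return x + x; }   // may or may not be implicitly HD

// With this change, matching these against the base template considers only
// explicitly written target attributes, so the result does not depend on the
// pragma or command line options in effect.
template <>
constexpr int twice<int>(int x) { return 2 * x; }   // explicit specialization

template float twice<float>(float);                  // explicit instantiation
```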
Differential Revision: https://reviews.llvm.org/D25845
llvm-svn: 289091