Commit Graph

2 Commits

Author SHA1 Message Date
Justin Lebar ddd97faeec [CUDA] Mark all CUDA device-side function defs, decls, and calls as convergent.
Summary:
This is important for e.g. the following case:

  void sync() { __syncthreads(); }
  void foo() {
    do_something();
    sync();
    do_something_else();
  }

Without this change, if the optimizer does not inline sync() (which it
won't because __syncthreads is also marked as noduplicate, for now
anyway), it is free to perform optimizations on sync() that it would not
be able to perform on __syncthreads(), because sync() is not marked as
convergent.

Similarly, we need a notion of convergent calls, since in the case when
we can't statically determine a call's target(s), we need to know
whether it's safe to perform optimizations around the call.

This change is conservative; the optimizer will remove these attrs where
it can, see r260318, r260319.
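
As an illustration only (not part of this change), the wrapper pattern above can be written out as a small complete program. The kernel name reverse_tile, the tile size, and the host driver are assumptions made for the sketch. The point is that every thread in the block has to reach sync() together, so the call must not be made control-dependent on additional values, which is what the convergent attribute expresses.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Thin wrapper around the barrier, as in the example above. With this
// change the wrapper itself is treated as convergent, not only the
// __syncthreads() call inside it.
__device__ void sync() { __syncthreads(); }

// Hypothetical kernel: reverses one block-sized tile through shared memory.
// Assumes it is launched with exactly 64 threads in a single block.
__global__ void reverse_tile(int *data) {
  __shared__ int tile[64];
  int i = threadIdx.x;
  tile[i] = data[i];
  // All threads must arrive here together. Sinking this call into a
  // branch, or otherwise changing which threads execute it, would be
  // an invalid transformation.
  sync();
  data[i] = tile[blockDim.x - 1 - i];
}

int main() {
  const int n = 64;
  int host[n];
  for (int i = 0; i < n; ++i) host[i] = i;

  int *dev = nullptr;
  cudaMalloc(&dev, n * sizeof(int));
  cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);
  reverse_tile<<<1, n>>>(dev);
  cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(dev);

  // Print the first and last elements of the reversed tile.
  printf("%d %d\n", host[0], host[n - 1]);
  return 0;
}
```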

Reviewers: majnemer

Subscribers: cfe-commits, jhen, echristo, tra

Differential Revision: http://reviews.llvm.org/D17056

llvm-svn: 261779
2016-02-24 21:55:11 +00:00
Artem Belevich 97c01c35f8 [CUDA] Do not allow dynamic initialization of global device side variables.
In general, CUDA does not allow dynamic initialization of
global device-side variables. One exception is that CUDA allows
records with empty constructors, as described in section E2.2.1 of
the CUDA 7.5 Programming Guide.

This patch applies initializer checks for all device-side variables.
Empty constructors are accepted, but no code is generated for them.
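
A minimal sketch of what the rule accepts and rejects, assuming the definition of an empty constructor given in the Programming Guide (no parameters, no initializer list, empty body). The type and variable names (Empty, NonEmpty, make_value, and so on) are hypothetical and chosen only for this illustration. The rejected declarations are left commented out so the file still compiles.

```cuda
// Record with an empty constructor: no parameters, no initializer list,
// empty body.
struct Empty {
  __device__ Empty() {}
};

// Record whose constructor has an initializer list, so it is not empty.
struct NonEmpty {
  int x;
  __device__ NonEmpty() : x(42) {}
};

__device__ int make_value() { return 7; }

// Accepted: constant initializer, no code has to run at startup.
__device__ int constant_init = 1;

// Accepted: empty constructor, and no code is generated for it.
__device__ Empty empty_obj;

// Rejected: would require running a constructor on the device at startup.
// __device__ NonEmpty non_empty_obj;

// Rejected: the initializer is a function call, not a constant.
// __device__ int dynamic_init = make_value();

// Values that need runtime computation can be assigned from a kernel instead.
__global__ void init_kernel(int *out) {
  *out = constant_init + make_value();
}
```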

Differential Revision: http://reviews.llvm.org/D15305

llvm-svn: 259592
2016-02-02 22:29:48 +00:00