llvm-project/clang/test/CodeGenCUDA
Justin Lebar b080b630b1 [CodeGen] [CUDA] Add the ability to set default attrs on functions in linked modules.
Summary:
Now when you ask clang to link in a bitcode module, you can tell it to
set attributes on that module's functions to match what we would have
set if we'd emitted those functions ourselves.

This is particularly important for fast-math attributes in CUDA
compilations.

Each CUDA compilation links in libdevice, a bitcode library provided by
NVIDIA as part of the CUDA distribution.  Without this patch, if a user
function F compiled with -ffast-math calls a function G from libdevice,
F will have the unsafe-fp-math=true (etc.) attributes, but G will have
no attributes.

Since F calls G, the inliner will merge G's attributes into F's.  It
considers the lack of an unsafe-fp-math=true attribute on G to be
tantamount to unsafe-fp-math=false, so it "merges" these by setting
unsafe-fp-math=false on F.

This then continues up the call graph until every function that
(transitively) calls something in libdevice gets unsafe-fp-math=false
set, thus disabling fast-math in almost all CUDA code.
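
The propagation described above can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not LLVM's
inliner or its API: the Fn struct and the helpers unsafeFPMath,
mergeOnInline, inlineAll and setDefaultAttrs are all hypothetical names
invented for illustration. The model assumes only what the summary
states: a missing unsafe-fp-math attribute is read as false, and
inlining leaves the caller with true only if the callee also has true.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model (NOT LLVM's API): a function has string attributes and callees.
struct Fn {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Fn*> callees;
};

// A missing "unsafe-fp-math" attribute is treated the same as "false".
bool unsafeFPMath(const Fn& f) {
    auto it = f.attrs.find("unsafe-fp-math");
    return it != f.attrs.end() && it->second == "true";
}

// Hypothetical stand-in for the inliner's attribute merge: the caller keeps
// unsafe-fp-math=true only if the callee has it as well.
void mergeOnInline(Fn& caller, const Fn& callee) {
    bool merged = unsafeFPMath(caller) && unsafeFPMath(callee);
    caller.attrs["unsafe-fp-math"] = merged ? "true" : "false";
}

// Inline bottom-up, so a downgrade in a callee reaches transitive callers.
void inlineAll(Fn& f) {
    for (Fn* callee : f.callees) {
        inlineAll(*callee);
        mergeOnInline(f, *callee);
    }
}

// Hypothetical stand-in for what the patch enables: stamp a linked-in
// function with the attribute the compilation would have set itself.
void setDefaultAttrs(Fn& f, bool fastMath) {
    f.attrs["unsafe-fp-math"] = fastMath ? "true" : "false";
}
```

In this model, a chain H -> F -> G where only F and H carry
unsafe-fp-math=true ends with both downgraded to false after
inlineAll(H). Calling setDefaultAttrs(G, true) before inlining keeps
the attribute true on both.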

Reviewers: echristo

Subscribers: hfinkel, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D28538

llvm-svn: 293097
2017-01-25 21:29:48 +00:00
Inputs Avoid depending on test inputs that aren't in Inputs 2016-05-20 00:38:25 +00:00
address-spaces.cu [CUDA] Restrict init of local __shared__ variables to empty constructors only. 2016-05-09 22:09:56 +00:00
alias.cu [CUDA] Don't generate aliases for static extern "C" functions. 2016-01-25 22:36:37 +00:00
convergent.cu [CUDA] Mark device functions as nounwind. 2016-10-04 23:41:49 +00:00
cuda-builtin-vars.cu [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h. 2016-10-08 22:16:08 +00:00
device-stub.cu [CUDA] Place GPU binary into .nv_fatbin section and align it by 8. 2016-08-12 18:44:01 +00:00
device-var-init.cu [CUDA] Mark device functions as nounwind. 2016-10-04 23:41:49 +00:00
device-vtable.cu [CUDA] Make vtable construction aware of host/device side of CUDA compilation. 2015-12-17 18:12:36 +00:00
filter-decl.cu [CUDA] Emit host-side 'shadows' for device-side global variables 2016-03-02 18:28:50 +00:00
flush-denormals.cu [CUDA] Fix flush-denormals.cu test so that it checks what it intends to CHECK. 2016-05-10 00:34:50 +00:00
fp-contract.cu [CUDA] Enable fusing FP ops (-ffp-contract=fast) for CUDA by default. 2016-05-19 18:44:45 +00:00
function-overload.cu [CUDA] Disallow overloading destructors. 2016-10-03 16:48:23 +00:00
kernel-args-alignment.cu [CUDA] Align kernel launch args correctly when the LLVM type's alignment is different from the clang type's alignment. 2016-07-27 22:36:21 +00:00
kernel-call.cu
launch-bounds.cu [CUDA] Improve target attribute checking for function templates. 2016-12-07 19:27:16 +00:00
link-device-bitcode.cu [CUDA] Add -disable-llvm-passes to CodeGenCUDA/link-device-bitcode.cu. NFC 2016-03-30 23:45:38 +00:00
llvm-used.cu
nothrow.cu [CUDA] Add missing ':' to noexcept.cu test. 2016-10-05 00:27:38 +00:00
printf-aggregate.cu [CUDA] Don't crash when trying to printf a non-scalar object. 2016-02-11 02:00:52 +00:00
printf.cu [CUDA] Generate CUDA's printf alloca in its function's entry block. 2016-01-28 23:58:28 +00:00
propagate-metadata.cu [CodeGen] [CUDA] Add the ability to set default attrs on functions in linked modules. 2017-01-25 21:29:48 +00:00
ptx-kernels.cu [CUDA] Give templated device functions internal linkage, templated kernels external linkage. 2016-06-30 18:41:33 +00:00