We should (almost) never consider a device-side declaration to match a
library builtin function. Otherwise clang may ignore the implementation
provided by the CUDA headers and emit its own version of the builtin instead.
Differential Revision: https://reviews.llvm.org/D42319
llvm-svn: 323239