llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	acb6f80d96	[CUDA][HIP] Fix overloading resolution This patch implements correct hostness based overloading resolution in isBetterOverloadCandidate. Based on hostness, if one candidate is emittable whereas the other candidate is not emittable, the emittable candidate is better. If both candidates are emittable, or neither is emittable based on hostness, then other rules should be used to determine which is better. This is because hostness based overloading resolution is mostly for determining viability of a function. If two functions are both viable, other factors should take precedence in preference. If other rules cannot determine which is better, CUDA preference will be used again to determine which is better. However, correct hostness based overloading resolution requires overloading resolution diagnostics to be deferred, which is not on by default. The rationale is that deferring overloading resolution diagnostics may hide overloading reslolutions issues in header files. An option -fgpu-exclude-wrong-side-overloads is added, which is off by default. When -fgpu-exclude-wrong-side-overloads is off, keep the original behavior, that is, exclude wrong side overloads only if there are same side overloads. This may result in incorrect overloading resolution when there are no same side candates, but is sufficient for most CUDA/HIP applications. When -fgpu-exclude-wrong-side-overloads is on, enable deferring overloading resolution diagnostics and enable correct hostness based overloading resolution, i.e., always exclude wrong side overloads. Differential Revision: https://reviews.llvm.org/D80450	2020-12-02 16:33:33 -05:00
Yaxun (Sam) Liu	3f4b5893ef	[AMDGPU] Add option -munsafe-fp-atomics Add an option -munsafe-fp-atomics for AMDGPU target. When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics" to any functions for amdgpu target. This allows amdgpu backend to use unsafe fp atomic instructions in these functions. Differential Revision: https://reviews.llvm.org/D91546	2020-11-16 21:52:12 -05:00
Yaxun (Sam) Liu	e372c1d762	[HIP] Fix -fgpu-allow-device-init option The option needs to be passed to both host and device compilation. Differential Revision: https://reviews.llvm.org/D88550	2020-10-04 22:13:05 -04:00
Yaxun (Sam) Liu	2ae25647d1	[CUDA][HIP] Add -Xarch_device and -Xarch_host options The argument after -Xarch_device will be added to the arguments for CUDA/HIP device compilation and will be removed for host compilation. The argument after -Xarch_host will be added to the arguments for CUDA/HIP host compilation and will be removed for device compilation. Differential Revision: https://reviews.llvm.org/D76520	2020-03-24 10:13:05 -04:00
Yaxun (Sam) Liu	6f79f80e6e	[HIP] Fix duplicate clang -cc1 options on MSVC toolchain HIPToolChain::TranslateArgs call TranslateArgs of host toolchain with the input args to get a list of derived args called DAL, then go through the input args by itself and append them to DAL. This assumes that the host toolchain should not append any unchanged args to DAL, otherwise there will be duplicates since HIPToolChain will append it again. This works for GNU toolchain since it returns an empty list for DAL. However, MSVC toolchain will append unchanged args to DAL, which causes duplicate args. This patch let MSVC toolchain not append unchanged args for HIP offloading kind, which fixes this issue. Differential Revision: https://reviews.llvm.org/D76032	2020-03-18 14:48:04 -04:00
Yaxun (Sam) Liu	9f2d8b5c0c	[HIP] Add option --gpu-max-threads-per-block=n Add this option to change the default launch bounds. Differential Revision: https://reviews.llvm.org/D71221	2020-01-07 11:18:00 -05:00

6 Commits