llvm-project

Commit Graph

Author	SHA1	Message	Date
Artem Belevich	0060fffc82	[CUDA] Bump default GPU architecture to sm_35. It's the oldest GPU architecture currently supported by all CUDA versions clang can use. Differential Revision: https://reviews.llvm.org/D108235	2021-08-23 13:24:45 -07:00
Aaron Ballman	530ea28fef	Correct a lot of diagnostic wordings for the driver Clang diagnostics should not start with a capital letter or use trailing punctuation (https://clang.llvm.org/docs/InternalsManual.html#the-format-string), but quite a few driver diagnostics were not following this advice. This corrects the grammar and punctuation to improve consistency, but does not change the circumstances under which the diagnostics are produced.	2021-08-05 07:04:55 -04:00
Teresa Johnson	d0ee8b64ec	[LTO] Fix -fwhole-program-vtables handling after HIP ThinLTO patch A recent change (D99683) to support ThinLTO for HIP caused a regression when compiling cuda code with -flto=thin -fwhole-program-vtables. Specifically, we now get an error: error: invalid argument '-fwhole-program-vtables' only allowed with '-flto' This error is coming from the device offload cc1 action being set up for the cuda compile, for which -flto=thin doesn't apply and gets dropped. This is a regression, but points to a potential issue that was silently occurring before the patch, details below. Before D99683, the check for fwhole-program-vtables in the driver looked like: if (WholeProgramVTables) { if (!D.isUsingLTO()) D.Diag(diag::err_drv_argument_only_allowed_with) << "-fwhole-program-vtables" << "-flto"; CmdArgs.push_back("-fwhole-program-vtables"); } And D.isUsingLTO() returned true since we have -flto=thin. However, because the cuda cc1 compile is doing device offloading, which didn't support any LTO, there was other code that suppressed -flto* options from being passed to the cc1 invocation. So the cc1 invocation silently had -fwhole-program-vtables without any -flto. This seems potentially problematic, since if we had any virtual calls we would get type test assume sequences without the corresponding LTO pass that handles them. However, with the patch, which adds support for device offloading LTO option -foffload-lto=thin, the code has changed so that we set a bool IsUsingLTO based on either -flto or -foffload-lto, depending on whether this is the device offloading action. For the device offload action in our compile, since we don't have -foffload-lto, IsUsingLTO is false, and the check for LTO with -fwhole-program-vtables now fails. What we should do is only pass through -fwhole-program-vtables to the cc1 invocation that has LTO enabled (either the device offload action with -foffload-lto, or the non-device offload action with -flto), and otherwise drop the -fwhole-program-vtables for the non-LTO action. Then we should error only if we have -fwhole-program-vtables without any -flto* options. Differential Revision: https://reviews.llvm.org/D103579	2021-06-03 14:25:03 -07:00
Reid Kleckner	549ed544c3	[Driver] Move the "-o OUT -x TYPE SRC.c" flags to the end of -cc1 New -cc1 arguments, such as -faddrsig, have started appearing after the input name. I personally find it convenient for the input to be the last argument to the compile command line, since I often need to edit it when running crash reproduction scripts. Differential Revision: https://reviews.llvm.org/D62270 llvm-svn: 361530	2019-05-23 18:35:43 +00:00
Petr Hosek	7b27454477	[ADT] Normalize empty triple components LLVM triple normalization is handling "unknown" and empty components differently; for example given "x86_64-unknown-linux-gnu" and "x86_64-linux-gnu" which should be equivalent, triple normalization returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's config.sub returns "x86_64-unknown-linux-gnu" for both "x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the triple normalization to behave the same way, replacing empty triple components with "unknown". This addresses PR37129. Differential Revision: https://reviews.llvm.org/D50219 llvm-svn: 339294	2018-08-08 22:23:57 +00:00
Artem Belevich	dde3dc27ee	[CUDA] Added --[no-]cuda-include-ptx=sm_XX|all option. Currently we always include PTX into the fatbin along with the GPU code. It about doubles the size of the GPU binary we need to carry in the executable. These options allow controlling the inclusion of PTX into the GPU binary. This patch does not change the defaults, though we may consider making no-PTX the default in the future. Differential Revision: https://reviews.llvm.org/D45495 llvm-svn: 329737	2018-04-10 18:38:22 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Jonas Hahnfeld	e768132f94	[CUDA] Include single GPU binary, NFCI. Binaries for multiple architectures are combined by fatbinary, so the current code was effectively not needed. Differential Revision: https://reviews.llvm.org/D43461 llvm-svn: 326342	2018-02-28 17:53:46 +00:00
Artem Belevich	a424e88eab	[CUDA,Driver] Added --no-cuda-gpu-arch= option. This allows us to negate a preceding --cuda-gpu-arch=X. This comes in handy when a user needs to override default flags set for them by the build system. Differential Revision: https://reviews.llvm.org/D27631 llvm-svn: 289287	2016-12-09 22:59:17 +00:00
Justin Lebar	dc3c50434e	[CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only. Summary: This completes the flag's tristate, letting you override it at will on the command line. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: http://reviews.llvm.org/D19248 llvm-svn: 266707	2016-04-19 02:27:07 +00:00
Paul Robinson	4abe94fb6b	Get rid of another SAME-NOT. FileCheck does not have this suffix. Differential Revision: http://reviews.llvm.org/D17062 llvm-svn: 260348	2016-02-10 02:08:24 +00:00
Justin Lebar	21e5d4fcfa	[CUDA] Invoke ptxas and fatbinary during compilation. Summary: Previously we compiled CUDA device code to PTX assembly and embedded that asm as text in our host binary. Now we compile to PTX assembly and then invoke ptxas to assemble the PTX into a cubin file. We gather the ptx and cubin files for each of our --cuda-gpu-archs and combine them using fatbinary, and then embed that into the host binary. Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary, which pass args down to the external tools. Reviewers: tra, echristo Subscribers: cfe-commits, jhen Differential Revision: http://reviews.llvm.org/D16082 llvm-svn: 257809	2016-01-14 21:41:27 +00:00
Justin Lebar	388579fab7	[CUDA] Rename check-prefixes in cuda-options.cu and cuda-unused-arg-warning.cu. Summary: Rename the args to be more human-readable. Among other things, this lets us get rid of a bunch of comments (e.g. "ensure we don't run the linker"), greatly shortening these tests. Also apply consistent formatting and fix some English nits while we're at it. Reviewers: tra Differential Revision: http://reviews.llvm.org/D15975 llvm-svn: 257557	2016-01-13 01:24:35 +00:00
Justin Lebar	652b97bcf7	[CUDA] Split out tests for unused-arg warnings from cuda-options.cu. Summary: Trying to make this test a bit more manageable. Reviewers: tra Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D15974 llvm-svn: 257142	2016-01-08 03:33:04 +00:00
Artem Belevich	5e2a3ecd48	[CUDA] use -aux-triple to pass target triple of opposite side of compilation Clang needs to know target triple for both sides of compilation so that preprocessor macros and target builtins from both sides are available. This change augments Compilation class to carry information about toolchains used during different CUDA compilation passes and refactors BuildActions to use it when it constructs CUDA jobs. Removed DeviceTriple from CudaHostAction/CudaDeviceAction as it's no longer needed. Differential Revision: http://reviews.llvm.org/D13144 llvm-svn: 253385	2015-11-17 22:28:40 +00:00
Artem Belevich	2325675143	[CUDA] Fixes minor cuda-related issues in the driver * Only the last of the --cuda-host-only/--cuda-device-only options has effect. * CudaHostAction always wraps host-side compilation now. * Fixed printing of empty action lists. Differential Revision: http://reviews.llvm.org/D12892 llvm-svn: 248297	2015-09-22 17:23:09 +00:00
Artem Belevich	f8144ab44f	[CUDA] Improve CUDA compilation pipeline creation. Current implementation tries to guess which Action will result in a job which needs to incorporate device-side GPU binaries. The guessing was attempting to work around the fact that multiple actions may be combined into a single compiler invocation. If CudaHostAction ends up being combined (and thus bypassed during action list traversal) no device-side actions it pointed to were processed. The guessing worked for most of the usual cases, but fell apart when external assembler was used. This change removes the guessing and makes sure we create and pass device-side jobs regardless of how the jobs get combined. * CudaHostAction is always inserted either at Compile phase or the FinalPhase of current compilation, whichever happens first. * If selectToolForJob combines CudaHostAction with other actions, it passes info about CudaHostAction up to the caller * When it sees that CudaHostAction got combined with other actions (and hence will never be passed to BuildJobsForActions), BuildJobsForActions creates device-side jobs the same way they would be created if CudaHostAction was passed to BuildJobsForActions directly. * Added two more test cases to make sure GPU binaries are passed to correct jobs. Differential Revision: http://reviews.llvm.org/D11280 llvm-svn: 246174	2015-08-27 18:10:41 +00:00
Artem Belevich	baae093e49	Silence unused argument warning for --cuda-host-only. Differential Revision: http://reviews.llvm.org/D11575 llvm-svn: 243479	2015-07-28 21:01:30 +00:00
Artem Belevich	4242f41d8a	--cuda-host-only should not disable linking phase. Host-only cuda compilation does produce valid host object file and in some cases users do want to proceed on to the linking phase. The change removes special case that stopped compilation pipeline at the Assembly phase. Device-side compilation is still stopped early by the types::getCompilationPhases(). Differential Revision: http://reviews.llvm.org/D11573 llvm-svn: 243478	2015-07-28 21:01:21 +00:00
Artem Belevich	df7cd313d9	Fixed an error in cuda-options.cu test: -target option must be used without '='. llvm-svn: 242422	2015-07-16 17:24:18 +00:00
Artem Belevich	b73313de20	Run cuda options test only with specific target. For now it's only x86_64-linux-gnu. llvm-svn: 242181	2015-07-14 18:49:17 +00:00
Yaron Keren	4ca1903696	Fix test for Visual C++ link.exe. llvm-svn: 242125	2015-07-14 06:01:14 +00:00
Artem Belevich	0ff05cd165	[cuda] Driver changes to compile and stitch together host and device-side CUDA code. NOTE: reverts r242077 to reinstate r242058, r242065, 242067 and includes fix for OS X test failures. - Changed driver pipeline to compile host and device side of CUDA files and incorporate results of device-side compilation into host object file. - Added a test for cuda pipeline creation in clang driver. New clang options: --cuda-host-only - Do host-side compilation only. --cuda-device-only - Do device-side compilation only. --cuda-gpu-arch=<ARCH> - specify GPU architecture for device-side compilation. E.g. sm_35, sm_30. Default is sm_20. May be used more than once in which case one device-compilation will be done per unique specified GPU architecture. Differential Revision: http://reviews.llvm.org/D9509 llvm-svn: 242085	2015-07-13 23:27:56 +00:00
Rafael Espindola	abbd6d6824	This reverts commit r242058, r242065, r242067. The tests were failing on OS X. Revert "[cuda] Driver changes to compile and stitch together host and device-side CUDA code." Revert "Fixed regex to properly match '64' in the test case." Revert "clang/test/Driver/cuda-options.cu REQUIRES clang-driver, at least." llvm-svn: 242077	2015-07-13 22:26:30 +00:00
NAKAMURA Takumi	7227a88f23	clang/test/Driver/cuda-options.cu REQUIRES clang-driver, at least. llvm-svn: 242067	2015-07-13 21:18:53 +00:00
Artem Belevich	e9a400e065	Fixed regex to properly match '64' in the test case. llvm-svn: 242065	2015-07-13 20:49:50 +00:00
Artem Belevich	cd42e7f77a	[cuda] Driver changes to compile and stitch together host and device-side CUDA code. - Changed driver pipeline to compile host and device side of CUDA files and incorporate results of device-side compilation into host object file. - Added a test for cuda pipeline creation in clang driver. New clang options: --cuda-host-only - Do host-side compilation only. --cuda-device-only - Do device-side compilation only. --cuda-gpu-arch=<ARCH> - specify GPU architecture for device-side compilation. E.g. sm_35, sm_30. Default is sm_20. May be used more than once in which case one device-compilation will be done per unique specified GPU architecture. Differential Revision: http://reviews.llvm.org/D9509 llvm-svn: 242058	2015-07-13 20:21:06 +00:00
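
The option handling described in commits cd42e7f77a, a424e88eab and 0060fffc82 (one device compilation per unique --cuda-gpu-arch=, negation via --no-cuda-gpu-arch=, and a default of sm_35) can be sketched roughly as follows. This is a minimal illustration, not the clang driver code: the function name selectGpuArchs is hypothetical, and falling back to the default when the resulting list is empty is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of clang. Walks the arguments in order and
// returns the unique GPU architectures that remain selected.
std::vector<std::string> selectGpuArchs(const std::vector<std::string> &args,
                                        const std::string &defaultArch = "sm_35") {
  assert(!defaultArch.empty() && "a fallback architecture is required");
  const std::string addPrefix = "--cuda-gpu-arch=";
  const std::string negatePrefix = "--no-cuda-gpu-arch=";

  std::vector<std::string> archs;
  for (const std::string &arg : args) {
    if (arg.compare(0, addPrefix.size(), addPrefix) == 0) {
      // Keep each architecture once, in first-seen order.
      std::string arch = arg.substr(addPrefix.size());
      if (std::find(archs.begin(), archs.end(), arch) == archs.end())
        archs.push_back(arch);
    } else if (arg.compare(0, negatePrefix.size(), negatePrefix) == 0) {
      // Negate a preceding --cuda-gpu-arch=X.
      std::string arch = arg.substr(negatePrefix.size());
      archs.erase(std::remove(archs.begin(), archs.end(), arch), archs.end());
    }
  }
  // Assumption: with nothing selected, fall back to the default architecture.
  if (archs.empty())
    archs.push_back(defaultArch);
  return archs;
}
```

Because arguments are processed in order, a --no-cuda-gpu-arch=X placed after the build system's flags removes X, while a later --cuda-gpu-arch=X would add it back.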

27 Commits
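
The effect of commit 7b27454477 (replacing empty triple components with "unknown") can be illustrated with a small standalone sketch. It covers only that final substitution step and is not llvm::Triple::normalize: the function name fillEmptyTripleComponents is hypothetical, and it does not reorder or insert missing components.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the LLVM implementation. Splits a triple on '-'
// and replaces every empty component with "unknown".
std::string fillEmptyTripleComponents(const std::string &triple) {
  std::string result;
  bool first = true;
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type dash = triple.find('-', start);
    std::string component =
        dash == std::string::npos ? triple.substr(start)
                                  : triple.substr(start, dash - start);
    if (component.empty())
      component = "unknown";
    if (!first)
      result += '-';
    result += component;
    first = false;
    if (dash == std::string::npos)
      break;
    start = dash + 1;
  }
  // Every component is non-empty by construction, so the result is too.
  assert(!result.empty());
  return result;
}
```

With this sketch, "x86_64--linux-gnu" becomes "x86_64-unknown-linux-gnu", while "x86_64-unknown-linux-gnu" is returned unchanged, which matches the equivalence the commit message describes.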