llvm-project

Commit Graph

Author	SHA1	Message	Date
Jon Chesterfield	fc88d927e3	[clang][amdgpu] Use implicit code object version At present, clang always passes amdhsa-code-object-version on to -cc1. That is great for certainty over what object version is being used when debugging. Unfortunately, the command line argument is in AMDGPUBaseInfo.cpp in the amdgpu target. If clang is used with an llvm compiled with DLLVM_TARGETS_TO_BUILD that excludes amdgpu, this will be diagnosed (as discovered via D98658): - Unknown command line argument '--amdhsa-code-object-version=4' This means that clang, built only for X86, can be used to compile the nvptx devicertl for openmp but not the amdgpu one. That would shortly spawn fragile logic in the devicertl cmake to try to guess whether the clang used will work. This change omits the amdhsa-code-object-version parameter when it matches the default that AMDGPUBaseInfo.cpp specifies, with a comment to indicate why. As this is the only part of clang's codegen for amdgpu that depends on the target in the back end it suffices to build the openmp runtime on most (all?) systems. It is a non-functional change, though observable in the updated tests and when compiling with -###. It may cause minor disruption to the amd-stg-open branch. Revision of D98746, builds on refactor in D101077 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D101095	2021-04-23 23:52:50 +01:00
Yaxun (Sam) Liu	4fd05e0ad7	[HIP] Change to code object v4 Change to code object v4 by default to match ROCm 4.1. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D99235	2021-04-06 20:22:58 -04:00
Jon Chesterfield	c0619d3b21	[NFC] Use regex for code object version in hip tests Extracted from D93258. Makes tests robust to changes in default code object version. Reviewed By: t-tye Differential Revision: https://reviews.llvm.org/D93398	2020-12-16 17:00:19 +00:00
Yaxun (Sam) Liu	0b81d9a992	[AMDGPU] add -mcode-object-version=n Add option -mcode-object-version=n to control code object version for AMDGPU. Differential Revision: https://reviews.llvm.org/D91310	2020-12-07 18:08:37 -05:00
Yaxun (Sam) Liu	4bed1d9b32	[HIP] fix bundle entry ID for -- Canonicalize triple used in fat binary. Change from amdgcn-amd-amdhsa to amdgcn-amd-amdhsa-. This is part of https://reviews.llvm.org/D60620	2020-12-07 18:08:37 -05:00
Yaxun (Sam) Liu	dc6a0b0ec7	[HIP] Align device binary To facilitate faster loading of device binaries and share them among processes, HIP runtime favors their alignment being 4096 bytes. HIP runtime can load unaligned device binaries, however, aligning them at 4096 bytes results in faster loading and less shared memory usage. This patch adds an option -bundle-align to clang-offload-bundler which allows bundles to be aligned at specified alignment. By default it is 1, which is NFC compared to existing format. This patch then aligns embedded fat binary and device binary inside fat binary at 4096 bytes. It has been verified this change does not cause significant overall file size increase for typical HIP applications (less than 1%). Differential Revision: https://reviews.llvm.org/D88734	2020-10-02 18:10:44 -04:00
Yaxun (Sam) Liu	c830d517b4	[HIP] Enable -amdgpu-internalize-symbols Enable -amdgpu-internalize-symbols to eliminate unused functions and global variables for whole program to speed up compilation and improve performance. For -fno-gpu-rdc, -amdgpu-internalize-symbols is passed to clang -cc1. For -fgpu-rdc, -amdgpu-internalize-symbols is passed to lld. Differential Revision: https://reviews.llvm.org/D81959	2020-06-18 16:34:37 -04:00
Hiroshi Yamauchi	ce82b8e8af	[HIP] Improve check patterns to avoid test failures in case string "opt", etc. happens to be in the InstallDir path in the -### output. Differential Revision: https://reviews.llvm.org/D82046	2020-06-18 10:14:31 -07:00
Yaxun (Sam) Liu	6752786d65	[HIP] Do not use llvm-link/opt/llc for -fgpu-rdc This patch is a follow up on https://reviews.llvm.org/D81627. In addition to the default -fno-gpu-rdc case, this patch lets the HIP toolchain not use llvm-link/opt/llc to link device code for the -fgpu-rdc case. Instead, it uses standard LTO. This will eliminate some redundant optimizations and speed up the compilation/linking. Differential Revision: https://reviews.llvm.org/D81861	2020-06-15 21:09:18 -04:00
Yaxun (Sam) Liu	e8090d83fd	[HIP] Do not call opt/llc for -fno-gpu-rdc Currently HIP toolchain calls clang to emit bitcode then calls opt/llc for device compilation for the default -fno-gpu-rdc case, which is unnecessary since clang is able to compile a single source file to ISA. This patch fixes the HIP action builder and toolchain so that the default -fno-gpu-rdc can be done like a canonical toolchain, i.e. one clang -cc1 invocation to compile source code to ISA. This can avoid unnecessary processes to speed up the compilation, and avoid redundant LLVM passes which are performed in clang -cc1 and opt. Differential Revision: https://reviews.llvm.org/D81627	2020-06-15 18:55:01 -04:00
Michael Liao	c97be2c377	[hip] Remove `hip_pinned_shadow`. Summary: - Use `device_builtin_surface` and `device_builtin_texture` for surface/texture reference support. So far, both the host and device use the same reference type, which could be revised later when interface/implementation is stabilized. Reviewers: yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77583	2020-04-07 09:51:49 -04:00
Alexandre Ganea	5896e2df45	[Clang] Fix HIP tests when running on Windows with the LLVM toolchain is in the path On Windows, when the LLVM toolchain is in the current path (%PATH%), fusing the linker yields c:\{path}\lld.exe whereas the hip tests did not expect the .exe part. This only happens if LLD is not present in the build folder, which consequently makes the code in Driver::GetProgramPath() to fall back to %PATH% and platform-specific search, which includes the .exe part (see https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/Driver.cpp#L4733). Differential Revision: https://reviews.llvm.org/D76631	2020-03-23 16:54:48 -04:00
Scott Linder	d96ea47c75	[AMDGPU][HIP] Improve opt-level handling Summary: The HIP toolchain invokes `llc` without an explicit opt-level, meaning it always uses the default (-O2). This makes it impossible to use -O1, for example. The HIP toolchain also coerces -Os/-Oz to -O2 even when invoking opt, and it coerces -Og to -O2 rather than -O1. Forward the opt-level to `llc` as well as `opt`, and only coerce levels where it is required. Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D70987	2019-12-05 11:27:12 -05:00
Michael Liao	5a48678a6a	[hip] Allow the declaration of functions with variadic arguments in HIP. Summary: - As variadic parameters have the lowest rank in overload resolution, without real usage of `va_arg`, they are commonly used as the catch-all fallbacks in SFINAE. As the front-end still reports errors on calls to `va_arg`, the declaration of functions with variadic arguments should be allowed in general. Reviewers: jlebar, tra, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69389	2019-10-25 00:39:24 -04:00
Yaxun Liu	c3dfe9082b	[HIP] Support attribute hip_pinned_shadow This patch introduces support of hip_pinned_shadow variable for HIP. A hip_pinned_shadow variable is a global variable with attribute hip_pinned_shadow. It has external linkage on device side and has no initializer. It has internal linkage on host side and has initializer or static constructor. It can be accessed in both device code and host code. This allows HIP runtime to implement support of HIP texture reference. Differential Revision: https://reviews.llvm.org/D62738 llvm-svn: 364381	2019-06-26 03:47:37 +00:00
Reid Kleckner	549ed544c3	[Driver] Move the "-o OUT -x TYPE SRC.c" flags to the end of -cc1 New -cc1 arguments, such as -faddrsig, have started appearing after the input name. I personally find it convenient for the input to be the last argument to the compile command line, since I often need to edit it when running crash reproduction scripts. Differential Revision: https://reviews.llvm.org/D62270 llvm-svn: 361530	2019-05-23 18:35:43 +00:00
Yaxun Liu	7bd8c37b17	[HIP] Use -mlink-builtin-bitcode to link device library Use -mlink-builtin-bitcode instead of llvm-link to link device library so that device library bitcode and user device code can be compiled in a consistent way. This is the same approach used by CUDA and OpenMP. Differential Revision: https://reviews.llvm.org/D60513 llvm-svn: 358290	2019-04-12 16:23:31 +00:00
Douglas Yung	607a1b2234	Relax restriction in tests to where "-emit-llvm-bc" and "-emit-obj" must appear. The CHECK lines as structured were requiring them to appear only in a certain position while all that is really needed is to check that they are present. llvm-svn: 354001	2019-02-14 01:11:32 +00:00
Aaron Enye Shi	a1adb80ae7	[HIP] Fix hip-toolchain-rdc tests Since we changed the way the HIP toolchain propagates -m options into LLC, we need to remove them from these older tests. This is related to rC353880. Differential Revision: https://reviews.llvm.org/D57977 llvm-svn: 353885	2019-02-12 22:01:19 +00:00
Scott Linder	bef2663751	Add -fapply-global-visibility-to-externs for -cc1 Introduce an option to request global visibility settings be applied to declarations without a definition or an explicit visibility, rather than the existing behavior of giving these default visibility. When the visibility of all or most extern definitions are known this allows for the same optimisations -fvisibility permits without updating source code to annotate all declarations. Differential Revision: https://reviews.llvm.org/D56868 llvm-svn: 352391	2019-01-28 17:12:19 +00:00
Yaxun Liu	9b6d9f2a62	Disable code object version 3 for HIP toolchain AMDGPU backend will switch to code object version 3 by default. Since HIP runtime is not ready, disable it until the runtime is ready. Differential Revision: https://reviews.llvm.org/D53325 llvm-svn: 344630	2018-10-16 17:36:23 +00:00
Yaxun Liu	9767089d00	[HIP] Support early finalization of device code for -fno-gpu-rdc This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original options as aliases. When -fgpu-rdc is off, clang will assume the device code in each translation unit does not call external functions except those in the device library, therefore it is possible to compile the device code in each translation unit to self-contained kernels and embed them in the host object, so that the host object behaves like a usual host object which can be linked by lld. The benefits of this feature are: 1. allow users to create static libraries which can be linked by the host linker; 2. amortized device code linking time. This patch modifies the HIP action builder to insert actions for linking device code and generating HIP fatbin, and to pass the HIP fatbin to the host backend action. It extracts the code for constructing the command for generating HIP fatbin as a function so that it can be reused by early finalization. It also modifies codegen of HIP host constructor functions to embed the device fatbin when it is available. Differential Revision: https://reviews.llvm.org/D52377 llvm-svn: 343611	2018-10-02 17:48:54 +00:00

22 Commits