Commit Graph

38 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu a6786cdd57 [HIPSPV][3/4] Enable SPIR-V emission for HIP
This patch enables SPIR-V binary emission for HIP device code via the
HIPSPV tool chain.

‘--offload’ option, which is envisioned in [1], is added for specifying
offload targets. This option is used to override default device target
(amdgcn-amd-amdhsa) for HIP compilation for emitting device code as
SPIR-V binary. The option is handled in getHIPOffloadTargetTriple().

getOffloadingDeviceToolChain() function (based on the design in the
SYCL repository) is added to select HIPSPVToolChain when HIP offload
target is ‘spirv64’.

The HIPActionBuilder is modified to produce LLVM IR at the backend
phase. HIPSPV tool chain expects to receive HIP device code as LLVM
IR so it can run external LLVM passes over them. HIPSPV TC is also
responsible for emitting the SPIR-V binary.

A Cuda GPU architecture ‘generic’ is added. The name is picked from
the LLVM SPIR-V Backend. In the HIPSPV code path the architecture
name is inserted to the bundle entry ID as target ID. Target ID is
expected to be always present so a component in the target triple
is not mistaken as target ID.

Tests are added for checking the HIPSPV tool chain.

[1]: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html

Patch by: Henry Linjamäki

Reviewed by: Yaxun Liu, Artem Belevich, Alexey Bader

Differential Revision: https://reviews.llvm.org/D110622
2021-12-20 10:45:09 -05:00
Carlos Galvez 7ecec3f0f5 [CUDA] Bump supported CUDA version to 11.5
Differential Revision: https://reviews.llvm.org/D113249
2021-11-09 08:20:53 +00:00
Artem Belevich 3db8e486e5 [CUDA] Improve CUDA version detection and diagnostics.
Always use cuda.h to detect CUDA version. It's a more universal approach
compared to version.txt which is no longer present in recent CUDA versions.

Split the 'unknown CUDA version' warning in two:

* when detected CUDA version is partially supported by clang. It's expected to
work in general, at the feature parity with the latest supported CUDA
version. and may be missing support for the new features/instructions/GPU
variants. Clang will issue a warning.

* when detected version is new. Recent CUDA versions have been working with
clang reasonably well, and will likely to work similarly to the partially
supported ones above. Or it may not work at all. Clang will issue a warning and
proceed as if the latest known CUDA version was detected.

Differential Revision: https://reviews.llvm.org/D108247
2021-08-23 13:24:48 -07:00
Artem Belevich 49d982d8cb [CUDA] Add support for CUDA-11.4
Differential Revision: https://reviews.llvm.org/D108239
2021-08-23 13:24:46 -07:00
Aakanksha Patil 3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Brendon Cahoon 294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Brendon Cahoon 211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Brendon Cahoon ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Aakanksha Patil 464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00
Fangrui Song 1fcf9247de [Cuda] Internalize a struct and a global variable 2021-05-01 16:24:39 -07:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Artem Belevich 2aa01ccec3 [CUDA, NVPTX] Allow targeting sm_86 GPUs.
The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o
adding any new functionality.

Differential Revision: https://reviews.llvm.org/D95974
2021-02-09 11:01:10 -08:00
Tony 5ad202ce89 [NFC][AMDGPU] Reformat AMD GPU targets in cuda.cpp
Differential Revision: https://reviews.llvm.org/D93181
2020-12-13 23:02:59 +00:00
Tim Renouf 89d41f3a2b [AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447

Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03 16:27:48 +00:00
Tim Renouf ee3e642627 [AMDGPU] Add gfx90c target
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.

Differential Revision: https://reviews.llvm.org/D90419

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-11-03 16:27:43 +00:00
Tim Renouf 666ef0db20 [AMDGPU] Add gfx602, gfx705, gfx805 targets
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:

* The "Oland" and "Hainan" variants of gfx601 are now split out into
  gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
  that to avoid using the shaderZExport workaround on gfx602.

* One variant of gfx703 is now split out into gfx705. LLPC and other
  front-ends could use that to avoid using the
  shaderSpiCsRegAllocFragmentation workaround on gfx705.

* The "TongaPro" variant of gfx802 is now split out into gfx805.
  TongaPro has a faster 64-bit shift than its former friends in gfx802,
  and a subtarget feature could be set up for that to take advantage of
  it. This commit does not make that change; it just adds the target.

V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
    so fix the GPUKind order.

Differential Revision: https://reviews.llvm.org/D88916

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-10-10 17:22:22 +01:00
Yaxun (Sam) Liu cbd420c5ed [CUDA][HIP] Fix bound arch for offload action for fat binary
Currently CUDA/HIP toolchain uses "unknown" as bound arch
for offload action for fat binary. This causes -mcpu or -march
with "unknown" added in HIPToolChain::TranslateArgs or
CUDAToolChain::TranslateArgs.

This causes issue for https://reviews.llvm.org/D88377 since
HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs.

The bound arch of offload action for fat binary is not really
used, therefore set it to CudaArch::UNUSED.

Differential Revision: https://reviews.llvm.org/D88524
2020-10-02 19:05:51 -04:00
Yaxun (Sam) Liu 041da0d828 [HIP] Add gfx1031 and gfx1030
Differential Revision: https://reviews.llvm.org/D87324
2020-09-08 16:38:34 -04:00
Artem Belevich 8c635ba4a8 [CUDA] Fix missed CUDA version mappings. 2020-04-13 15:54:12 -07:00
Artem Belevich a9627b7ea7 [CUDA] Add partial support for recent CUDA versions.
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.

Differential Revision: https://reviews.llvm.org/D77670
2020-04-08 11:19:44 -07:00
Artem Belevich 33386b20aa [CUDA] Simplify GPU variant handling. NFC.
Instead of hardcoding individual GPU mappings in multiple functions, keep them
all in one table and use it to look up the mappings.

We also don't care about 'virtual' architecture much, so the API is trimmed down
down to a simpler GPU->Virtual arch name lookup.

Differential Revision: https://reviews.llvm.org/D77665
2020-04-08 11:19:43 -07:00
Michael Liao b952d799ca [cuda][hip] Fix `RegisterVar` function prototype.
Summary:
- `RegisterVar` has `void` return type and `size_t` in its variable size
  parameter in HIP or CUDA 9.0+.

Reviewers: tra, yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77398
2020-04-03 12:57:09 -04:00
Artem Belevich 12fefeef20 [CUDA] Assume the latest known CUDA version if we've found an unknown one.
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.

If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.

Differential Revision: https://reviews.llvm.org/D73231
2020-01-28 10:11:42 -08:00
Yaxun Liu 6add24adaf [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012
Differential Revision: https://reviews.llvm.org/D64364

llvm-svn: 365799
2019-07-11 17:50:09 +00:00
Stanislav Mekhanoshin 0cfd75a07d [AMDGPU] gfx908 clang target
Differential Revision: https://reviews.llvm.org/D64430

llvm-svn: 365528
2019-07-09 18:19:00 +00:00
Artem Belevich 4071763bb8 Basic CUDA-10 support.
Differential Revision: https://reviews.llvm.org/D57771

llvm-svn: 353232
2019-02-05 22:38:58 +00:00
Artem Belevich 8fa28a0db0 [CUDA] Propagate detected version of CUDA to cc1
..and use it to control that parts of CUDA compilation
that depend on the specific version of CUDA SDK.

This patch has a placeholder for a 'new launch API' support
which is in a separate patch. The list will be further
extended in the upcoming patch to support CUDA-10.1.

Differential Revision: https://reviews.llvm.org/D57487

llvm-svn: 352798
2019-01-31 21:32:24 +00:00
Tim Renouf 632f35d495 Add gfx909 to GPU Arch
Subscribers: jholewinski, cfe-commits

Differential Revision: https://reviews.llvm.org/D53558

llvm-svn: 345198
2018-10-24 21:19:02 +00:00
Yaxun Liu 83b5f35d85 Add gfx904 and gfx906 to GPU Arch
Differential Revision: https://reviews.llvm.org/D53472

llvm-svn: 344996
2018-10-23 02:05:31 +00:00
Artem Belevich 44ecb0e3c2 [CUDA] Added basic support for compiling with CUDA-10.0
llvm-svn: 342924
2018-09-24 23:10:44 +00:00
Artem Belevich 3cce307799 [CUDA] Enable CUDA compilation with CUDA-9.2
Differential Revision: https://reviews.llvm.org/D45827

llvm-svn: 330753
2018-04-24 18:23:19 +00:00
Yaxun Liu 8a5fc15aa4 [CUDA] Add amdgpu sub archs
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D45277

llvm-svn: 329232
2018-04-04 21:19:27 +00:00
Erich Keane d45879d8ad Add NVPTX Support to ValidCPUList (enabling march notes)
A followup to: https://reviews.llvm.org/D42978
This patch adds NVPTX support for
enabling the march notes.

Differential Revision: https://reviews.llvm.org/D43045

llvm-svn: 324675
2018-02-08 23:16:00 +00:00
Artem Belevich fbc56a904f [CUDA] Added partial support for CUDA-9.1
Clang can use CUDA-9.1 now, though new APIs (are not implemented yet.

The major change is that headers in CUDA-9.1 went through substantial
changes that started in CUDA-9.0 which required substantial changes
in the cuda compatibility headers provided by clang.

There are two major issues:
* CUDA SDK no longer provides declarations for libdevice functions.
* A lot of device-side functions have become nvcc's builtins and
  CUDA headers no longer contain their implementations.

This patch changes the way CUDA headers are handled if we compile
with CUDA 9.x. Both 9.0 and 9.1 are affected.

* Clang provides its own declarations of libdevice functions.
* For CUDA-9.x clang now provides implementation of device-side
  'standard library' functions using libdevice.

This patch should not affect compilation with CUDA-8. There may be
some observable differences for CUDA-9.0, though they are not expected
to affect functionality.

Tested: CUDA test-suite tests for all supported combinations of:
        CUDA: 7.0,7.5,8.0,9.0,9.1
        GPU: sm_20, sm_35, sm_60, sm_70

Differential Revision: https://reviews.llvm.org/D42513

llvm-svn: 323713
2018-01-30 00:00:12 +00:00
Justin Lebar 066494d8c1 [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9.
Summary:
CUDA 9's minimum sm is sm_30.

Ideally we should also make sm_30 the default when compiling with CUDA
9, but that seems harder than it should be.

Subscribers: sanjoy

Differential Revision: https://reviews.llvm.org/D39109

llvm-svn: 316611
2017-10-25 21:32:06 +00:00
Artem Belevich 8af4e23d1e [CUDA] Added rudimentary support for CUDA-9 and sm_70.
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.

On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.

Differential Revision: https://reviews.llvm.org/D37576

llvm-svn: 312734
2017-09-07 18:14:32 +00:00
Justin Lebar d9bc485cfe [CUDA] Fix "control reaches end of non-void function" warnings in Cuda.cpp.
Some compilers are too dumb to realize that the switch statement covers
all cases.

(Don't use a "default" label, because we explicitly want to get a warning
if our switch doesn't cover all the cases.)

llvm-svn: 274713
2016-07-07 01:06:59 +00:00
Justin Lebar 629076178a [CUDA] Add utility functions for dealing with CUDA versions / architectures.
Summary:
Currently our handling of CUDA architectures is scattered all around
clang.  This patch centralizes it.

A key advantage of this centralization is that you can now write a C++
switch on e.g. CudaArch and get a compile error if you don't handle one
of the enum values.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D21867

llvm-svn: 274681
2016-07-06 21:21:39 +00:00