Commit Graph

9 Commits

Author SHA1 Message Date
Saiyedul Islam 98380762c3 [clang-offload-bundler] Make Bundle Entry ID backward compatible
Earlier BundleEntryID used to be <OffloadKind>-<Triple>-<GPUArch>.
This used to work because the clang-offload-bundler didn't need
GPUArch explicitly for any bundling/unbundling action. With
unbundleArchive it needs GPUArch to ensure compatibility between
device specific code objects. D93525 enforced triples to have
separators for all 4 components irrespective of number of
components, like "amdgcn-amd-amdhsa--". It was required to
to correctly parse a possible 4th environment component or a GPU.
But, this condition is breaking backward compatibility with
archive libraries compiled with compilers older than D93525.

This patch allows triples to have any number of components with
and without extra separator for empty environment field. Thus,
both the following bundle entry IDs are same:
openmp-amdgcn-amd-amdhsa--gfx906
openmp-amdgcn-amd-amdhsa-gfx906

Reviewed By: yaxunl, grokos

Differential Revision: https://reviews.llvm.org/D106809
2021-09-08 16:06:12 +05:30
Saiyedul Islam f7ce532d62 [clang-offload-bundler] Add unbundling of archives containing bundled object files into device specific archives
This patch adds unbundling support of an archive file. It takes an
archive file along with a set of offload targets as input.
Output is a device specific archive for each given offload target.
Input archive contains bundled code objects bundled using
clang-offload-bundler. Each generated device specific archive contains
a set of device code object files which are named as
<Parent Bundle Name>-<CodeObject-GPUArch>.

Entries in input archive can be of any binary type which is
supported by clang-offload-bundler, like *.bc. Output archives will
contain files in same type.

Example Usuage:
  clang-offload-bundler --unbundle --inputs=lib-generic.a -type=a
      -targets=openmp-amdgcn-amdhsa--gfx906,openmp-amdgcn-amdhsa--gfx908
      -outputs=devicelib-gfx906.a,deviceLib-gfx908.a

Reviewed By: jdoerfert, yaxunl

Differential Revision: https://reviews.llvm.org/D93525
2021-06-30 17:55:50 +05:30
Yaxun (Sam) Liu 5fc2673fbc [HIP] Add --gpu-bundle-output
Added --gpu-bundle-output to control bundling/unbundling output of HIP device compilation.

By default preprocessor expansion, llvm bitcode and assembly are unbundled, code objects are
bundled.

Reviewed by: Artem Belevich, Jan Svoboda

Differential Revision: https://reviews.llvm.org/D101630
2021-06-09 23:31:43 -04:00
Jon Chesterfield fc88d927e3 [clang][amdgpu] Use implicit code object version
[clang][amdgpu] Use implicit code object version

At present, clang always passes amdhsa-code-object-version on to -cc1. That is
great for certainty over what object version is being used when debugging.

Unfortunately, the command line argument is in AMDGPUBaseInfo.cpp in the amdgpu
target. If clang is used with an llvm compiled with DLLVM_TARGETS_TO_BUILD
that excludes amdgpu, this will be diagnosed (as discovered via D98658):

- Unknown command line argument '--amdhsa-code-object-version=4'

This means that clang, built only for X86, can be used to compile the nvptx
devicertl for openmp but not the amdgpu one. That would shortly spawn fragile
logic in the devicertl cmake to try to guess whether the clang used will work.

This change omits the amdhsa-code-object-version parameter when it matches the
default that AMDGPUBaseInfo.cpp specifies, with a comment to indicate why. As
this is the only part of clang's codegen for amdgpu that depends on the target
in the back end it suffices to build the openmp runtime on most (all?) systems.

It is a non-functional change, though observable in the updated tests and when
compiling with -###. It may cause minor disruption to the amd-stg-open branch.

Revision of D98746, builds on refactor in D101077

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D101095
2021-04-23 23:52:50 +01:00
Yaxun (Sam) Liu 1dab94f9ed [CUDA][HIP] Pass -fgpu-rdc to host clang -cc1
Currently -fgpu-rdc is not passed to host clang -cc1.
This causes issue because -fgpu-rdc affects shadow
variable linkage in host compilation.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D96105
2021-02-08 19:08:20 -05:00
Jon Chesterfield c0619d3b21 [NFC] Use regex for code object version in hip tests
[NFC] Use regex for code object version in hip tests

Extracted from D93258. Makes tests robust to changes in default
code object version.

Reviewed By: t-tye

Differential Revision: https://reviews.llvm.org/D93398
2020-12-16 17:00:19 +00:00
Yaxun (Sam) Liu 0b81d9a992 [AMDGPU] add -mcode-object-version=n
Add option -mcode-object-version=n to control code object version for
AMDGPU.

Differential Revision: https://reviews.llvm.org/D91310
2020-12-07 18:08:37 -05:00
Yaxun (Sam) Liu 6752786d65 [HIP] Do not use llvm-link/opt/llc for -fgpu-rdc
This patch is a follow up on https://reviews.llvm.org/D81627.

In addition to default -fno-gpu-rdc case, this patches let
HIP toolchain not use llvm-link/opt/llc to link device code
for -fgpu-rdc case. Instead, uses standard lto.

This will eliminate some redundant optimizations and speed
up the compilation/linking.

Differential Revision: https://reviews.llvm.org/D81861
2020-06-15 21:09:18 -04:00
Michael Liao 8b6821a584 [hip] Fix device-only relocatable code compilation.
Summary:
- In HIP, just as the regular device-only compilation, the device-only
  relocatable code compilation should not involve offload bundle.
- In addition, that device-only relocatable code compilation should have
  the similar 3 steps, namely preprocessor, compile, and backend, to the
  regular code generation with `-emit-llvm`.

Reviewers: yaxunl, tra

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81427
2020-06-10 14:10:41 -04:00