Summary:
This patch replaces the CUDA specific action by a generic offload action. The offload action may have multiple dependences classier in “host” and “device”. The way this generic offloading action is used is very similar to what is done today by the CUDA implementation: it is used to set a specific toolchain and architecture to its dependences during the generation of jobs.
This patch also proposes propagating the offloading information through the action graph so that that information can be easily retrieved at any time during the generation of commands. This allows e.g. the "clang tool” to evaluate whether CUDA should be supported for the device or host and ptas to easily retrieve the target architecture.
This is an example of how the action graphs would look like (compilation of a single CUDA file with two GPU architectures)
```
0: input, "cudatests.cu", cuda, (host-cuda)
1: preprocessor, {0}, cuda-cpp-output, (host-cuda)
2: compiler, {1}, ir, (host-cuda)
3: input, "cudatests.cu", cuda, (device-cuda, sm_35)
4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_35)
5: compiler, {4}, ir, (device-cuda, sm_35)
6: backend, {5}, assembler, (device-cuda, sm_35)
7: assembler, {6}, object, (device-cuda, sm_35)
8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {7}, object
9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {6}, assembler
10: input, "cudatests.cu", cuda, (device-cuda, sm_37)
11: preprocessor, {10}, cuda-cpp-output, (device-cuda, sm_37)
12: compiler, {11}, ir, (device-cuda, sm_37)
13: backend, {12}, assembler, (device-cuda, sm_37)
14: assembler, {13}, object, (device-cuda, sm_37)
15: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {14}, object
16: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {13}, assembler
17: linker, {8, 9, 15, 16}, cuda-fatbin, (device-cuda)
18: offload, "host-cuda (powerpc64le-unknown-linux-gnu)" {2}, "device-cuda (nvptx64-nvidia-cuda)" {17}, ir
19: backend, {18}, assembler
20: assembler, {19}, object
21: input, "cuda", object
22: input, "cudart", object
23: linker, {20, 21, 22}, image
```
The changes in this patch pass the existent regression tests (keeps the existent functionality) and resulting binaries execute correctly in a Power8+K40 machine.
Reviewers: echristo, hfinkel, jlebar, ABataev, tra
Subscribers: guansong, andreybokhanko, tcramer, mkuron, cfe-commits, arpith-jacob, carlo.bertolli, caomhin
Differential Revision: https://reviews.llvm.org/D18171
llvm-svn: 275645
they're redeclarations. This is necessary in order for name lookup to correctly
find the most recent declaration of the name (which affects default template
argument lookup and cross-module merging, among other things).
llvm-svn: 275612
Extend the __declspec(dll*) attribute to cover ObjC interfaces. This was
requested by Microsoft for their ObjC support. Cover both import and export.
This only adds the semantic analysis portion of the support, code-generation
still remains outstanding. Add some basic initial documentation on the
attributes that were previously empty. Tweak the previous tests to use the
relative expected-warnings to make the tests easier to read.
llvm-svn: 275610
This patch adds the check for specifying both simdlen and safelen clauses on the 'distribute simd' or 'distribute parallel for simd' constructs.
Differential Revision: https://reviews.llvm.org/D22384
llvm-svn: 275529
This changes the CompilerInstance::createOutputFile function to return
a std::unique_ptr<llvm::raw_ostream>, rather than an llvm::raw_ostream
implicitly owned by the CompilerInstance. This in most cases required that
I move ownership of the output stream to the relevant ASTConsumer.
The motivation for this change is to allow BackendConsumer to be a client
of interfaces such as D20268 which take ownership of the output stream.
Differential Revision: http://reviews.llvm.org/D21537
llvm-svn: 275507
passed on the command line but never actually used. We consider a (top-level)
module to be used if any part of it is imported, either by the current
translation unit, or by any part of a top-level module that is itself used.
(Put another way, a module is used if an implicit modules build would have
loaded its .pcm file.)
llvm-svn: 275481
For some reason it seems the second invocation is getting DMOD_OTHER_H
set to a path with/forward/slashes, but one of the use sites
has\back\slashes. There should be no difference with what was already
there, but for now try to avoid checking those paths.
llvm-svn: 275464
distinct anonymous structs remain distinct despite having similar layout.
This is already ensured by distinguishing based on their placement in the parent
struct, using the function `findAnonymousStructOrUnionIndex`.
The problem is that this function only handles anonymous structs, like
```
class Foo { struct { int a; } }
```
and not untagged structs like
```
class Foo { struct { int a; } var; }
```
Both need to be handled, and this patch fixes that. The test case ensures that this functionality doesn't regress.
Thanks to Manman Ren for review.
https://reviews.llvm.org/D22270
llvm-svn: 275460
Whether we call an ImportDecl a decl or a reference symbol role is
somewhat academic, but in practice it's more like a declaration because
it is interesting even to consumers who wouldn't care about references.
Most importantly, we want to report the module dependencies of system
modules even when we have declaration-only filtering.
rdar://problem/27134855
llvm-svn: 275454
The test currently fails if the name of the Clang binary doesn't contain "clang".
This patch removes that requirement, as some environments may choose to run the test with a differently named binary. This shouldn't make the test any less strict -- the only place where the flags we're searching for can really occur is the Clang command line.
Patch by Martin Böhme!
Differential Revision: https://reviews.llvm.org/D22359
llvm-svn: 275428
This patch implements PR#22821.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as attribute((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member.
Conversions (either implicit or via a valid casting) to pointer types
with lower or equal alignment requirements (e.g. void* or char*)
silence the warning.
This change also adds a new error diagnostic when the user attempts to
bind a reference to a packed member, regardless of the alignment.
Differential Revision: https://reviews.llvm.org/D20561
llvm-svn: 275417
rL275318 added the test Frontend/opencl.cl test, but that test was never actually run because Frontend/lit.local.cfg doesn't contain the '.cl' file suffix.
Once the test is activated, it fails with (unintended) compile errors in the newly added CHECK_INVALID_OPENCL_VERSION checks.
This patch adds the '.cl' file suffix to Frontend/lit.local.cfg to activate the test and fixes the test bug by adding '-fblocks' to the relevant command lines.
Patch by Martin Böhme!
Differential Revision: http://reviews.llvm.org/D22349
llvm-svn: 275405
Summary: Fix the build to use hasFlag instead of hasArg for checking some flags.
Reviewers: echristo
Subscribers: mehdi_amini, cfe-commits
Differential Revision: http://reviews.llvm.org/D22338
llvm-svn: 275377
Summary:
Depends on D21982 which implements the in-memory logging implementation of the
XRay runtime. These additional changes also depends on D20352 which adds the
bulk of XRay flags/dependencies when using the `-fxray-instrument` flag from
Clang.
Reviewers: echristo, rnk, aaron.ballman
Subscribers: mehdi_amini, cfe-commits
Differential Revision: http://reviews.llvm.org/D21983
llvm-svn: 275368
This patch is to implement sema and parsing for 'target parallel for simd' pragma.
Differential Revision: http://reviews.llvm.org/D22096
llvm-svn: 275365
-fxray-instrument: enables XRay annotation of IR
-fxray-instruction-threshold: configures the threshold for function size (looking at IR instructions), and allow LLVM to decide whether to add the nop sleds later on in the process.
Also implements the related xray_always_instrument and xray_never_instrument function attributes.
Patch by Dean Michael Berris.
llvm-svn: 275330
Also fixes strict-aliasing option to only be allowed when OpenCL Version 1.0. Added testcase in test/Frontend/opencl-blocks.cl.
Patch by Aaron En Ye Shi.
Differential Revision: http://reviews.llvm.org/D22170
llvm-svn: 275318
This patch is to add two additional tests for testing 'distribute parallel for simd' pragma with disallowed clauses and loops.
Differential Revision: http://reviews.llvm.org/D22169
llvm-svn: 275315
This patch is to add two additional tests for testing 'distribute simd' pragma with disallowed clauses and loops.
Differential Revision: http://reviews.llvm.org/D22176
llvm-svn: 275306