Currently AMDGPUTargetInfo does not initialize AddrSpaceMap in constructor, which causes regressions in mesa/clover with libclc.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D34987
llvm-svn: 307105
Clang assumes coerced function argument is in address space 0, which is not always true and results in invalid bitcasts.
This patch fixes failure in OpenCL conformance test api/get_kernel_arg_info with amdgcn---amdgizcl triple, where non-zero alloca address space is used.
Differential Revision: https://reviews.llvm.org/D34777
llvm-svn: 306721
Summary: OpenCL and SPIR version metadata must be generated once per module instead of once per mangled global value.
Reviewers: Anastasia, yaxunl
Reviewed By: Anastasia
Subscribers: ahatanak, cfe-commits
Differential Revision: https://reviews.llvm.org/D34235
llvm-svn: 305796
Rationale: OpenCL kernels are called via an explicit runtime API
with arguments set with clSetKernelArg(), not as normal sub-functions.
Return SPIR_KERNEL by default as the kernel calling convention to ensure
the fingerprint is fixed such way that each OpenCL argument gets one
matching argument in the produced kernel function argument list to enable
feasible implementation of clSetKernelArg() with aggregates etc. In case
we would use the default C calling conv here, clSetKernelArg() might
break depending on the target-specific conventions; different targets
might split structs passed as values to multiple function arguments etc.
https://reviews.llvm.org/D33639
llvm-svn: 304389
Amongst other, this will help LTO to correctly handle/honor files
compiled with O0, helping debugging failures.
It also seems in line with how we handle other options, like how
-fnoinline adds the appropriate attribute as well.
Differential Revision: https://reviews.llvm.org/D28404
llvm-svn: 304127
A recent change requires opencl triple environment for compiling OpenCL
program, which causes regressions in libclc.
This patch fixes that. Instead of deducing language based on triple
environment, it checks LangOptions.
Differential Revision: https://reviews.llvm.org/D33445
llvm-svn: 303644
Alloca always returns a pointer in alloca address space, which may
be different from the type defined by the language. For example,
in C++ the auto variables are in the default address space. Therefore
cast alloca to the expected address space when necessary.
Differential Revision: https://reviews.llvm.org/D32248
llvm-svn: 303370
LLVM has changed the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.
https://bugs.llvm.org/show_bug.cgi?id=32382
rdar://problem/31205000
llvm-svn: 300523
For OpenCL, the private address space qualifier is 0 in AST. Before this change, 0 address space qualifier
is always mapped to target address space 0. As now target private address space is specified by
alloca address space in data layout, address space qualifier 0 needs to be mapped to alloca addr space specified by the data layout.
This change has no impact on targets whose alloca addr space is 0.
With contributions from Matt Arsenault, Tony Tye and Wen-Heng (Jack) Chung
Differential Revision: https://reviews.llvm.org/D31404
llvm-svn: 299965
Change constant address space from 4 to 2 for the new address space mapping in Clang.
Differential Revision: https://reviews.llvm.org/D31771
llvm-svn: 299691
These two attributes specify the same info in a different way.
AMGPU BE only checks the latter as a target specific attribute
as opposed to language specific reqd_work_group_size.
This change produces amdgpu_flat_work_group_size out of
reqd_work_group_size if specified.
Differential Revision: https://reviews.llvm.org/D31728
llvm-svn: 299678
Summary:
"kernel_arg_type_qual" metadata should contain const/volatile/restrict
tags only for pointer types to match the corresponding requirement of
the OpenCL specification.
OpenCL 2.0 spec 5.9.3 Kernel Object Queries:
CL_KERNEL_ARG_TYPE_VOLATILE is returned if the argument is a pointer
and the referenced type is declared with the volatile qualifier.
[...]
Similarly, CL_KERNEL_ARG_TYPE_CONST is returned if the argument is a
pointer and the referenced type is declared with the restrict or const
qualifier.
[...]
CL_KERNEL_ARG_TYPE_RESTRICT will be returned if the pointer type is
marked restrict.
Reviewers: Anastasia, cfe-commits
Reviewed By: Anastasia
Subscribers: bader, yaxunl
Differential Revision: https://reviews.llvm.org/D31321
llvm-svn: 299192
For target environment amdgiz and amdgizcl (giz means Generic Is Zero), AMDGPU will use new address space mapping where generic address space is 0 and private address space is 5. The data layout is also changed correspondingly.
Differential Revision: https://reviews.llvm.org/D31210
llvm-svn: 298767
For variables in generic address spaces, for example:
```
unsigned char V[6442450944];
...
```
the address space is not yet known when we get into
*getConstantArrayType*, it is 0. AMDGCN target's
address space 0 has 32 bits pointers, so when we
call *getPointerWidth* with 0, the array size is
trimmed to 32 bits, which is not right.
Differential Revision: https://reviews.llvm.org/D30845
llvm-svn: 298420
Summary: I added a new rank to ImplicitConversionRank enum to resolve the function overload ambiguity with vector types. Rank of scalar types conversion is lower than vector splat. So, we can choose which function should we call. See test for more details.
Reviewers: Anastasia, cfe-commits
Reviewed By: Anastasia
Subscribers: bader, yaxunl
Differential Revision: https://reviews.llvm.org/D30816
llvm-svn: 298366
We can't actually pretend that 0 is valid for address space 0.
r295877 added a workaround to stop allocating user objects
there, so we can use 0 as the invalid pointer.
Some of the tests seemed to be using private as the non-0 null
test address space, so add copies using local to make sure
this is still stressed.
llvm-svn: 297659
Removed ndrange_t as Clang builtin type and added
as a struct type in the OpenCL header.
Use type name to do the Sema checking in enqueue_kernel
and modify IR generation accordingly.
Review: D28058
Patch by Dmitry Borisenkov!
llvm-svn: 295311
Modify ObjC blocks impl wrt address spaces as follows:
- keep default private address space for blocks generated
as local variables (with captures);
- add global address space for global block literals (no captures);
- make the block invoke function and enqueue_kernel prototype with
the generic AS block pointer parameter to accommodate both
private and global AS cases from above;
- add block handling into default AS because it's implemented as
a special pointer type (BlockPointer) in the frontend and therefore
it is used as a pointer everywhere. This is also needed to accommodate
both private and global AS blocks for the two cases above.
- removes ObjC RT specific symbols (NSConcreteStackBlock and
NSConcreteGlobalBlock) in the OpenCL mode.
Review: https://reviews.llvm.org/D28814
llvm-svn: 293286
Summary:
We compile user opencl kernel code with spir triple. But built-ins are written in OpenCL and we compile it with triple x86_64 to be able to use x86 intrinsics. And we need address spaces to match in both cases. So, we change fake address space map in OpenCL for matching with spir.
On CPU address spaces are not really important but we'd like to preserve address space information in order to perform optimizations relying on this info like enhanced alias analysis.
Reviewers: pekka.jaaskelainen, Anastasia
Subscribers: pekka.jaaskelainen, yaxunl, bader, cfe-commits
Differential Revision: https://reviews.llvm.org/D28048
llvm-svn: 290436
-fno-inline-functions, -O0, and optnone.
These were really, really tangled together:
- We used the noinline LLVM attribute for -fno-inline
- But not for -fno-inline-functions (breaking LTO)
- But we did use it for -finline-hint-functions (yay, LTO is happy!)
- But we didn't for -O0 (LTO is sad yet again...)
- We had weird structuring of CodeGenOpts with both an inlining
enumeration and a boolean. They interacted in weird ways and
needlessly.
- A *lot* of set smashing went on with setting these, and then got worse
when we considered optnone and other inlining-effecting attributes.
- A bunch of inline affecting attributes were managed in a completely
different place from -fno-inline.
- Even with -fno-inline we failed to put the LLVM noinline attribute
onto many generated function definitions because they didn't show up
as AST-level functions.
- If you passed -O0 but -finline-functions we would run the normal
inliner pass in LLVM despite it being in the O0 pipeline, which really
doesn't make much sense.
- Lastly, we used things like '-fno-inline' to manipulate the pass
pipeline which forced the pass pipeline to be much more
parameterizable than it really needs to be. Instead we can *just* use
the optimization level to select a pipeline and control the rest via
attributes.
Sadly, this causes a bunch of churn in tests because we don't run the
optimizer in the tests and check the contents of attribute sets. It
would be awesome if attribute sets were a bit more FileCheck friendly,
but oh well.
I think this is a significant improvement and should remove the semantic
need to change what inliner pass we run in order to comply with the
requested inlining semantics by relying completely on attributes. It
also cleans up tho optnone and related handling a bit.
One unfortunate aspect of this is that for generating alwaysinline
routines like those in OpenMP we end up removing noinline and then
adding alwaysinline. I tried a bunch of other approaches, but because we
recompute function attributes from scratch and don't have a declaration
here I couldn't find anything substantially cleaner than this.
Differential Revision: https://reviews.llvm.org/D28053
llvm-svn: 290398
This is a recommit of r290149, which was reverted in r290169 due to msan
failures. msan was failing because we were calling
`isMostDerivedAnUnsizedArray` on an invalid designator, which caused us
to read uninitialized memory. To fix this, the logic of the caller of
said function was simplified, and we now have a `!Invalid` assert in
`isMostDerivedAnUnsizedArray`, so we can catch this particular bug more
easily in the future.
Fingers crossed that this patch sticks this time. :)
Original commit message:
This patch does three things:
- Gives us the alloc_size attribute in clang, which lets us infer the
number of bytes handed back to us by malloc/realloc/calloc/any user
functions that act in a similar manner.
- Teaches our constexpr evaluator that evaluating some `const` variables
is OK sometimes. This is why we have a change in
test/SemaCXX/constant-expression-cxx11.cpp and other seemingly
unrelated tests. Richard Smith okay'ed this idea some time ago in
person.
- Uniques some Blocks in CodeGen, which was reviewed separately at
D26410. Lack of uniquing only really shows up as a problem when
combined with our new eagerness in the face of const.
llvm-svn: 290297
This reverts commit r290171. It triggers a bunch of warnings, because
the new enumerator isn't handled in all switches. We want a warning-free
build.
Replied on the commit with more details.
llvm-svn: 290173