Many non-language extensions are defined but also unused. This patch
removes them with their tests as they do not require compiler support.
The cl_khr_select_fprounding_mode extension is also removed because it
has been deprecated since OpenCL 1.1 and Clang doesn't have any specific
support for it.
The cl_khr_context_abort extension is only referred to in "The OpenCL
Specification", version 1.2 and 2.0, in Table 4.3, but no specification
is provided in "The OpenCL Extension Specification" for these versions.
Because it is both unused in Clang and lacks specification, this
extension is removed.
The following extensions are platform extensions that bring new OpenCL
APIs but do not impact the kernel language nor require compiler support.
They are therefore removed.
- cl_khr_gl_sharing, introduced in OpenCL 1.0
- cl_khr_icd, introduced in OpenCL 1.2
- cl_khr_gl_event, introduced in OpenCL 1.1
Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquireGLObjects.
Hence, they cannot be used on the device side and the extension does
not impact the kernel language.
- cl_khr_d3d10_sharing, introduced in OpenCL 1.1
- cl_khr_d3d11_sharing, introduced in OpenCL 1.2
- cl_khr_dx9_media_sharing, introduced in OpenCL 1.2
- cl_khr_image2d_from_buffer, introduced in OpenCL 1.2
- cl_khr_initialize_memory, introduced in OpenCL 1.2
- cl_khr_gl_depth_images, introduced in OpenCL 1.2
Note: this extension is related to cl_khr_depth_images but only the
latter adds new features to the kernel language.
- cl_khr_spir, introduced in OpenCL 1.2
- cl_khr_egl_event, introduced in OpenCL 1.2
Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquire* API
functions. Hence, they cannot be used on the device side and the
extension does not impact the kernel language.
- cl_khr_egl_image, introduced in OpenCL 1.2
- cl_khr_terminate_context, introduced in OpenCL 1.2
The minimum required OpenCL version used in OpenCLExtensions.def for
these extensions is not always correct. Removing these address that
issue.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D89372
Permitting non-standards-driven "do the best you can" constant-folding
of array bounds is permitted solely as a GNU compatibility feature. We
should not be doing it in any language mode that is attempting to be
conforming.
From https://reviews.llvm.org/D20090 it appears the intent here was to
permit `__constant int` globals to be used in array bounds, but the
change in that patch only added half of the functionality necessary to
support that in the constant evaluator. This patch adds the other half
of the functionality and turns off constant folding for array bounds in
OpenCL.
I couldn't find any spec justification for accepting the kinds of cases
that D20090 accepts, so a reference to where in the OpenCL specification
this is permitted would be useful.
Note that this change also affects the code generation in one test:
because after 'const int n = 0' we now treat 'n' as a constant
expression with value 0, it's now a null pointer, so '(local int *)n'
forms a null pointer rather than a zero pointer.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D89520
This patch introduces 2 new address spaces in OpenCL: global_device and global_host
which are a subset of a global address space, so the address space scheme will be
looking like:
```
generic->global->host
->device
->private
->local
constant
```
Justification: USM allocations may be associated with both host and device memory. We
want to give users a way to tell the compiler the allocation type of a USM pointer for
optimization purposes. (Link to the Unified Shared Memory extension:
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc)
Before this patch USM pointer could be only in opencl_global
address space, hence a device backend can't tell if a particular pointer
points to host or device memory. On FPGAs at least we can generate more
efficient hardware code if the user tells us where the pointer can point -
being able to distinguish between these types of pointers at compile time
allows us to instantiate simpler load-store units to perform memory
transactions.
Patch by Dmitry Sidorov.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D82174
OpenCL 2.0 does not allow block arguments, primarily because it is
difficult to support function pointers on the various architectures
that OpenCL targets. Clang was still accepting them.
Rename and reuse the `err_opencl_half_param` diagnostic.
Fixes PR46324.
Differential Revision: https://reviews.llvm.org/D82313
This reverts commit defd43a5b3.
with correction to solve msan report
To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the
floating point settings in PCH files aren't compatible, rewrite
FPFeatures to use a delta in the settings rather than absolute settings.
With this patch, these floating point options can be benign.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D81869
This reverts commit b55d723ed6.
Reapply Modify FPFeatures to use delta not absolute settings
To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the
floating point settings in PCH files aren't compatible, rewrite
FPFeatures to use a delta in the settings rather than absolute settings.
With this patch, these floating point options can be benign.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D81869
Summary:
__builtin_amdgcn_atomic_inc32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_inc64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_dec32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_dec64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope)
First and second arguments gets transparently passed to the amdgcn atomic
inc/dec intrinsic. Fifth argument of the intrinsic is set as true if the
first argument of the builtin is a volatile pointer. The third argument of
this builtin is one of the memory-ordering specifiers ATOMIC_ACQUIRE,
ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST following C++11 memory
model semantics. This is mapped to corresponding LLVM atomic memory ordering
for the atomic inc/dec instruction using CLANG atomic C ABI. The fourth
argument is an AMDGPU-specific synchronization scope defined as string.
Reviewers: arsenm, sameerds, JonChesterfield, jdoerfert
Reviewed By: arsenm, sameerds
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, kerbowa, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80804
Added extensions and their function declarations into
the standard header.
Patch by Piotr Fusik!
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79781
Summary:
This is a small improvement for OpenCL diagnostics, but is also useful
for our CHERI fork, as our __capability qualifier is suppressed from
diagnostics when all pointers are capabilities, only being used when pointers
need to be explicitly opted-in to being capabilities.
Reviewers: rsmith, Anastasia, aaron.ballman
Reviewed By: Anastasia, aaron.ballman
Subscribers: aaron.ballman, arichardson, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78777
Expose llvm fence instruction as clang builtin for AMDGPU target
__builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope)
The first argument of this builtin is one of the memory-ordering specifiers
__ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST
following C++11 memory model semantics. This is mapped to corresponding
LLVM atomic memory ordering for the fence instruction using LLVM atomic C
ABI. The second argument is an AMDGPU-specific synchronization scope
defined as string.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D75917
Add the sub-group builtin functions from the OpenCL Extension
specification. This patch excludes the sub_group_barrier builtins
that take argument types not yet handled by the
`-fdeclare-opencl-builtins` machinery.
Co-authored-by: Pierre Gondois <pierre.gondois@arm.com>
Compute and propagate conversion kind to diagnostics helper in C++
to provide more specific diagnostics about incorrect implicit
conversions in assignments, initializations, params, etc...
Duplicated some diagnostics as errors because C++ is more strict.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74116
The `-fdeclare-opencl-builtins` option was accepting saturated
conversions for non-integer types, which contradicts both the OpenCL
specification (v2.0 s6.2.3) and Clang's opencl-c.h file.
Address space conversion changes pointer representation.
This commit disallows such conversions when they are not
legal i.e. for the nested pointers even with compatible
address spaces. Because the address space conversion in
the nested levels can't be generated to modify the pointers
correctly. The behavior implemented is as follows:
- Any implicit conversions of nested pointers with different
address spaces is rejected.
- Any conversion of address spaces in nested pointers in safe
casts (e.g. const_cast or static_cast) is rejected.
- Conversion in low level C-style or reinterpret_cast is accepted
but with a warning (this aligns with OpenCL C behavior).
Fixes PR39674
Differential Revision: https://reviews.llvm.org/D73360
This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions.
OpenCL tests for new built-ins are included.
Differential Revision: https://reviews.llvm.org/D72723
The language wording change forgot to update overload resolution to rank
implicit conversion sequences based on qualification conversions in
reference bindings. The anticipated resolution for that oversight is
implemented here -- we order candidates based on qualification
conversion, not only on top-level cv-qualifiers, including ranking
reference bindings against non-reference bindings if they differ in
non-top-level qualification conversions.
For OpenCL/C++, this allows reference binding between pointers with
differing (nested) address spaces. This makes the behavior of reference
binding consistent with that of implicit pointer conversions, as is the
purpose of this change, but that pre-existing behavior for pointer
conversions is itself probably not correct. In any case, it's now
consistently the same behavior and implemented in only one place.
This reinstates commit de21704ba9,
reverted in commit d8018233d1, with
workarounds for some overload resolution ordering problems introduced by
CWG2352.
This reverts commit de21704ba9.
Regressed/causes this to error due to ambiguity:
void f(const int * const &);
void f(int *);
int main() {
int * x;
f(x);
}
(in case it's important - the original case where this turned up was a
member function overload in a class template with, essentially:
f(const T1&)
f(T2*)
(where T1 == X const *, T2 == X))
It's not super clear to me if this ^ is expected behavior, in which case
I'm sorry about the revert & happy to look into ways to fix the original
code.
Add printing of __private address space to TypePrinter to allow
it appears in diagnostics and AST dumps as all other language
addr spaces.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71272
The language wording change forgot to update overload resolution to rank
implicit conversion sequences based on qualification conversions in
reference bindings. The anticipated resolution for that oversight is
implemented here -- we order candidates based on qualification
conversion, not only on top-level cv-qualifiers.
For OpenCL/C++, this allows reference binding between pointers with
differing (nested) address spaces. This makes the behavior of reference
binding consistent with that of implicit pointer conversions, as is the
purpose of this change, but that pre-existing behavior for pointer
conversions is itself probably not correct. In any case, it's now
consistently the same behavior and implemented in only one place.
Provide a mechanism to attach OpenCL extension information to builtin
functions, so that their use can be restricted according to the
extension(s) the builtin is part of.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D71476
Allow sending address spaces into diagnostics to simplify and improve
error reporting. Improved wording of diagnostics for address spaces
in overloading.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71111
Summary:
Enable a way to set OpenCL language address space using attributes
in addition to existing keywords.
Signed-off-by: Victor Lomuller victor@codeplay.com
Reviewers: aaron.ballman, Anastasia
Subscribers: yaxunl, ebevhan, cfe-commits, Naghasan
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71005
Signed-off-by: Alexey Bader <alexey.bader@intel.com>
Summary:
- The deduced address space needs applying to its element type as well.
Reviewers: Anastasia
Subscribers: yaxunl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70981
In order to simplify implementation we are moving add space
deduction into Sema while constructing variable declaration
and on template instantiation. Pointee are deduced to generic
addr space during creation of types.
This commit also
- fixed addr space dedution for auto type;
- factors out in a separate helper function OpenCL specific
logic from type diagnostics in var decl.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D65744
We seem to have been gradually growing support for atomic min/max operations
(exposing longstanding IR atomicrmw instructions). But until now there have
been gaps in the expected intrinsics. This adds support for the C11-style
intrinsics (i.e. taking _Atomic, rather than individually blessed by C11
standard), and the variants that return the new value instead of the original
one.
That way, people won't be misled by trying one form and it not working, and the
front-end is more friendly to people using _Atomic types, as we recommend.
This patch adds the integer builtin functions from the OpenCL C
specification.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D69901
Support for C++ mode was accidentally lacking due to not checking the
OpenCLCPlusPlus LangOpts version.
Differential Revision: https://reviews.llvm.org/D69233
Add the -Wconversion -Werror options to check no unexpected conversion
is done.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D67714
llvm-svn: 372975
Add the image query builtin functions from the OpenCL C specification.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D67713
llvm-svn: 372833
Allow setting a MinVersion, stating from which OpenCL version a
builtin function is available, and a MaxVersion, stating from which
OpenCL version a builtin function should not be available anymore.
Guard some definitions of the "work-item" builtin functions according
to the OpenCL versions from which they are available.
Add the "vector data load and store" builtin functions (e.g.
vload/vstore), whose signatures differ before and after OpenCL 2.0 in
the pointer argument address spaces.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D63504
llvm-svn: 372321
Image types were previously available, but not working. This patch
adds image type handling.
Rename the image type definitions in the .td file to make them
consistent with other type names. Use abstract types to represent the
unqualified types. Instantiate access-qualified image types at the
point of use using, e.g. `ImageType<Image2d, "RO">`.
Add/update TableGen definitions for the read_image/write_image
builtin functions.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D63480
llvm-svn: 371046
The err_typecheck_call_too_few_args diagnostic takes arguments, but
none were provided causing clang to crash when attempting to diagnose
an enqueue_kernel call with too few arguments.
Fixes llvm.org/PR42045
Differential Revision: https://reviews.llvm.org/D66883
llvm-svn: 370322
Const, volatile, and pointer types were previously available, but not
working. This patch adds handling for OpenCL builtin functions.
Add TableGen definitions for some atomic and asynchronous builtins to
make use of the new functionality.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D63442
llvm-svn: 369373
Generic types are an abstraction of type sets. It mimics the way
functions are defined in the OpenCL specification. For example,
floatN can abstract all the vector sizes of the float type.
This allows to
* stick more closely to the specification, which uses generic types;
* factorize definitions of functions with numerous prototypes in the
tablegen file; and
* reduce the memory impact of functions with many overloads.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D65456
llvm-svn: 369253