Added __cl_clang_non_portable_kernel_param_types extension that
allows using non-portable types as kernel parameters. This allows
bypassing the portability guarantees from the restrictions specified
in C++ for OpenCL v1.0 s2.4.
Currently this only disables the restrictions related to the data
layout. The programmer should ensure the compiler generates the same
layout for host and device or otherwise the argument should only be
accessed on the device side. This extension could be extended to other
case (e.g. permitting size_t) if desired in the future.
Patch by olestrohm (Ole Strohm)!
https://reviews.llvm.org/D101168
This extension is primarily targeting SPIR-V compilations flow
as the IR translation is the same between 1.x and 2.x atomics.
Differential Revision: https://reviews.llvm.org/D101089
Adds the __clang_literal_encoding__ and __clang_wide_literal_encoding__
predefined macros to expose the encoding used for string literals to
the preprocessor.
Updates __is_unsigned to have the same behavior as the standard
specifies. This is in line with 511dbd8, which applied the same change
to __is_signed.
Refs D67897.
Differential Revision: https://reviews.llvm.org/D98104
This patch adds support for two new variants of the vectorize_width
pragma:
1. vectorize_width(X[, fixed|scalable]) where an optional second
parameter is passed to the vectorize_width pragma, which indicates if
the user wishes to use fixed width or scalable vectorization. For
example the user can now write something like:
#pragma clang loop vectorize_width(4, fixed)
or
#pragma clang loop vectorize_width(4, scalable)
In the absence of a second parameter it is assumed the user wants
fixed width vectorization, in order to maintain compatibility with
existing code.
2. vectorize_width(fixed|scalable) where the width is left unspecified,
but the user hints what type of vectorization they prefer, either
fixed width or scalable.
I have implemented this by making use of the LLVM loop hint attribute:
llvm.loop.vectorize.scalable.enable
Tests were added to
clang/test/CodeGenCXX/pragma-loop.cpp
for both the 'fixed' and 'scalable' optional parameter.
See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html
Differential Revision: https://reviews.llvm.org/D89031
With the internal clang extension '__cl_clang_variadic_functions'
variadic functions are accepted by the frontend.
This is not a fully supported vendor/Khronos extension
as it can only be used on targets with variadic prototype
support or in metaprogramming to represent functions with
generic prototype without calling such functions in the
kernel code.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D94027
The new clang internal extension '__cl_clang_function_pointers'
allows use of function pointers and other features that have
the same functionality:
- Use of member function pointers;
- Unrestricted use of references to functions;
- Virtual member functions.
This not a vendor extension and therefore it doesn't require any
special target support. Exposing this functionality fully
will require vendor or Khronos extension.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D94021
The `assume` attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is `llvm.assume`. Similar to `llvm.assume` and
`__builtin_assume`, the `assume` attribute provides a user defined
assumption to the compiler.
A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the `assume` attribute later on.
The `omp [begin] asssumes` pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D91979
Recently HIP toolchain made a change to use clang instead of opt/llc to do compilation
(https://reviews.llvm.org/D81861). The intention is to make HIP toolchain canonical like
other toolchains.
However, this change introduced an unintentional change regarding backend fp fuse
option, which caused regressions in some HIP applications.
Basically before the change, HIP toolchain used clang to generate bitcode, then use
opt/llc to optimize bitcode and generate ISA. As such, the amdgpu backend takes
the default fp fuse mode which is 'Standard'. This mode respect contract flag of
fmul/fadd instructions and do not fuse fmul/fadd instructions without contract flag.
However, after the change, HIP toolchain now use clang to generate IR, do optimization,
and generate ISA as one process. Now amdgpu backend fp fuse option is determined
by -ffp-contract option, which is 'fast' by default. And this -ffp-contract=fast language option
is translated to 'Fast' fp fuse option in backend. Suddenly backend starts to fuse fmul/fadd
instructions without contract flag.
This causes wrong result for some device library functions, e.g. tan(-1e20), which should
return 0.8446, now returns -0.933. What is worse is that since backend with 'Fast' fp fuse
option does not respect contract flag, there is no way to use #pragma clang fp contract
directive to enforce fp contract requirements.
This patch fixes the regression by introducing a new value 'fast-honor-pragmas' for -ffp-contract
and use it for HIP by default. 'fast-honor-pragmas' is equivalent to 'fast' in frontend but
let the backend to use 'Standard' fp fuse option. 'fast-honor-pragmas' is useful since 'Fast'
fp fuse option in backend does not honor contract flag, it is of little use to HIP
applications since all code with #pragma STDC FP_CONTRACT or any IR from a
source compiled with -ffp-contract=on is broken.
Differential Revision: https://reviews.llvm.org/D90174
Pragma 'clang fp' is extended to support a new option, 'exceptions'. It
allows to specify floating point exception behavior more flexibly.
Differential Revision: https://reviews.llvm.org/D89849
This patch updates the documentation about `__builtin_memcpy_inline` and reorders the sections so it is more consitent and understandable.
Differential Revision: https://reviews.llvm.org/D87458
This enables us to use the __builtin_rotateleft / __builtin_rotateright 8/16/32/64 intrinsics inside constexpr code.
Differential Revision: https://reviews.llvm.org/D86342
Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli
Reviewed By: SjoerdMeijer
Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76077
Extension vectors now can be used in element-wise conditional selector.
For example:
```
R[i] = C[i]? A[i] : B[i]
```
This feature was previously only enabled in OpenCL C. Now it's also
available in C. Not that it has different behaviors than GNU vectors
(i.e. __vector_size__). Extension vectors selects on signdness of the
vector. GNU vectors on the other hand do normal bool conversions. Also,
this feature is not available in C++.
Differential Revision: https://reviews.llvm.org/D80574
test cases
Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff
Differential Revision: https://reviews.llvm.org/D72841
This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.
Summary:
When using -ftrivial-auto-var-init=* options to initiate automatic
variables in a file, to disable initialization on some variables,
currently we have to manually annotate the variables with uninitialized
attribute, such as
int dont_initialize_me __attribute((uninitialized));
Making pragma clang attribute to support this attribute would make
annotating variables much easier, and could be particular useful for
bisection efforts, e.g.
void use(void*);
void buggy() {
int arr[256];
int boom;
float bam;
struct { int oops; } oops;
union { int oof; float aaaaa; } oof;
use(&arr);
use(&boom);
use(&bam);
use(&oops);
use(&oof);
}
Reviewers: jfb, rjmccall, aaron.ballman
Reviewed By: jfb, aaron.ballman
Subscribers: aaron.ballman, george.burgess.iv, dexonsmith, MaskRay, phosek, hubert.reinterpretcast, gbiv, manojgupta, llozano, srhines, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78693
This reverts commit 61ba1481e2.
I'm reverting this because it breaks the lldb build with
incomplete switch coverage warnings. I would fix it forward,
but am not familiar enough with lldb to determine the correct
fix.
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.
However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns. Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.
We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch. We are
proposing the syntax _ExtInt(N). We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance. An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14. We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).
[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf
Differential Revision: https://reviews.llvm.org/D73967
LanguageExtensions.rst:2191: WARNING: Title underline too short.
llvm-symbolizer.rst:157: Error in "code-block" directive: maximum 1 argument(s) allowed, 30 supplied.
memchr consistent and comprehensible, and document them.
We previously allowed evaluation of memcmp on arrays of integers of any
size, so long as the call evaluated to 0, and allowed evaluation of
memchr on any array of integral type of size 1 (including enums). The
purpose of constant-evaluating these builtins is only to support
constexpr std::char_traits, so we now consistently allow them on arrays
of (possibly signed or unsigned) char only.
In order to support non-user-named kernels, SYCL needs some way in the
integration headers to name the kernel object themselves. Initially, the
design considered just RTTI naming of the lambdas, this results in a
quite unstable situation in light of some device/host macros.
Additionally, this ends up needing to use RTTI, which is a burden on the
implementation and typically unsupported.
Instead, we've introduced a builtin, __builtin_unique_stable_name, which
takes a type or expression, and results in a constexpr constant
character array that uniquely represents the type (or type of the
expression) being passed to it.
The implementation accomplishes that simply by using a slightly modified
version of the Itanium Mangling. The one exception is when mangling
lambdas, instead of appending the index of the lambda in the function,
it appends the macro-expansion back-trace of the lambda itself in the
form LINE->COL[~LINE->COL...].
Differential Revision: https://reviews.llvm.org/D76620
There are a few places with unexpected indents that trip over sphinx and
other syntax errors.
Also, the C++ syntax highlighting does not work for
class [[gsl::Owner(int)]] IntOwner {
Use a regular code:: block instead.
There are a few other warnings errors remaining, of the form
'Duplicate explicit target name: "cmdoption-clang--prefix"'. They seem
to be caused by the following
.. option:: -B<dir>, --prefix <arg>, --prefix=<arg>
I am no Restructured Text expert, but it seems like sphinx 1.8.5
tries to generate the same target for the --prefix <arg> and
--prefix=<arg>. This pops up in a lot of places and I am not sure how to
best resolve it
Reviewers: jfb, Bigcheese, dexonsmith, rjmccall
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D76534
Summary:
Clang's "asm goto" feature didn't initially support outputs constraints. That
was the same behavior as gcc's implementation. The decision by gcc not to
support outputs was based on a restriction in their IR regarding terminators.
LLVM doesn't restrict terminators from returning values (e.g. 'invoke'), so
it made sense to support this feature.
Output values are valid only on the 'fallthrough' path. If an output value's used
on an indirect branch, then it's 'poisoned'.
In theory, outputs *could* be valid on the 'indirect' paths, but it's very
difficult to guarantee that the original semantics would be retained. E.g.
because indirect labels could be used as data, we wouldn't be able to split
critical edges in situations where two 'callbr' instructions have the same
indirect label, because the indirect branch's destination would no longer be
the same.
Reviewers: jyknight, nickdesaulniers, hfinkel
Reviewed By: jyknight, nickdesaulniers
Subscribers: MaskRay, rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69876
user interface and documentation, and update __cplusplus for C++20.
WG21 considers the C++20 standard to be finished (even though it still
has some more steps to pass through in the ISO process).
The old flag names are accepted for compatibility, as usual, and we
still have lots of references to C++2a in comments and identifiers;
those can be cleaned up separately.