size_t is usually defined as unsigned long, but on 64-bit platforms,
stdint.h currently defines SIZE_MAX using "ull" (unsigned long long).
Although this is the same width, it doesn't necessarily have the same
alignment or calling convention. It also triggers printf warnings when
using the "%zu" format specifier to print SIZE_MAX.
This changes SIZE_MAX to reuse the compiler-provided __SIZE_MAX__, and
provides similar fixes for the other integers:
- INTPTR_MIN
- INTPTR_MAX
- UINTPTR_MAX
- PTRDIFF_MIN
- PTRDIFF_MAX
- INTMAX_MIN
- INTMAX_MAX
- UINTMAX_MAX
- INTMAX_C()
- UINTMAX_C()
... and fixes the typedefs for intptr_t and uintptr_t to use
__INTPTR_TYPE__ and __UINTPTR_TYPE__ instead of int32_t, effectively
reverting r89224, r89226, and r89237 (r89221 already having been
effectively reverted).
We can probably also kill __INTPTR_WIDTH__, __INTMAX_WIDTH__, and
__UINTMAX_WIDTH__ in a follow-up, but I was hesitant to delete all the
per-target CHECK lines in this commit since those might serve their own
purpose.
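The SIZE_MAX type mismatch described above can be checked with a small program. This is a minimal sketch, assuming a C11 compiler; the helper names size_max_has_size_t_type and format_size_max are hypothetical and not part of stdint.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if SIZE_MAX has exactly the type size_t,
 * 0 if the macro expands to some other type of the same width. */
static int size_max_has_size_t_type(void) {
    return _Generic(SIZE_MAX, size_t: 1, default: 0);
}

/* Hypothetical helper: formats SIZE_MAX with "%zu". If SIZE_MAX is not
 * size_t-typed, a compiler with format checking may warn on this call. */
static int format_size_max(char *buf, size_t len) {
    assert(buf != NULL);
    return snprintf(buf, len, "%zu", SIZE_MAX);
}
```

With a header that derives SIZE_MAX from the compiler-provided __SIZE_MAX__, the first helper is expected to return 1 and the second to compile without a format warning.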
rdar://problem/11811377
llvm-svn: 301593
Summary: This patch makes the header `stdatomic.h` work when `-fms-compatibility` is specified.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D32322
llvm-svn: 300919
- To be consistent with the rest of the intrinsics headers, I removed the <i> .. </i> tags used to mark instruction names in italics in smmintrin.h.
- Formatting changes to fit into 80 characters.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 300578
MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.
LLVM companion patch: D31767.
Differential Revision: https://reviews.llvm.org/D31766
llvm-svn: 300326
It's used by MS headers in VS 2017 without including intrin.h, so we
can't implement it in the header anymore.
Differential Revision: https://reviews.llvm.org/D31736
llvm-svn: 299782
It seems MS headers have started using __readgsqword, and since it's
used in a header that doesn't include intrin.h, we can't implement it as
an inline function anymore.
That was already the case for __readfsdword, which Saleem added support
for in r220859. This patch reuses that codegen to implement all of
__read[fg]s{byte,word,dword,qword}.
Differential Revision: https://reviews.llvm.org/D31248
llvm-svn: 298538
I made some small changes in smmintrin.h and emmintrin.h intrinsics.
- changed some regular comments '//' into doxygen-style comments '///' where necessary
- removed some trailing spaces in doxygen comments.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 298371
- Fix a variable naming mismatch
- Fix pointer arithmetic on void * (a GCC extension) by casting to char *.
- Test that the header (and htmintrin.h) parse.
llvm-svn: 298318
Reapply r289181 but rename the include guard to avoid
conflict with the one from Darwin.
Allow darwin to provide additional definitions and implementation
specific values for tgmath.h on Apple platforms.
rdar://problem/19019845
llvm-svn: 298013
The DAZ feature provides denormals-are-zero support on x86.
Currently the definitions are located in the SSE3 header; however, some SSE2 targets support the feature as well.
Differential Revision: https://reviews.llvm.org/D30194
llvm-svn: 296296
Note: The doxygen comments are automatically generated based on Sony's intrinsics
document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 295404
Removed ndrange_t as a Clang builtin type and added it
as a struct type in the OpenCL header.
Use type name to do the Sema checking in enqueue_kernel
and modify IR generation accordingly.
Review: D28058
Patch by Dmitry Borisenkov!
llvm-svn: 295311
__fastfail terminates the process immediately with a special system
call. It does not run any process shutdown code or exception recovery
logic.
Fixes PR31854
llvm-svn: 294606
1. Adds the command line flag for clzero.
2. Includes the clzero flag under znver1.
3. Defines the macro for clzero.
4. Adds a new file which has the intrinsic definition for clzero instruction.
Patch by Ganesh Gopalasubramanian with some additional tests from me.
Differential revision: https://reviews.llvm.org/D29386
llvm-svn: 294559
Added doxygen comments to prfchwintrin.h's intrinsics.
Note: The doxygen comments are automatically generated based on Sony's intrinsics
document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 293745
Prior to OpenCL 2.0, image3d_t can only be used with the write_only
access qualifier when the cl_khr_3d_image_writes extension is enabled,
see e.g. OpenCL 1.1 s6.8b.
Require the extension for write_only image3d_t types and guard uses of
write_only image3d_t in the OpenCL header.
Patch by Sven van Haastregt!
Review: https://reviews.llvm.org/D28860
llvm-svn: 293050
For a << b (as original vec_sl does), if b >= sizeof(a) * 8, the
behavior is undefined. However, Power instructions do define the
behavior, which is equivalent to a << (b % (sizeof(a) * 8)).
This patch changes altivec.h to use a << (b % (sizeof(a) * 8)), so that
the semantics are consistent with the instructions. It then combines
the multiple generated instructions back into a single shift.
This patch handles left shifts only. Right shifts are more complicated,
because both arithmetic and logical right shifts must be considered.
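The modulo form of the shift can be illustrated on a single scalar element. This is a minimal sketch, not the altivec.h implementation: shl_mod is a hypothetical name, and a uint32_t stands in for one vector element.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical scalar model of one element of the shift: the count is
 * reduced modulo the bit width first, so the C shift is always defined. */
static uint32_t shl_mod(uint32_t a, unsigned b) {
    assert(sizeof(a) * CHAR_BIT == 32);
    return a << (b % (sizeof(a) * CHAR_BIT));
}
```

A count of 32 therefore behaves like a count of 0, and a count of 33 like a count of 1, instead of being undefined.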
Differential Revision: https://reviews.llvm.org/D28037
llvm-svn: 292659
Added doxygen comments for the newly added intrinsics in avxintrin.h, namely _mm256_cvtsd_f64, _mm256_cvtsi256_si32 and _mm256_cvtss_f32
Added doxygen comments for the new intrinsics in emmintrin.h, namely _mm_loadu_si64 and _mm_load_sd.
Explicit parameter names were added for _mm_clflush and _mm_setcsr
The rest of the changes are editorial, removing trailing spaces at the end of the lines.
Differential Revision: https://reviews.llvm.org/D28503
llvm-svn: 291876
Add builtins for the functions and custom codegen mapping the builtins to their
corresponding intrinsics and handling the endian related swapping.
https://reviews.llvm.org/D26546
llvm-svn: 291179
Summary:
MSVC seems to use "__in" and "__out" for its own purposes, so we have to
pick different names in this macro.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28325
llvm-svn: 291138
Summary:
These duplicate declarations cause a problem for CUDA compiles on
Windows. All implicitly-defined functions are host+device, and this
applies to the declarations in Builtin.def. But then when we see the
declarations in intrin.h, they have no attributes, so are host-only
functions. This is an error.
(A better fix might be to make these builtins host-only, but that is a
much bigger change.)
Reviewers: rnk
Subscribers: cfe-commits, echristo
Differential Revision: https://reviews.llvm.org/D28317
llvm-svn: 291128
CUDA-8.0 comes with new headers which nvcc pre-includes via cuda_runtime.h
Clang now makes them available as well.
Differential Revision: https://reviews.llvm.org/D28301
llvm-svn: 290982
Added \n commands to insert line breaks where necessary, since one long line of documentation is nearly unreadable.
Formatted comments to fit into 80 chars.
In some cases added \a command in front of the parameter names to display them in italics.
llvm-svn: 290619
Improved doxygen comments for the following intrinsics headers: __wmmintrin_pclmul.h, bmiintrin.h, emmintrin.h, f16cintrin.h, immintrin.h, mmintrin.h, pmmintrin.h, tmmintrin.h
Added \n commands to insert line breaks where necessary, since one long line of documentation is nearly unreadable.
Formatted comments to fit into 80 chars.
In some cases added \a command in front of the parameter names to display them in italics.
llvm-svn: 290561
According to the extended asm syntax, it is an error (a conflict) for the clobber list to include a register that is also used by an input or output operand.
For example:
  const long double a = 0.0;
  int main()
  {
    char b;
    double t1 = a;
    __asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)");
    return 0;
  }
Here the clobber "st" should conflict with the operand t1, which is also placed in st (constraint "t").
The patch fixes it.
Commit on behalf of Ziv Izhar.
Differential Revision: https://reviews.llvm.org/D15075
llvm-svn: 290539
Added \n commands to insert line breaks where necessary to make the documentation more readable.
Formatted comments to fit into 80 chars.
llvm-svn: 290458
Tagged parameter names with \a doxygen command to display parameters in italics.
Added \n commands to insert a line break to make the documentation more readable.
Formatted comments to fit into 80 chars.
llvm-svn: 290455
Added a map to associate types and declarations with extensions.
Refactored the existing diagnostic for disabled types associated with extensions and extended it to declarations, to cover the general case.
Fixed some bugs for types associated with extensions.
Allow users to use pragma to declare types and functions for supported extensions, e.g.
#pragma OPENCL EXTENSION the_new_extension_name : begin
// declare types and functions associated with the extension here
#pragma OPENCL EXTENSION the_new_extension_name : end
Differential Revision: https://reviews.llvm.org/D21698
llvm-svn: 289979
Reverts r289181: it's currently breaking modules using simd.h in
10.12 SDK.
This reverts commit 6e73e3464e96a4e00492c24aa790d36e1adb5702.
llvm-svn: 289487
This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to worry about handling masking.
llvm-svn: 289351
This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to worry about handling masking.
llvm-svn: 289345
Tagged instruction names with <c> INSTR_NAME </c> to display them in typewriter font.
Previously the \c command was used; unfortunately, it applies to only one word.
<c> .. </c> has the same meaning, but applies to all words in between the tags.
llvm-svn: 289249
Allow darwin to provide additional definitions and implementation
specific values for tgmath.h on Apple platforms.
rdar://problem/19019845
llvm-svn: 289181
Improved doxygen comments for fxsrintrin.h and mmintrin.h intrinsics by tagging parameter names with \a doxygen command to display parameters in italics.
Formatted comments to fit into 80 chars.
llvm-svn: 289154
Improved doxygen comments for __wmmintrin_pclmul.h and ammintrin.h intrinsics by tagging parameter names with \a doxygen command to display parameters in italics.
Formatted comments to fit into 80 chars.
llvm-svn: 289083
Documentation for some of the avxintrin.h's intrinsics erroneously said that
non VEX-prefixed instructions could be generated. This was fixed.
I tried several different solutions to achieve pretty printing of unordered lists (nested and non-nested) in param sections in doxygen.
llvm-svn: 287990
(commit again after fixing the buildbot failures)
This adds various overloads of the following builtins to altivec.h:
vec_neg
vec_nabs
vec_adde
vec_addec
vec_sube
vec_subec
vec_subc
Note that for vec_sub builtins on 32 bit integers, the semantics is similar to
what ISA describes for instructions like vsubecuq that work on quadwords: the
first operand is added to the one's complement of the second operand. (As
opposed to two's complement which I expected).
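The one's-complement formulation in the note above can be modeled on a single 32-bit element. This is a minimal sketch under stated assumptions: sube_model is a hypothetical name, and using only the low bit of the carry-in is an assumption made for illustration, not something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a subtract-extended step: the first operand
 * is added to the one's complement of the second, plus a carry-in.
 * Assumption: only the low bit of carry_in is used. */
static uint32_t sube_model(uint32_t a, uint32_t b, uint32_t carry_in) {
    return a + ~b + (carry_in & 1u);
}
```

With a carry-in of 1 the result equals the ordinary two's-complement difference a - b; with a carry-in of 0 it is one less.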
llvm-svn: 287872
(commit again after fixing the buildbot failures)
This adds various overloads of the following builtins to altivec.h:
vec_neg
vec_nabs
vec_adde
vec_addec
vec_sube
vec_subec
vec_subc
Note that for vec_sub builtins on 32 bit integers, the semantics is similar to
what ISA describes for instructions like vsubecuq that work on quadwords: the
first operand is added to the one's complement of the second operand. (As
opposed to two's complement which I expected).
llvm-svn: 287795
This adds various overloads of the following builtins to altivec.h:
vec_neg
vec_nabs
vec_adde
vec_addec
vec_sube
vec_subec
vec_subc
Note that for vec_sub builtins on 32 bit integers, the semantics is similar to
what ISA describes for instructions like vsubecuq that work on quadwords: the
first operand is added to the one's complement of the second operand. (As
opposed to two's complement which I expected).
llvm-svn: 287772
The doxygen comments are automatically generated based on Sony's intrinsics
document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Charles Li.
llvm-svn: 287483
Added doxygen comments to avxintrin.h's intrinsics. As of now, all the intrinsics in this file that were documented by Sony's intrinsics guide should have corresponding doxygen comments.
Note: The doxygen comments are automatically generated based on Sony's intrinsics
document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
Reviewed by Wolfgang Pieb.
llvm-svn: 287436
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Charles Li.
llvm-svn: 287317
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Paul Robinson and Charles Li.
llvm-svn: 287295
I made several changes for consistency with the rest of x86 intrinsics header files. Some of these changes help to render doxygen comments better.
1. avxintrin.h - Moved the opening bracket on a separate line for several
                 intrinsics (for consistency with the rest of the intrinsics).
2. emmintrin.h - Moved the doxygen comment next to the body of the function;
               - Added braces after extern "C" even though there is only
                 one declaration each time
3. xmmintrin.h - Moved the doxygen comment next to the body of the function;
               - Added intrinsic prototypes for a couple of macro definitions
                 into the doxygen comment;
               - Added braces after extern "C" even though there is only one
                 declaration each time
4. ammintrin.h - Removed extra line between the doxygen comment and the body
                 of the functions (for consistency with the rest of the files).
Desk reviewed by Paul Robinson.
llvm-svn: 287278
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.
This patch removes the clang builtins and their use in the headers - a future patch will deal with removing the llvm intrinsics.
This is an extension patch to D20528 which dealt with the equivalent sse/avx cases.
Differential Revision: https://reviews.llvm.org/D26686
llvm-svn: 287088
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behave exactly the same as vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store (stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after calling the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.
llvm-svn: 286971
Adds 2 vector functions for converting from a vector of unsigned short to a
vector of float. One converts the low 4 halfwords and one converts the high
4 halfwords.
Differential Revision: https://reviews.llvm.org/D26534
llvm-svn: 286863
Add vector extract exponent/significand functions to altivec.h, as well as
functions (and related constants) to test the data class of vector float
and vector double.
Differential Revision: https://reviews.llvm.org/D26271
llvm-svn: 286830
This is part of a set of changes to allow InstCombine in the backend to optimize variable shifts without having to know about masking.
llvm-svn: 286757
Summary: Inverting the mask argument does not reflect the intended semantics of the intrinsic.
Reviewers: igorb, delena
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D26019
llvm-svn: 286733
Added doxygen comments to avxintrin.h's intrinsics. As of now, around 75% of the
intrinsics in this file are documented here. The patches for the other 25% will be
sent out later.
Removed extra spaces in emmintrin.h.
Note: The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 286336
Certain OpenCL builtin functions are supposed to be executed by all threads in a work group or sub group. Such functions should not be made divergent during transformation. It makes sense to mark them with the convergent attribute.
The addition of the convergent attribute is based on Ettore Speziale's work; the original proposal and patch can be found at https://www.mail-archive.com/cfe-commits@lists.llvm.org/msg22271.html.
Differential Revision: https://reviews.llvm.org/D25343
llvm-svn: 285725
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.
llvm-svn: 285667
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.
llvm-svn: 285540
After LGTM and Check-all
Vector-reduction arithmetic accepts vectors as inputs and produces
scalars as outputs. This class of vector operation forms the basis
of many scientific computations. In vector-reduction arithmetic,
the evaluation is independent of the order of the input elements of V.
Reviewer: 1. craig.topper
2. igorb
Differential Revision: https://reviews.llvm.org/D25988
llvm-svn: 285493
OpenCL disallows using variadic arguments (s6.9.e and s6.12.5 OpenCL v2.0)
apart from some exceptions:
- printf
- enqueue_kernel
This change adds error diagnostic for variadic functions but accepts printf
and any compiler internal function (which should cover __enqueue_kernel_XXX cases).
It also unifies diagnostic with block prototype and adds missing uncaught cases for blocks.
llvm-svn: 285395
Previously, these were always included -- after this change, you have to
#include <new>, which is consistent with how things ought to work.
llvm-svn: 285251
Summary: The preserved input should be the first argument and the vector inputs should be in the same order as the intrinsics it is used to implement.
Reviewers: igorb, delena
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D25902
llvm-svn: 285175
Committed after LGTM and check-all
Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs.
This class of vector operation forms the basis of many scientific computations.
In vector-reduction arithmetic, the evaluation is independent of the order of the input elements of V.
Used bisection method. At each step, we partition the vector with previous
step in half, and the operation is performed on its two halves.
This takes log2(n) steps where n is the number of elements in the vector.
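The bisection scheme described above can be sketched on a plain array. This is a minimal illustration rather than the header's implementation: reduce_add_bisect is a hypothetical name, addition is chosen as the example operation, and the element count is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: in-place halving reduction with addition.
 * Each pass folds the upper half of the live range onto the lower half,
 * so n elements are reduced in log2(n) passes. n must be a power of two. */
static int reduce_add_bisect(int *v, size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t half = n / 2; half >= 1; half /= 2) {
        for (size_t i = 0; i < half; ++i)
            v[i] += v[i + half];
    }
    return v[0];
}
```

For eight elements the outer loop runs three times, with half equal to 4, 2, and 1.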
Reviewer: 1. igorb
2. craig.topper
Differential Revision: https://reviews.llvm.org/D25527
llvm-svn: 285054
Committed after LGTM and check-all
Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs.
This class of vector operation forms the basis of many scientific computations.
In vector-reduction arithmetic, the evaluation is independent of the order of the input elements of V.
Used bisection method. At each step, we partition the vector with previous
step in half, and the operation is performed on its two halves.
This takes log2(n) steps where n is the number of elements in the vector.
Differential Revision: https://reviews.llvm.org/D25527
llvm-svn: 284963
With this patch, all intrinsics in this file (with the exception of a handful of recently added ones) will be documented. I will send out a patch for the 4 missing intrinsics later.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Yunzhong Gao.
llvm-svn: 284934
With this patch, 75% of the intrinsics in this file will be documented. The patches for the rest of the intrinsics in this file will be sent out later.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao.
llvm-svn: 284754
Summary: We need `__stosb` to be an intrinsic, because SecureZeroMemory function uses it without including intrin.h. Implementing it as a volatile memset is not consistent with MSDN specification, but it gives us target-independent IR while keeping the most important properties of `__stosb`.
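The idea of a volatile byte fill can be sketched in portable C. This is a minimal model under stated assumptions, not Clang's codegen for the builtin: stosb_model is a hypothetical name, and the volatile-qualified pointer stands in for the volatile memset mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: writes count copies of value to dst through a
 * volatile pointer, so the stores may not be optimized away even if the
 * buffer is never read again. */
static void stosb_model(unsigned char *dst, unsigned char value, size_t count) {
    assert(dst != NULL || count == 0);
    volatile unsigned char *p = dst;
    while (count--)
        *p++ = value;
}
```

This is the property that matters for a SecureZeroMemory-style caller: the zeroing stores survive optimization.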
Reviewers: rnk, hans, thakis, majnemer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D25334
llvm-svn: 284253
Summary: Previously global 64-bit versions of _Interlocked functions broke buildbots on i386, so now I'm adding them as builtins for x86-64 and ARM only (should they be also on AArch64? I had problems with testing it for AArch64, so I left it)
Reviewers: hans, majnemer, mstorsjo, rnk
Subscribers: cfe-commits, aemerson
Differential Revision: https://reviews.llvm.org/D25576
llvm-svn: 284172
Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing a separate function for each builtin.
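The target-independent behaviour of a forward bit scan can be modeled in portable C. This is a minimal sketch, not the builtin's implementation: bit_scan_forward_model is a hypothetical name, and the loop is used only to avoid compiler-specific builtins.

```c
#include <assert.h>

/* Hypothetical portable model of a forward bit scan: stores the index of
 * the lowest set bit of mask in *index and returns 1, or returns 0 and
 * leaves *index untouched when mask is 0. */
static unsigned char bit_scan_forward_model(unsigned long *index,
                                            unsigned long mask) {
    assert(index != 0);
    if (mask == 0)
        return 0;
    unsigned long i = 0;
    while ((mask & 1ul) == 0) {
        mask >>= 1;
        ++i;
    }
    *index = i;
    return 1;
}
```

Because nothing in the model depends on the architecture, the same definition serves x86 and ARM alike, which is the point of isolating these intrinsics.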
Reviewers: hans, thakis, rnk, majnemer
Subscribers: RKSimon, cfe-commits, aemerson
Differential Revision: https://reviews.llvm.org/D25264
llvm-svn: 284060
These were reverted in r283753 and r283747.
The first patch added a header to the root 'Headers' install directory,
instead of into 'Headers/cuda_wrappers'. This was fixed in the second
patch, but by then the damage was done: The bad header stayed in the
'Headers' directory, continuing to break the build.
We reverted both patches in an attempt to fix things, but that still
didn't get rid of the header, so the Windows bootstrap build remained
broken.
It's probably worth fixing up our cmake logic to remove things from the
install dirs, but in the meantime, re-land these patches, since we
believe they no longer have this bug.
llvm-svn: 283907