The DAZ feature introduces the denormal zero support for x86.
Currently the definitions are located under SSE3 header, however there are some SSE2 targets that support the feature as well.
Differential Revision: https://reviews.llvm.org/D30194
llvm-svn: 296296
Note: The doxygen comments are automatically generated based on Sony's intrinsic
s document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 295404
Removed ndrange_t as Clang builtin type and added
as a struct type in the OpenCL header.
Use type name to do the Sema checking in enqueue_kernel
and modify IR generation accordingly.
Review: D28058
Patch by Dmitry Borisenkov!
llvm-svn: 295311
__fastfail terminates the process immediately with a special system
call. It does not run any process shutdown code or exception recovery
logic.
Fixes PR31854
llvm-svn: 294606
1. Adds the command line flag for clzero.
2. Includes the clzero flag under znver1.
3. Defines the macro for clzero.
4. Adds a new file which has the intrinsic definition for clzero instruction.
Patch by Ganesh Gopalasubramanian with some additional tests from me.
Differential revision: https://reviews.llvm.org/D29386
llvm-svn: 294559
Added doxygen comments to prfchwintrin.h's intrinsics.
Note: The doxygen comments are automatically generated based on Sony's intrinsic
s document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 293745
Prior to OpenCL 2.0, image3d_t can only be used with the write_only
access qualifier when the cl_khr_3d_image_writes extension is enabled,
see e.g. OpenCL 1.1 s6.8b.
Require the extension for write_only image3d_t types and guard uses of
write_only image3d_t in the OpenCL header.
Patch by Sven van Haastregt!
Review: https://reviews.llvm.org/D28860
llvm-svn: 293050
For a << b (as original vec_sl does), if b >= sizeof(a) * 8, the
behavior is undefined. However, Power instructions do define the
behavior, which is equivalent to a << (b % (sizeof(a) * 8)).
This patch changes altivec.h to use a << (b % (sizeof(a) * 8)), to
ensure the consistent semantic of the instructions. Then it combines
the generated multiple instructions back to a single shift.
This patch handles left shift only. Right shift, on the other hand, is
more complicated, considering arithematic/logical right shift.
Differential Revision: https://reviews.llvm.org/D28037
llvm-svn: 292659
Added doxygen comments for the newly added intrinsics in avxintrin.h, namely _mm256_cvtsd_f64, _mm256_cvtsi256_si32 and _mm256_cvtss_f32
Added doxygen comments for the new intrinsics in emmintrin.h, namely _mm_loadu_si64 and _mm_load_sd.
Explicit parameter names were added for _mm_clflush and _mm_setcsr
The rest of the changes are editorial, removing trailing spaces at the end of the lines.
Differential Revision: https://reviews.llvm.org/D28503
llvm-svn: 291876
Add builtins for the functions and custom codegen mapping the builtins to their
corresponding intrinsics and handling the endian related swapping.
https://reviews.llvm.org/D26546
llvm-svn: 291179
Summary:
MSVC seems to use "__in" and "__out" for its own purposes, so we have to
pick different names in this macro.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28325
llvm-svn: 291138
Summary:
These duplicate declarations cause a problem for CUDA compiles on
Windows. All implicitly-defined functions are host+device, and this
applies to the declarations in Builtin.def. But then when we see the
declarations in intrin.h, they have no attributes, so are host-only
functions. This is an error.
(A better fix might be to make these builtins host-only, but that is a
much bigger change.)
Reviewers: rnk
Subscribers: cfe-commits, echristo
Differential Revision: https://reviews.llvm.org/D28317
llvm-svn: 291128
CUDA-8.0 comes with new headers which nvcc pre-includes via cuda_runtime.h
Clang now makes them available as well.
Differential Revision: https://reviews.llvm.org/D28301
llvm-svn: 290982
Added \n commands to insert a line breaks where necessary, since one long line of documentation is nearly unreadable.
Formatted comments to fit into 80 chars.
In some cases added \a command in front of the parameter names to display them in italics.
llvm-svn: 290619
Improved doxygen comments for the following intrinsics headers: __wmmintrin_pclmul.h, bmiintrin.h, emmintrin.h, f16cintrin.h, immintrin.h, mmintrin.h, pmmintrin.h, tmmintrin.h
Added \n commands to insert a line breaks where necessary, since one long line of documentation is nearly unreadable.
Formatted comments to fit into 80 chars.
In some cases added \a command in front of the parameter names to display them in italics.
llvm-svn: 290561
According to extended asm syntax, a case where the clobber list includes a variable from the inputs or outputs should be an error - conflict.
for example:
const long double a = 0.0;
int main()
{
char b;
double t1 = a;
__asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)");
return 0;
}
This should conflict with the output - t1 which is st, and st which is st aswell.
The patch fixes it.
Commit on behald of Ziv Izhar.
Differential Revision: https://reviews.llvm.org/D15075
llvm-svn: 290539
Added \n commands to insert a line breaks where necessary to make the documentation more readable.
Formatted comments to fit into 80 chars.
llvm-svn: 290458
Tagged parameter names with \a doxygen command to display parameters in italics.
Added \n commands to insert a line break to make the documentation more readable.
Formatted comments to fit into 80 chars.
llvm-svn: 290455
Added a map to associate types and declarations with extensions.
Refactored existing diagnostic for disabled types associated with extensions and extended it to declarations for generic situation.
Fixed some bugs for types associated with extensions.
Allow users to use pragma to declare types and functions for supported extensions, e.g.
#pragma OPENCL EXTENSION the_new_extension_name : begin
// declare types and functions associated with the extension here
#pragma OPENCL EXTENSION the_new_extension_name : end
Differential Revision: https://reviews.llvm.org/D21698
llvm-svn: 289979
Reverts r289181: it's currently breaking modules using simd.h in
10.12 SDK.
This reverts commit 6e73e3464e96a4e00492c24aa790d36e1adb5702.
llvm-svn: 289487
This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to working about handling masking.
llvm-svn: 289351
This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to working about handling masking.
llvm-svn: 289345
Tagged instruction names with <c> INSTR_NAME </c> to display them in typewriter font.
In the past, \c command was used, unfortunately it applied to only one word.
<c> .. </c> has the same meaning, but applies to all words in between the tags.
llvm-svn: 289249
Allow darwin to provide additional definitions and implementation
specifc values for tgmath.h on Apple platforms.
rdar://problem/19019845
llvm-svn: 289181
Improved doxygen comments for fxsrintrin.h and mmintrin.h intrinsics by taagging parameter names with \a doxygen command to display parameters in italics.
Formatted comments to fit into 80 chars.
llvm-svn: 289154
Improved doxygen comments for __wmmintrin_pclmul.h and ammintrin.h intrinsics by taagging parameter names with \a doxygen command to display parameters in italics.
Formatted comments to fit into 80 chars.
llvm-svn: 289083
Documentation for some of the avxintrin.h's intrinsics errorneously said that
non VEX-prefixed instructions could be generated. This was fixed.
I tried several different solutions to achieve pretty printing of unordered lists (nested and non-nested) in param sections in doxygen.
llvm-svn: 287990
(commit again after fixing the buildbot failures)
This adds various overloads of the following builtins to altivec.h:
vec_neg
vec_nabs
vec_adde
vec_addec
vec_sube
vec_subec
vec_subc
Note that for vec_sub builtins on 32 bit integers, the semantics is similar to
what ISA describes for instructions like vsubecuq that work on quadwords: the
first operand is added to the one's complement of the second operand. (As
opposed to two's complement which I expected).
llvm-svn: 287872
(commit again after fixing the buildbot failures)
This adds various overloads of the following builtins to altivec.h:
vec_neg
vec_nabs
vec_adde
vec_addec
vec_sube
vec_subec
vec_subc
Note that for vec_sub builtins on 32 bit integers, the semantics is similar to
what ISA describes for instructions like vsubecuq that work on quadwords: the
first operand is added to the one's complement of the second operand. (As
opposed to two's complement which I expected).
llvm-svn: 287795
This adds various overloads of the following builtins to altivec.h:
vec_neg
vec_nabs
vec_adde
vec_addec
vec_sube
vec_subec
vec_subc
Note that for vec_sub builtins on 32 bit integers, the semantics is similar to
what ISA describes for instructions like vsubecuq that work on quadwords: the
first operand is added to the one's complement of the second operand. (As
opposed to two's complement which I expected).
llvm-svn: 287772
The doxygen comments are automatically generated based on Sony's intrinsics docu
ment.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Charles Li.
llvm-svn: 287483
Added doxygen comments to avxintrin.h's intrinsics. As of now, all the intrinsics in this file that were documented by Sony's intrinsics guide should have corresponding doxygen comments.
Note: The doxygen comments are automatically generated based on Sony's intrinsic
s document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
Reviewed by Wolfgang Pieb.
llvm-svn: 287436
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Charles Li.
llvm-svn: 287317
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Paul Robinson and Charles Li.
llvm-svn: 287295
I made several changes for consistency with the rest of x86 instrinsics header files. Some of these changes help to render doxygen comments better.
1. avxintrin.h – Moved the opening bracket on a separate line for several
intrinsics (for consistency with the rest of the intrinsics).
2. emmintrin.h - Moved the doxygen comment next to the body of the function;
- Added braces after extern "C" even though there is only
one declaration each time
3. xmmintrin.h - Moved the doxygen comment next to the body of the function;
- Added intrinsic prototypes for a couple of macro definitions
into the doxygen comment;
- Added braces after extern "C" even though there is only one
declaration each time
4. ammintrin.h – Removed extra line between the doxygen comment and the body
of the functions (for consistency with the rest of the files).
Desk reviewed by Paul Robinson.
llvm-svn: 287278
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.
This patch removes the clang builtins and their use in the headers - a future patch will deal with removing the llvm intrinsics.
This is an extension patch to D20528 which dealt with the equivalent sse/avx cases.
Differential Revision: https://reviews.llvm.org/D26686
llvm-svn: 287088
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behaves exactly the same with vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after call the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.
llvm-svn: 286971
Adds 2 vector functions for converting from a vector of unsigned short to a
vector of float. One converts the low 4 halfwords and one converts the high
4 halfwords.
Differential Revision: https://reviews.llvm.org/D26534
llvm-svn: 286863
Add vector extract exponent/significand functions to altivec.h, as well as
functions (and related constants) to test the data class of vector float
and vector double.
Differential Revision: https://reviews.llvm.org/D26271
llvm-svn: 286830
This is part of a set of changes to allow InstCombine in the backend to optimize variable shifts without having to know about masking.
llvm-svn: 286757
Summary: Inverting the mask argument does not reflect the intended semantics of the intrinsic.
Reviewers: igorb, delena
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D26019
llvm-svn: 286733
Added doxygen comments to avxintrin.h's intrinsics. As of now, around 75% of the
intrinsics in this file are documented here. The patches for the other 25% will be se
nt out later.
Removed extra spaces in emmitrin.h.
Note: The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 286336
Certain OpenCL builtin functions are supposed to be executed by all threads in a work group or sub group. Such functions should not be made divergent during transformation. It makes sense to mark them with convergent attribute.
The adding of convergent attribute is based on Ettore Speziale's work and the original proposal and patch can be found at https://www.mail-archive.com/cfe-commits@lists.llvm.org/msg22271.html.
Differential Revision: https://reviews.llvm.org/D25343
llvm-svn: 285725
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.
llvm-svn: 285667
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.
llvm-svn: 285540
After LGTM and Check-all
Vector-reduction arithmetic accepts vectors as inputs and produces
scalars as outputs.This class of vector operation forms the basis
of many scientific computations. In vector-reduction arithmetic,
the evaluation off is independent of the order of the input elements of V.
Reviewer: 1. craig.topper
2. igorb
Differential Revision: https://reviews.llvm.org/D25988
llvm-svn: 285493
OpenCL disallows using variadic arguments (s6.9.e and s6.12.5 OpenCL v2.0)
apart from some exceptions:
- printf
- enqueue_kernel
This change adds error diagnostic for variadic functions but accepts printf
and any compiler internal function (which should cover __enqueue_kernel_XXX cases).
It also unifies diagnostic with block prototype and adds missing uncaught cases for blocks.
llvm-svn: 285395
Previously, these were always included -- after this change, you have to
#include <new>, which is consistent with how things ought to work.
llvm-svn: 285251