Commit Graph

1159 Commits

Author SHA1 Message Date
Zaara Syeda 56fa12c5a3 test commmit
llvm-svn: 286977
2016-11-15 15:57:33 +00:00
Tony Jiang 6a49aad177 [PowerPC] Implement BE VSX load/store builtins - clang portion.
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behaves exactly the same with vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after call the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.

llvm-svn: 286971
2016-11-15 14:30:56 +00:00
Sean Fertile a9548937d6 [PPC] altivec.h functions for converting half precision to single precision.
Adds 2 vector functions for converting from a vector of unsigned short to a
vector of float. One converts the low 4 halfwords and one converts the high
4 halfwords.

Differential Revision: https://reviews.llvm.org/D26534

llvm-svn: 286863
2016-11-14 18:47:15 +00:00
Sean Fertile 193430fe51 [PPC] add extract sig/exp test data class for vec float and vec double.
Add vector extract exponent/significand functions to altivec.h, as well as
 functions (and related constants) to test the data class of vector float
 and vector double.

 Differential Revision: https://reviews.llvm.org/D26271

llvm-svn: 286830
2016-11-14 14:43:27 +00:00
Craig Topper 5e0709d60b [AVX-512] Replace masked dword and qword variable shift builtins with unmasked builtins and a select.
This is part of a set of changes to allow InstCombine in the backend to optimize variable shifts without having to know about masking.

llvm-svn: 286757
2016-11-13 07:26:34 +00:00
Craig Topper d7e5b21914 [X86] Remove extra escaped new lines in intrinsic headers left over from an earlier conversion away from a macro. NFC
llvm-svn: 286756
2016-11-13 07:26:31 +00:00
Craig Topper 298aa12b63 [AVX-512] Add returns to shift intrinsics that converted from macros in r286714.
llvm-svn: 286738
2016-11-13 00:35:01 +00:00
Craig Topper 2c8f49e67b [AVX-512] Use scalar vfmsub/vfnmsub mask3 intrinsics instead of inverting the mask argument of a vfmadd intrinsic.
Summary: Inverting the mask argument does not reflect the intended semantics of the intrinsic.

Reviewers: igorb, delena

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D26019

llvm-svn: 286733
2016-11-12 23:24:34 +00:00
Craig Topper 1a44193afd [AVX-512] Convert the rest of the masked shift by immediate and by single element builtins over to the newly added unmasked builtins and a select.
This should also fix PR30691 since the new builtins are handled like the legacy builtins in the backend.

llvm-svn: 286714
2016-11-12 07:16:59 +00:00
Nemanja Ivanovic 4de0011b5c [PowerPC] Implement remaining permute builtins in altivec.h - Clang portion
This patch corresponds to review:
https://reviews.llvm.org/D26479

It adds the remaining vector permute/rotate builtins to altivec.h.

llvm-svn: 286650
2016-11-11 22:34:44 +00:00
Nemanja Ivanovic 4079fc8188 [PowerPC] Add vector conversion builtins to altivec.h - clang portion
This patch corresponds to review:
https://reviews.llvm.org/D26308

It adds a number of vector type conversion builtins to altivec.h.

llvm-svn: 286627
2016-11-11 19:56:17 +00:00
Tony Jiang 7723f97d6a [PowerPC] Implement plain VSX load/store builtins.
Implement all the different 24 overloads for vec_xl and vec_xst.

llvm-svn: 286455
2016-11-10 14:39:56 +00:00
Ekaterina Romanova 64adc38e51 Doxygen comments for avxintrin.h.
Added doxygen comments to avxintrin.h's intrinsics. As of now, around 75% of the
intrinsics in this file are documented here. The patches for the other 25% will be se
nt out later.

Removed extra spaces in emmitrin.h.

Note: The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.

llvm-svn: 286336
2016-11-09 03:58:30 +00:00
Ayman Musa e60a41ca28 [X86][AVX512][Clang] Add support for mask_{move|store|load}_s{s/d} and int2mask/mask2int intrinsics.
Differential Revision: https://reviews.llvm.org/D26021

llvm-svn: 286229
2016-11-08 12:00:30 +00:00
Tony Jiang c6ddd7221c [PowerPC] Implement remaining vector comparison builtins.
vector bool char vec_cmpeq (vector bool char, vector bool char);
vector bool int vec_cmpeq (vector bool int, vector bool int);
vector bool long long vec_cmpeq (vector bool long long, vector bool long lon
vector bool short vec_cmpeq (vector bool short, vector bool short);

llvm-svn: 286205
2016-11-08 04:15:45 +00:00
Yaxun Liu 7d07ae7c85 [OpenCL] Mark group functions as convergent in opencl-c.h
Certain OpenCL builtin functions are supposed to be executed by all threads in a work group or sub group. Such functions should not be made divergent during transformation. It makes sense to mark them with convergent attribute.

The adding of convergent attribute is based on Ettore Speziale's work and the original proposal and patch can be found at https://www.mail-archive.com/cfe-commits@lists.llvm.org/msg22271.html.

Differential Revision: https://reviews.llvm.org/D25343

llvm-svn: 285725
2016-11-01 18:45:32 +00:00
Nemanja Ivanovic 05ce4ca0dd [PowerPC] Implement vector shift builtins - clang portion
This patch corresponds to review https://reviews.llvm.org/D26092.
Committing on behalf of Tony Jiang.

llvm-svn: 285694
2016-11-01 14:46:20 +00:00
Nemanja Ivanovic 251f6dd93d [PPC] Add vec_absd functions to altivec.h
This patch corresponds to review https://reviews.llvm.org/D26073.
Committing on behalf of Sean Fertile.

llvm-svn: 285679
2016-11-01 08:39:56 +00:00
Craig Topper 08bf53ffda [AVX-512] Remove masked vector insert builtins and replace with native shufflevectors and selects.
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.

llvm-svn: 285667
2016-11-01 05:47:56 +00:00
Craig Topper 350729627a [AVX-512] Use selectd instead of selectps for _mm256_mask_extracti32x4_epi32.
llvm-svn: 285545
2016-10-31 05:49:11 +00:00
Craig Topper 93ffabd28d [AVX-512] Remove masked vector extract builtins and replace with native shufflevectors and selects.
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.

llvm-svn: 285540
2016-10-31 04:30:56 +00:00
Craig Topper 66b2fd1209 [AVX-512] Remove many of the masked 128/256-bit shift builtins and replace them with unmasked builtins and selects.
llvm-svn: 285539
2016-10-31 04:30:51 +00:00
Michael Zuckerman d343697f1e Fixing "type" issue for (epi32)
and replaceing hardcoded inf with clang builtin inf "__builtin_inff()" for float ({max|min}_{pd|ps})

llvm-svn: 285519
2016-10-30 14:54:05 +00:00
Craig Topper 312ff9d19d [AVX-512] Remove masked 128/256-bit builtins for vpmaddwd and vpmaddubsw. Replace with unmasked builtins and select.
llvm-svn: 285516
2016-10-30 07:11:34 +00:00
Craig Topper 4caf76bee2 [AVX-512] Remove 128/256-bit masked pmulhrsw/pmulhuw/pmulhw builtins and use unmasked builtins and select instead.
llvm-svn: 285505
2016-10-29 19:02:14 +00:00
Craig Topper 2eadf1b67e [AVX-512] Remove masked 128/256-bit sqrt builtins and replace them with unmasked builtins and a select.
llvm-svn: 285504
2016-10-29 19:02:10 +00:00
Craig Topper 09e94007be [AVX-512] Remove masked 128/256-bit pmuludq/pmuldq builtins and replace them with unmasked builtins and a select.
llvm-svn: 285503
2016-10-29 19:02:07 +00:00
Craig Topper 160ca8420d [AVX-512] Remove masked 128/256-bit floating point max/min builtins. Use unmasked builtins with select instead.
llvm-svn: 285502
2016-10-29 19:02:03 +00:00
Michael Zuckerman 25eb420233 [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min) intrinsics to Clang .
After LGTM and Check-all 

Vector-reduction arithmetic accepts vectors as inputs and produces 
scalars as outputs.This class of vector operation forms the basis 
of many scientific computations. In vector-reduction arithmetic, 
the evaluation off is independent of the order of the input elements of V.

Reviewer: 1. craig.topper 
          2. igorb

Differential Revision: https://reviews.llvm.org/D25988

llvm-svn: 285493
2016-10-29 10:29:20 +00:00
Nemanja Ivanovic 931bc548e6 [PPC] add float and double overloads for vec_orc and vec_nand in altivec.h
This patch corresponds to review https://reviews.llvm.org/D25950.
Committing on behalf of Sean Fertile.

llvm-svn: 285439
2016-10-28 20:04:53 +00:00
Nemanja Ivanovic 4f69f924df Implement vector count leading/trailing bytes with zero lsb and vector parity
builtins - clang portion

This patch corresponds to review: https://reviews.llvm.org/D26002
Committing on behalf of Zaara Syeda.

llvm-svn: 285436
2016-10-28 19:49:03 +00:00
Michael Zuckerman edd99eb07a 1. Fixing small types issue (PD|PS) (reduce) .
2. Cosmetic changes

llvm-svn: 285405
2016-10-28 15:16:03 +00:00
Anastasia Stulova 7c30533362 [OpenCL] Diagnose variadic arguments
OpenCL disallows using variadic arguments (s6.9.e and s6.12.5 OpenCL v2.0)
apart from some exceptions:
- printf
- enqueue_kernel

This change adds error diagnostic for variadic functions but accepts printf
and any compiler internal function (which should cover __enqueue_kernel_XXX cases).

It also unifies diagnostic with block prototype and adds missing uncaught cases for blocks.

llvm-svn: 285395
2016-10-28 12:59:39 +00:00
Nemanja Ivanovic 09dd423a7d [PPC] add vector byte reverse functions to altivec.h
This patch corresponds to review https://reviews.llvm.org/D25915.
Committing on behalf of Sean Fertile.

llvm-svn: 285268
2016-10-27 06:23:57 +00:00
Justin Lebar ebeeab87a1 [CUDA] Move device placement new definitions into a wrapper header.
Previously, these were always included -- after this change, you have to
 #include <new>, which is consistent with how things ought to work.

llvm-svn: 285251
2016-10-26 22:13:26 +00:00
Justin Lebar 6f5ec7ee88 [CUDA] Switch cuda_wrappers/complex to use a proper include guard instead of #pragma once.
This is consistent with the rest of our internal headers.

llvm-svn: 285250
2016-10-26 22:13:20 +00:00
Nemanja Ivanovic 3de0a385c9 [PowerPC] Implement vector_insert_exp builtins - clang portion
This patch corresponds to review https://reviews.llvm.org/D25956.
Committing on behalf of Zaara Syeda.

llvm-svn: 285229
2016-10-26 19:27:11 +00:00
Nemanja Ivanovic 85a28dcc5d [PPC] Implement vector reverse elements builtins (vec_reve)
This patch corresponds to review https://reviews.llvm.org/D25906.
Committing on behalf of Tony Jiang.

llvm-svn: 285218
2016-10-26 18:25:45 +00:00
Craig Topper f202365910 [AVX-512] Fix the operand order for all calls to __builtin_ia32_vfmaddss3_mask.
Summary: The preserved input should be the first argument and the vector inputs should be in the same order as the intrinsics it is used to implement.

Reviewers: igorb, delena

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25902

llvm-svn: 285175
2016-10-26 05:35:38 +00:00
Yaxun Liu a49bd14843 [OpenCL] Add missing atom_xor for 64 bit to opencl-c.h
Differential Revision: https://reviews.llvm.org/D25954

llvm-svn: 285125
2016-10-25 21:37:05 +00:00
Michael Zuckerman facb37cabf [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (Operators: +,*,&&,||) intrinsics to Clang
Committed after LGTM and check-all

 
Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs.
This class of vector operation forms the basis of many scientific computations.
In vector-reduction arithmetic, the evaluation off is independent of the order of the input elements of V.
 
Used bisection method. At each step, we partition the vector with previous
step in half, and the operation is performed on its two halves.
This takes log2(n) steps where n is the number of elements in the vector.
 
Reviwer: 1. igorb
         2. craig.topper    
Differential Revision: https://reviews.llvm.org/D25527

llvm-svn: 285054
2016-10-25 07:56:04 +00:00
Michael Zuckerman 33bd5b235b revert r284963
because new test file is failing in some OS.
test/CodeGen/avx512-reduceIntrin.c

llvm-svn: 284967
2016-10-24 11:30:23 +00:00
Michael Zuckerman 98cb041891 [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (Operators: +,*,&&,||) intrinsics to Clang
Committed after LGTM and check-all

Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs.
This class of vector operation forms the basis of many scientific computations.
In vector-reduction arithmetic, the evaluation off is independent of the order of the input elements of V.

Used bisection method. At each step, we partition the vector with previous
step in half, and the operation is performed on its two halves.
This takes log2(n) steps where n is the number of elements in the vector.

Differential Revision: https://reviews.llvm.org/D25527

llvm-svn: 284963
2016-10-24 10:53:20 +00:00
Craig Topper eee7c0520c [AVX-512] Replace masked 128/256-bit byte, word, and dword min/max builtins with selects and the older unmasked builtins.
llvm-svn: 284954
2016-10-23 23:57:30 +00:00
Craig Topper 0c5da26572 [AVX-512] Replace 512-bit pmovzx/sx builtins with native IR.
llvm-svn: 284936
2016-10-23 07:35:47 +00:00
Craig Topper 4ef879ac2c [AVX-512] Remove masked 128/256-bit packss/packus builtins and replace with selects and the older unmasked builtins.
llvm-svn: 284935
2016-10-23 07:35:39 +00:00
Ekaterina Romanova 06477bf035 Add more doxygen comments to emmintrin.h's intrinsics.
With this patch, all intrinsics in this file (with an exception of a handful of a recently added ones) will be documented. I will send out a patch for 4 missining intrisics later.

The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Yunzhong Gao.

llvm-svn: 284934
2016-10-23 07:30:50 +00:00
Craig Topper 4d63dfc286 [AVX-512] Replace masked 128/256-bit pavg builtins and replace with select and older unmasked builtins.
llvm-svn: 284929
2016-10-22 21:24:56 +00:00
Craig Topper 622c63614d [AVX-512] Replace masked 128/256-bit saturating add/sub builtins with select and older unmasked builtins.
llvm-svn: 284928
2016-10-22 21:24:52 +00:00
Craig Topper 11dda92405 [AVX-512] Replace masked 128/256-bit vpmovzx/vpmovsx builtins with native IR.
llvm-svn: 284927
2016-10-22 21:24:48 +00:00
Craig Topper eb1c0afa90 [AVX-512] Remove masked 128/256-bit pshufb builtins. Replace with a select and the older unmaksed builtins.
llvm-svn: 284925
2016-10-22 21:24:42 +00:00
Craig Topper 78a9c40326 [AVX-512] Remove builtins for 128/256-bit pabsb/pabsw. We can use a select and the older non-masked versions instead.
llvm-svn: 284924
2016-10-22 21:24:38 +00:00
Craig Topper c2c7e42bfe [AVX-512] Add typecasts to alignr intrinsics that were modified in r284920.
llvm-svn: 284923
2016-10-22 21:24:34 +00:00
Craig Topper f6373bc6fd [AVX-512] Remove masked 128/256-bit palignr builtins. We can just use a select in the header file with the older unmasked versions instead.
llvm-svn: 284920
2016-10-22 18:32:33 +00:00
Ekaterina Romanova 493091fdef Add more doxygen comments to emmintrin.h's intrinsics.
With this patch, 75% of the intrinsics in this file will be documented now. The patches for the rest of the intrisics in this file will be send out later.

The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao.

llvm-svn: 284754
2016-10-20 17:59:15 +00:00
Albert Gutowski 1deab38717 Implement __stosb intrinsic as a volatile memset
Summary: We need `__stosb` to be an intrinsic, because SecureZeroMemory function uses it without including intrin.h. Implementing it as a volatile memset is not consistent with MSDN specification, but it gives us target-independent IR while keeping the most important properties of `__stosb`.

Reviewers: rnk, hans, thakis, majnemer

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25334

llvm-svn: 284253
2016-10-14 17:33:05 +00:00
Albert Gutowski 5e08df0266 Add 64-bit MS _Interlocked functions as builtins again
Summary: Previously global 64-bit versions of _Interlocked functions broke buildbots on i386, so now I'm adding them as builtins for x86-64 and ARM only (should they be also on AArch64? I had problems with testing it for AArch64, so I left it)

Reviewers: hans, majnemer, mstorsjo, rnk

Subscribers: cfe-commits, aemerson

Differential Revision: https://reviews.llvm.org/D25576

llvm-svn: 284172
2016-10-13 22:35:07 +00:00
Albert Gutowski 397d81bb9a Implement MS _ReturnAddress and _AddressOfReturnAddress intrinsics
Reviewers: rnk, thakis, majnemer, hans

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25540

llvm-svn: 284131
2016-10-13 16:03:42 +00:00
Yunzhong Gao d9fa56a4fb [NFC] Fixing the description for _mm_store_ps and _mm_store_ps1.
It seems that the doxygen description of these two intrinsics were swapped by
mistake.

llvm-svn: 284080
2016-10-12 23:27:27 +00:00
Albert Gutowski 2a0621e58a Implement MS _BitScan intrinsics
Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing separate function for each builtin.

Reviewers: hans, thakis, rnk, majnemer

Subscribers: RKSimon, cfe-commits, aemerson

Differential Revision: https://reviews.llvm.org/D25264

llvm-svn: 284060
2016-10-12 22:01:05 +00:00
Yunzhong Gao c37e2231ad [NFC] Trial change to remove a redundant blank line.
llvm-svn: 284033
2016-10-12 19:33:33 +00:00
Justin Lebar 49ec14692a [CUDA] Re-land support for <complex> (r283683 and r283680).
These were reverted in r283753 and r283747.

The first patch added a header to the root 'Headers' install directory,
instead of into 'Headers/cuda_wrappers'.  This was fixed in the second
patch, but by then the damage was done: The bad header stayed in the
'Headers' directory, continuing to break the build.

We reverted both patches in an attempt to fix things, but that still
didn't get rid of the header, so the Windows boostrap build remained
broken.

It's probably worth fixing up our cmake logic to remove things from the
install dirs, but in the meantime, re-land these patches, since we
believe they no longer have this bug.

llvm-svn: 283907
2016-10-11 17:36:03 +00:00
Albert Gutowski fcea61c563 Implement MS read/write barriers and __faststorefence intrinsic
Reviewers: hans, rnk, majnemer

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25442

llvm-svn: 283793
2016-10-10 19:40:51 +00:00
Albert Gutowski 7216f17653 Implement __emul, __emulu, _mul128 and _umul128 MS intrinsics
Reviewers: rnk, thakis, majnemer, hans

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25353

llvm-svn: 283785
2016-10-10 18:09:27 +00:00
Nico Weber 21b9c7a6dc Revert r283683 because r283680 got reverted.
llvm-svn: 283753
2016-10-10 14:20:35 +00:00
Nico Weber 67dd74ef89 Revert r283680.
Breaks bootstrap builds on (at least) Windows:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\lib\Support\Allocator.cpp:14:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/Allocator.h:24:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/ADT/SmallVector.h:20:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/MathExtras.h:19:
D:\buildslave\clang-x64-ninja-win7\stage1.install\bin\..\lib\clang\4.0.0\include\algorithm(63,8) :
    error: unknown type name '__device__'
    inline __device__ const __T &

llvm-svn: 283747
2016-10-10 14:10:00 +00:00
Justin Lebar 3b593f56fc [CUDA] Don't install cuda_wrappers/{algorithm,complex} into the main include dir.
This is obviously wrong -- if we do this, then all compiles will pick up
these wrappers, which is not what we want.

llvm-svn: 283683
2016-10-09 00:27:39 +00:00
Justin Lebar d3c5d2a4de [CUDA] Support <complex> and std::min/max on the device.
Summary:
We do this by wrapping <complex> and <algorithm>.

Tests are in the test-suite.

Reviewers: tra

Subscribers: jhen, beanz, cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D24979

llvm-svn: 283680
2016-10-08 22:16:12 +00:00
Justin Lebar 2dfbe9a3b4 [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.
Summary: This matches the idiom we use for our other CUDA wrapper headers.

Reviewers: tra

Subscribers: beanz, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D24978

llvm-svn: 283679
2016-10-08 22:16:08 +00:00
Justin Lebar e9eb792a0f [CUDA] Declare our __device__ math functions in the same inline namespace as our standard library.
Summary:
Currently we declare our inline __device__ math functions in namespace
std.  But libstdc++ and libc++ declare these functions in an inline
namespace inside namespace std.  We need to match this because, in a
later patch, we want to get e.g. <complex> to use our device overloads,
and it only will if those overloads are in the right inline namespace.

Reviewers: tra

Subscribers: cfe-commits, jhen

Differential Revision: https://reviews.llvm.org/D24977

llvm-svn: 283678
2016-10-08 22:16:03 +00:00
Michael Zuckerman 9e43ccfe68 [Clang][AVX512][BuiltIn]Adding missing intrinsics move_{sd|ss} to clang
Differential Revision: http://reviews.llvm.org/D21021

llvm-svn: 283314
2016-10-05 12:56:06 +00:00
Albert Gutowski f3a0bce155 Separate builtins for x84-64 and i386; implement __mulh and __umulh
Summary: We need x86-64-specific builtins if we want to implement some of the MS intrinsics - winnt.h contains definitions of some functions for i386, but not for x86-64 (for example _InterlockedOr64), which means that we cannot treat them as builtins for both i386 and x86-64, because then we have definitions of builtin functions in winnt.h on i386.

Reviewers: thakis, majnemer, hans, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24598

llvm-svn: 283264
2016-10-04 22:29:49 +00:00
Craig Topper c4a8228bcc [AVX-512] Use native IR for masked 512-bit add/sub/mul/div ps/pd intrinsics when rounding mode isn't used.
llvm-svn: 283073
2016-10-02 17:43:00 +00:00
Artem Belevich d4d9dc8252 [CUDA] Added support for CUDA-8
Differential Revision: https://reviews.llvm.org/D24946

llvm-svn: 282610
2016-09-28 17:47:40 +00:00
Martin Storsjo 963f75efc2 [Headers] Replace stray indentation with tabs with spaces. NFC.
This matches the rest of the surrounding file.

llvm-svn: 282569
2016-09-28 09:34:51 +00:00
Ayman Musa 17a2819b05 Update to commit r282488, fix the buildboot failure.
llvm-svn: 282492
2016-09-27 15:37:31 +00:00
Ayman Musa 2e250e8845 [avx512] Add aliases to some missing avx512 intrinsics.
Differential Revision:https: //reviews.llvm.org/D24961

llvm-svn: 282488
2016-09-27 14:06:32 +00:00
Nemanja Ivanovic 10e2b5dcaa [Power9] Builtins for ELF v.2 ABI conformance - front end portion
This patch corresponds to review:
https://reviews.llvm.org/D24397

It adds the __POWER9_VECTOR__ macro and the -mpower9-vector option along with
a number of altivec.h functions (refer to the code review for a list).

llvm-svn: 282481
2016-09-27 10:45:22 +00:00
Saleem Abdulrasool eae64f8a62 headers: add missing Windows ARM Interlocked intrinsics
On ARM, there are multiple versions of each of the intrinsics, with
acquire/relaxed/release barrier semantics.

The newly added ones are provided as inline functions here instead of builtins,
since they should only be available on certain archs (arm/aarch64).

This is necessary in order to compile C++ code for ARM in MSVC mode.

Patch by Martin Storsjö!

llvm-svn: 282447
2016-09-26 22:12:43 +00:00
Simon Dardis 3d9c763816 [mips] MSA intrinsics header file
This patch adds the msa.h header file containing the shorter names for the
MSA instrinsics, e.g. msa_sll_b for builtin_msa_sll_b.

Reviewers: vkalintiris, zoran.jovanovic

Differential Review: https://reviews.llvm.org/D24674

llvm-svn: 281975
2016-09-20 15:07:36 +00:00
Justin Lebar e3612a039f [CUDA] Make __clang_cuda_cmath.h compatible with libc++.
Summary:
We need to add a bunch more "using"s, which weren't necessary with
libstdc++.

Once this is in I can check in a test to the test-suite.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24588

llvm-svn: 281544
2016-09-14 21:50:14 +00:00
Albert Gutowski 727ab8a803 Add some MS aliases for existing intrinsics
Reviewers: thakis, compnerd, majnemer, rsmith, rnk

Subscribers: alexshap, cfe-commits

Differential Revision: https://reviews.llvm.org/D24330

llvm-svn: 281540
2016-09-14 21:19:43 +00:00
Albert Gutowski fc19fa3721 Temporary fix for MS _Interlocked intrinsics
llvm-svn: 281401
2016-09-13 21:51:37 +00:00
Albert Gutowski 9918cb6573 Reverse commit 281375 (breaks building Chromium)
llvm-svn: 281399
2016-09-13 21:24:51 +00:00
Albert Gutowski ce7a9a47b2 Add bunch of _Interlocked builtins
Reviewers: compnerd, thakis, Prazek, majnemer, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24153

llvm-svn: 281378
2016-09-13 19:43:33 +00:00
Albert Gutowski ae3fb3113f Add some MS aliases for existing intrinsics
Reviewers: thakis, compnerd, majnemer, rsmith, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24330

llvm-svn: 281375
2016-09-13 19:26:42 +00:00
Albert Gutowski b6a11acb53 Implement MS _rot intrinsics
Reviewers: thakis, Prazek, compnerd, rnk

Subscribers: majnemer, cfe-commits

Differential Revision: https://reviews.llvm.org/D24311

llvm-svn: 280997
2016-09-08 22:32:19 +00:00
Reid Kleckner 5de2bcdcf6 Add MS __nop intrinsic to intrin.h
Summary: There was no definition for __nop function - added inline
assembly.

Patch by Albert Gutowski!

Reviewers: rnk, thakis

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24286

llvm-svn: 280826
2016-09-07 16:55:12 +00:00
Craig Topper 2dfab63bb3 [AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div builtins and replace with native operations.
We can't do the 512-bit ones because they take a rounding mode argument that we can't represent.

llvm-svn: 280635
2016-09-04 18:30:17 +00:00
Elad Cohen fb6358d2b5 [Modules] Add 'freestanding' to the 'requires-declaration' feature-list.
This adds support for modules that require (non-)freestanding
environment, such as the compiler builtin mm_malloc submodule.

Differential Revision: https://reviews.llvm.org/D23871

llvm-svn: 280613
2016-09-04 06:00:42 +00:00
Joerg Sonnenberger b50b2fac9f Trailing dot that shouldn't have been committed.
llvm-svn: 280609
2016-09-04 00:51:02 +00:00
Joerg Sonnenberger 82216f0faa PR 27200: Fix names of the atomic lock-free macros.
llvm-svn: 280607
2016-09-04 00:44:10 +00:00
Craig Topper f43e4a1728 [AVX-512] Remove masked integer mullo builtins and replace with native IR.
llvm-svn: 280597
2016-09-03 19:19:49 +00:00
Craig Topper 0e18976b8d [AVX-512] Remove masked integer add/sub builtins and replace with native IR.
llvm-svn: 280596
2016-09-03 18:29:35 +00:00
Craig Topper a815f488d5 [AVX-512] Implement masked floating point logical operations with native IR and remove the builtins.
llvm-svn: 280197
2016-08-31 05:38:58 +00:00
Craig Topper d0681d528d [X86] Use v2i64 vectors to implement _mm_and/andn/or/xor_pd.
These will be reused when removing some builtins from avx512vldqintrin.h and this will make the tests for that change show a better number of vector elements.

llvm-svn: 280196
2016-08-31 05:38:55 +00:00
Bruno Cardoso Lopes 6736e199c7 [Modules] Add 'gnuinlineasm' to the 'requires-declaration' feature-list.
This adds support for modules that require (no-)gnu-inline-asm
environment, such as the compiler builtin cpuid submodule.

This is the gnu-inline-asm variant of https://reviews.llvm.org/D23871

Differential Revision: https://reviews.llvm.org/D23905

rdar://problem/26931199

llvm-svn: 280159
2016-08-30 21:25:42 +00:00
Alexey Bader b5d90e57dc [OpenCL] Make is_valid_event, create_user_event overloadable.
Summary: Make is_valid_event and create_user_event overloadable like other built-ins.

Patch by Evgeniy Tyurin.

Reviewers: bader, yaxunl

Subscribers: Anastasia, cfe-commits

Differential Revision: https://reviews.llvm.org/D23914

llvm-svn: 280097
2016-08-30 14:42:54 +00:00
Asaf Badouh 356bb76809 [X86][AVX512F] minor fix of the parameter names
add "__" prefix

Bug 28842 https://llvm.org/bugs/show_bug.cgi?id=29040

Differential Revision: https://reviews.llvm.org/D23753


 

llvm-svn: 279392
2016-08-21 07:56:47 +00:00
Justin Lebar cb20a09f54 [CUDA] Improve handling of math functions.
Summary:
A bunch of related changes here to our CUDA math headers.

- The second arg to nexttoward is a double (well, technically, long
  double, but we don't have that), not a float.

- Add a forward-declare of llround(float), which is defined in the CUDA
  headers.  We need this for the same reason we need most of the other
  forward-declares: To prevent a constexpr function in our standard
  library from becoming host+device.

- Add nexttowardf implementation.

- Pull "foobarf" functions defined by the CUDA headers in the global
  namespace into namespace std.  This lets you do e.g. std::sinf.

- Add overloads for math functions accepting integer types.  This lets
  you do e.g. std::sin(0) without having an ambiguity between the
  overload that takes a float and the one that takes a double.

With these changes, we pass testcases derived from libc++ for cmath and
math.h.  We can check these testcases in to the test-suite once support
for CUDA lands there.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23627

llvm-svn: 279140
2016-08-18 20:43:13 +00:00