llvm-project

Commit Graph

Author	SHA1	Message	Date
Nemanja Ivanovic	09dd423a7d	[PPC] add vector byte reverse functions to altivec.h This patch corresponds to review https://reviews.llvm.org/D25915. Committing on behalf of Sean Fertile. llvm-svn: 285268	2016-10-27 06:23:57 +00:00
Justin Lebar	ebeeab87a1	[CUDA] Move device placement new definitions into a wrapper header. Previously, these were always included -- after this change, you have to #include <new>, which is consistent with how things ought to work. llvm-svn: 285251	2016-10-26 22:13:26 +00:00
Justin Lebar	6f5ec7ee88	[CUDA] Switch cuda_wrappers/complex to use a proper include guard instead of #pragma once. This is consistent with the rest of our internal headers. llvm-svn: 285250	2016-10-26 22:13:20 +00:00
Nemanja Ivanovic	3de0a385c9	[PowerPC] Implement vector_insert_exp builtins - clang portion This patch corresponds to review https://reviews.llvm.org/D25956. Committing on behalf of Zaara Syeda. llvm-svn: 285229	2016-10-26 19:27:11 +00:00
Nemanja Ivanovic	85a28dcc5d	[PPC] Implement vector reverse elements builtins (vec_reve) This patch corresponds to review https://reviews.llvm.org/D25906. Committing on behalf of Tony Jiang. llvm-svn: 285218	2016-10-26 18:25:45 +00:00
Craig Topper	f202365910	[AVX-512] Fix the operand order for all calls to __builtin_ia32_vfmaddss3_mask. Summary: The preserved input should be the first argument and the vector inputs should be in the same order as the intrinsics it is used to implement. Reviewers: igorb, delena Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25902 llvm-svn: 285175	2016-10-26 05:35:38 +00:00
Yaxun Liu	a49bd14843	[OpenCL] Add missing atom_xor for 64 bit to opencl-c.h Differential Revision: https://reviews.llvm.org/D25954 llvm-svn: 285125	2016-10-25 21:37:05 +00:00
Michael Zuckerman	facb37cabf	[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (Operators: +,*,&&,||) intrinsics to Clang Committed after LGTM and check-all Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs. This class of vector operation forms the basis of many scientific computations. In vector-reduction arithmetic, the evaluation off is independent of the order of the input elements of V. Used bisection method. At each step, we partition the vector with previous step in half, and the operation is performed on its two halves. This takes log2(n) steps where n is the number of elements in the vector. Reviewer: 1. igorb 2. craig.topper Differential Revision: https://reviews.llvm.org/D25527 llvm-svn: 285054	2016-10-25 07:56:04 +00:00
Michael Zuckerman	33bd5b235b	revert r284963 because new test file is failing in some OS. test/CodeGen/avx512-reduceIntrin.c llvm-svn: 284967	2016-10-24 11:30:23 +00:00
Michael Zuckerman	98cb041891	[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (Operators: +,*,&&,||) intrinsics to Clang Committed after LGTM and check-all Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs. This class of vector operation forms the basis of many scientific computations. In vector-reduction arithmetic, the evaluation off is independent of the order of the input elements of V. Used bisection method. At each step, we partition the vector with previous step in half, and the operation is performed on its two halves. This takes log2(n) steps where n is the number of elements in the vector. Differential Revision: https://reviews.llvm.org/D25527 llvm-svn: 284963	2016-10-24 10:53:20 +00:00
Craig Topper	eee7c0520c	[AVX-512] Replace masked 128/256-bit byte, word, and dword min/max builtins with selects and the older unmasked builtins. llvm-svn: 284954	2016-10-23 23:57:30 +00:00
Craig Topper	0c5da26572	[AVX-512] Replace 512-bit pmovzx/sx builtins with native IR. llvm-svn: 284936	2016-10-23 07:35:47 +00:00
Craig Topper	4ef879ac2c	[AVX-512] Remove masked 128/256-bit packss/packus builtins and replace with selects and the older unmasked builtins. llvm-svn: 284935	2016-10-23 07:35:39 +00:00
Ekaterina Romanova	06477bf035	Add more doxygen comments to emmintrin.h's intrinsics. With this patch, all intrinsics in this file (with the exception of a handful of recently added ones) will be documented. I will send out a patch for 4 missing intrinsics later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao. llvm-svn: 284934	2016-10-23 07:30:50 +00:00
Craig Topper	4d63dfc286	[AVX-512] Remove masked 128/256-bit pavg builtins and replace with select and older unmasked builtins. llvm-svn: 284929	2016-10-22 21:24:56 +00:00
Craig Topper	622c63614d	[AVX-512] Replace masked 128/256-bit saturating add/sub builtins with select and older unmasked builtins. llvm-svn: 284928	2016-10-22 21:24:52 +00:00
Craig Topper	11dda92405	[AVX-512] Replace masked 128/256-bit vpmovzx/vpmovsx builtins with native IR. llvm-svn: 284927	2016-10-22 21:24:48 +00:00
Craig Topper	eb1c0afa90	[AVX-512] Remove masked 128/256-bit pshufb builtins. Replace with a select and the older unmasked builtins. llvm-svn: 284925	2016-10-22 21:24:42 +00:00
Craig Topper	78a9c40326	[AVX-512] Remove builtins for 128/256-bit pabsb/pabsw. We can use a select and the older non-masked versions instead. llvm-svn: 284924	2016-10-22 21:24:38 +00:00
Craig Topper	c2c7e42bfe	[AVX-512] Add typecasts to alignr intrinsics that were modified in r284920. llvm-svn: 284923	2016-10-22 21:24:34 +00:00
Craig Topper	f6373bc6fd	[AVX-512] Remove masked 128/256-bit palignr builtins. We can just use a select in the header file with the older unmasked versions instead. llvm-svn: 284920	2016-10-22 18:32:33 +00:00
Ekaterina Romanova	493091fdef	Add more doxygen comments to emmintrin.h's intrinsics. With this patch, 75% of the intrinsics in this file will be documented now. The patches for the rest of the intrinsics in this file will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao. llvm-svn: 284754	2016-10-20 17:59:15 +00:00
Albert Gutowski	1deab38717	Implement __stosb intrinsic as a volatile memset Summary: We need `__stosb` to be an intrinsic, because SecureZeroMemory function uses it without including intrin.h. Implementing it as a volatile memset is not consistent with MSDN specification, but it gives us target-independent IR while keeping the most important properties of `__stosb`. Reviewers: rnk, hans, thakis, majnemer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25334 llvm-svn: 284253	2016-10-14 17:33:05 +00:00
Albert Gutowski	5e08df0266	Add 64-bit MS _Interlocked functions as builtins again Summary: Previously global 64-bit versions of _Interlocked functions broke buildbots on i386, so now I'm adding them as builtins for x86-64 and ARM only (should they be also on AArch64? I had problems with testing it for AArch64, so I left it) Reviewers: hans, majnemer, mstorsjo, rnk Subscribers: cfe-commits, aemerson Differential Revision: https://reviews.llvm.org/D25576 llvm-svn: 284172	2016-10-13 22:35:07 +00:00
Albert Gutowski	397d81bb9a	Implement MS _ReturnAddress and _AddressOfReturnAddress intrinsics Reviewers: rnk, thakis, majnemer, hans Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25540 llvm-svn: 284131	2016-10-13 16:03:42 +00:00
Yunzhong Gao	d9fa56a4fb	[NFC] Fixing the description for _mm_store_ps and _mm_store_ps1. It seems that the doxygen description of these two intrinsics were swapped by mistake. llvm-svn: 284080	2016-10-12 23:27:27 +00:00
Albert Gutowski	2a0621e58a	Implement MS _BitScan intrinsics Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing separate function for each builtin. Reviewers: hans, thakis, rnk, majnemer Subscribers: RKSimon, cfe-commits, aemerson Differential Revision: https://reviews.llvm.org/D25264 llvm-svn: 284060	2016-10-12 22:01:05 +00:00
Yunzhong Gao	c37e2231ad	[NFC] Trial change to remove a redundant blank line. llvm-svn: 284033	2016-10-12 19:33:33 +00:00
Justin Lebar	49ec14692a	[CUDA] Re-land support for <complex> (r283683 and r283680). These were reverted in r283753 and r283747. The first patch added a header to the root 'Headers' install directory, instead of into 'Headers/cuda_wrappers'. This was fixed in the second patch, but by then the damage was done: The bad header stayed in the 'Headers' directory, continuing to break the build. We reverted both patches in an attempt to fix things, but that still didn't get rid of the header, so the Windows bootstrap build remained broken. It's probably worth fixing up our cmake logic to remove things from the install dirs, but in the meantime, re-land these patches, since we believe they no longer have this bug. llvm-svn: 283907	2016-10-11 17:36:03 +00:00
Albert Gutowski	fcea61c563	Implement MS read/write barriers and __faststorefence intrinsic Reviewers: hans, rnk, majnemer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25442 llvm-svn: 283793	2016-10-10 19:40:51 +00:00
Albert Gutowski	7216f17653	Implement __emul, __emulu, _mul128 and _umul128 MS intrinsics Reviewers: rnk, thakis, majnemer, hans Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25353 llvm-svn: 283785	2016-10-10 18:09:27 +00:00
Nico Weber	21b9c7a6dc	Revert r283683 because r283680 got reverted. llvm-svn: 283753	2016-10-10 14:20:35 +00:00
Nico Weber	67dd74ef89	Revert r283680. Breaks bootstrap builds on (at least) Windows: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\lib\Support\Allocator.cpp:14: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/Allocator.h:24: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/ADT/SmallVector.h:20: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/MathExtras.h:19: D:\buildslave\clang-x64-ninja-win7\stage1.install\bin\..\lib\clang\4.0.0\include\algorithm(63,8) : error: unknown type name '__device__' inline __device__ const __T & llvm-svn: 283747	2016-10-10 14:10:00 +00:00
Justin Lebar	3b593f56fc	[CUDA] Don't install cuda_wrappers/{algorithm,complex} into the main include dir. This is obviously wrong -- if we do this, then all compiles will pick up these wrappers, which is not what we want. llvm-svn: 283683	2016-10-09 00:27:39 +00:00
Justin Lebar	d3c5d2a4de	[CUDA] Support <complex> and std::min/max on the device. Summary: We do this by wrapping <complex> and <algorithm>. Tests are in the test-suite. Reviewers: tra Subscribers: jhen, beanz, cfe-commits, mgorny Differential Revision: https://reviews.llvm.org/D24979 llvm-svn: 283680	2016-10-08 22:16:12 +00:00
Justin Lebar	2dfbe9a3b4	[CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h. Summary: This matches the idiom we use for our other CUDA wrapper headers. Reviewers: tra Subscribers: beanz, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D24978 llvm-svn: 283679	2016-10-08 22:16:08 +00:00
Justin Lebar	e9eb792a0f	[CUDA] Declare our __device__ math functions in the same inline namespace as our standard library. Summary: Currently we declare our inline __device__ math functions in namespace std. But libstdc++ and libc++ declare these functions in an inline namespace inside namespace std. We need to match this because, in a later patch, we want to get e.g. <complex> to use our device overloads, and it only will if those overloads are in the right inline namespace. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: https://reviews.llvm.org/D24977 llvm-svn: 283678	2016-10-08 22:16:03 +00:00
Michael Zuckerman	9e43ccfe68	[Clang][AVX512][BuiltIn]Adding missing intrinsics move_{sd|ss} to clang Differential Revision: http://reviews.llvm.org/D21021 llvm-svn: 283314	2016-10-05 12:56:06 +00:00
Albert Gutowski	f3a0bce155	Separate builtins for x86-64 and i386; implement __mulh and __umulh Summary: We need x86-64-specific builtins if we want to implement some of the MS intrinsics - winnt.h contains definitions of some functions for i386, but not for x86-64 (for example _InterlockedOr64), which means that we cannot treat them as builtins for both i386 and x86-64, because then we have definitions of builtin functions in winnt.h on i386. Reviewers: thakis, majnemer, hans, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24598 llvm-svn: 283264	2016-10-04 22:29:49 +00:00
Craig Topper	c4a8228bcc	[AVX-512] Use native IR for masked 512-bit add/sub/mul/div ps/pd intrinsics when rounding mode isn't used. llvm-svn: 283073	2016-10-02 17:43:00 +00:00
Artem Belevich	d4d9dc8252	[CUDA] Added support for CUDA-8 Differential Revision: https://reviews.llvm.org/D24946 llvm-svn: 282610	2016-09-28 17:47:40 +00:00
Martin Storsjo	963f75efc2	[Headers] Replace stray indentation with tabs with spaces. NFC. This matches the rest of the surrounding file. llvm-svn: 282569	2016-09-28 09:34:51 +00:00
Ayman Musa	17a2819b05	Update to commit r282488, fix the buildbot failure. llvm-svn: 282492	2016-09-27 15:37:31 +00:00
Ayman Musa	2e250e8845	[avx512] Add aliases to some missing avx512 intrinsics. Differential Revision: https://reviews.llvm.org/D24961 llvm-svn: 282488	2016-09-27 14:06:32 +00:00
Nemanja Ivanovic	10e2b5dcaa	[Power9] Builtins for ELF v.2 ABI conformance - front end portion This patch corresponds to review: https://reviews.llvm.org/D24397 It adds the __POWER9_VECTOR__ macro and the -mpower9-vector option along with a number of altivec.h functions (refer to the code review for a list). llvm-svn: 282481	2016-09-27 10:45:22 +00:00
Saleem Abdulrasool	eae64f8a62	headers: add missing Windows ARM Interlocked intrinsics On ARM, there are multiple versions of each of the intrinsics, with acquire/relaxed/release barrier semantics. The newly added ones are provided as inline functions here instead of builtins, since they should only be available on certain archs (arm/aarch64). This is necessary in order to compile C++ code for ARM in MSVC mode. Patch by Martin Storsjö! llvm-svn: 282447	2016-09-26 22:12:43 +00:00
Simon Dardis	3d9c763816	[mips] MSA intrinsics header file This patch adds the msa.h header file containing the shorter names for the MSA intrinsics, e.g. msa_sll_b for builtin_msa_sll_b. Reviewers: vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24674 llvm-svn: 281975	2016-09-20 15:07:36 +00:00
Justin Lebar	e3612a039f	[CUDA] Make __clang_cuda_cmath.h compatible with libc++. Summary: We need to add a bunch more "using"s, which weren't necessary with libstdc++. Once this is in I can check in a test to the test-suite. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24588 llvm-svn: 281544	2016-09-14 21:50:14 +00:00
Albert Gutowski	727ab8a803	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: alexshap, cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281540	2016-09-14 21:19:43 +00:00
Albert Gutowski	fc19fa3721	Temporary fix for MS _Interlocked intrinsics llvm-svn: 281401	2016-09-13 21:51:37 +00:00

1076 Commits