llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	934f86a848	[X86] Fix some inconsistent formatting in the first line of our intrinsics headers. Some were too long and some were too short. llvm-svn: 331559	2018-05-04 21:45:25 +00:00
Volodymyr Sapsai	2d77119f72	Revert "Emit an error when mixing <stdatomic.h> and <atomic>" It reverts r331378 as it caused test failures ThreadSanitizer-x86_64 :: Darwin/gcd-groups-destructor.mm ThreadSanitizer-x86_64 :: Darwin/libcxx-shared-ptr-stress.mm ThreadSanitizer-x86_64 :: Darwin/xpc-race.mm Only clang part of the change is reverted, libc++ part remains as is because it emits error less aggressively. llvm-svn: 331392	2018-05-02 19:52:07 +00:00
Volodymyr Sapsai	c0a278aada	Emit an error when mixing <stdatomic.h> and <atomic> Atomics in C and C++ are incompatible at the moment and mixing the headers can result in confusing error messages. Emit an error explicitly telling about the incompatibility. Introduce the macro `__ALLOW_STDC_ATOMICS_IN_CXX__` that allows to choose in C++ between C atomics and C++ atomics. rdar://problem/27435938 Reviewers: rsmith, EricWF, mclow.lists Reviewed By: mclow.lists Subscribers: jkorous-apple, christof, bumblebritches57, JonChesterfield, smeenai, cfe-commits Differential Revision: https://reviews.llvm.org/D45470 llvm-svn: 331378	2018-05-02 17:50:43 +00:00
Gabor Buella	a51e0c2243	[X86] directstore and movdir64b intrinsics Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45984 llvm-svn: 331249	2018-05-01 10:05:42 +00:00
Craig Topper	e95bde33df	[X86] Add support for _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 intrinsics to match icc. On AVX512F targets we'll produce an emulated sequence using 3 pmuludqs with shifts and adds. On AVX512DQ we'll use vpmulld. Fixes PR37140. llvm-svn: 330923	2018-04-26 05:38:39 +00:00
Artem Belevich	3cce307799	[CUDA] Enable CUDA compilation with CUDA-9.2 Differential Revision: https://reviews.llvm.org/D45827 llvm-svn: 330753	2018-04-24 18:23:19 +00:00
Craig Topper	5f1d10e26e	[X86] Add recently added intrinsic headers to the module map. llvm-svn: 330744	2018-04-24 17:40:49 +00:00
Craig Topper	bd16b11255	[X86] Consistently use double underscore at the beginning of the include guards in our intrinsic headers. Most files used double underscore, but a few used single. This converges them all to double. llvm-svn: 330743	2018-04-24 17:40:47 +00:00
Craig Topper	ce281a41b5	[X86] Remove '#ifdef __x86_64__' around mask_set1_epi64 intrinsics. The unmasked versions already didn't have this restriction. I don't think gcc or icc limit these to 64-bit mode so we shouldn't either. llvm-svn: 330681	2018-04-24 03:36:08 +00:00
Gabor Buella	eba6c42e66	[X86] WaitPKG intrinsics Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45254 llvm-svn: 330463	2018-04-20 18:44:33 +00:00
Artem Belevich	5832eb4cfd	[CUDA] added missing __ldg(const signed char *) Differential Revision: https://reviews.llvm.org/D45780 llvm-svn: 330280	2018-04-18 18:33:43 +00:00
Gabor Buella	b220dd2b6c	[X86] Introduce cldemote intrinsic Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45257 llvm-svn: 329993	2018-04-13 07:37:24 +00:00
Gabor Buella	e708a09e21	[X86] Introduce wbinvd intrinsic A previously missing intrinsic for an old instruction. Reviewers: craig.topper, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45311 llvm-svn: 329937	2018-04-12 18:42:02 +00:00
Gabor Buella	a052016ef2	[x86] wbnoinvd intrinsic The WBNOINVD instruction writes back all modified cache lines in the processor’s internal cache to main memory but does not invalidate (flush) the internal caches. Reviewers: craig.topper, zvi, ashlykov Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D43817 llvm-svn: 329848	2018-04-11 20:09:09 +00:00
Craig Topper	dcdac965f1	[X86] Fix typo in intrinsic header file __mask16->__mmask16 from r329775. llvm-svn: 329777	2018-04-11 05:17:14 +00:00
Craig Topper	2575454fe9	[X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked intrinsic and a select. This makes it consistent with the 128/256-bit functions. Someday maybe we'll have all the masking moved to selects. llvm-svn: 329775	2018-04-11 04:55:10 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Douglas Yung	17d2ef90e0	[DOXYGEN] Fix doxygen and content issues in mmintrin.h - Fix instruction mappings/listings for various intrinsics This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41517 llvm-svn: 327090	2018-03-09 00:38:51 +00:00
Craig Topper	260ed8647a	[X86] Fix typo in cpuid.h, bit_AVX51SER->bit_AVX512ER. llvm-svn: 326807	2018-03-06 16:06:44 +00:00
Alexander Ivchenko	9d3b45301f	[x86][CET] Introduce _get_ssp, _inc_ssp intrinsics Summary: The _get_ssp intrinsic can be used to retrieve the shadow stack pointer, independent of the current arch -- in contrast with the rdsspd and the rdsspq intrinsics. Also, this intrinsic returns zero on CPUs which don't support CET. The rdssp[d|q] instruction is decoded as nop, essentially just returning the input operand, which is zero. Example result of compilation: ``` xorl %eax, %eax movl %eax, %ecx rdsspq %rcx # NOP when CET is not supported movq %rcx, %rax # return zero ``` Reviewers: craig.topper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D43814 llvm-svn: 326689	2018-03-05 11:30:28 +00:00
Craig Topper	21f66a3f6b	[X86] Remove some masked cvt builtins that can be replaced with legacy sse/avx builtins and a select. llvm-svn: 326039	2018-02-24 18:55:13 +00:00
Craig Topper	5dc6ca8e5b	[X86] Remove __builtin_ia32_permvarsf256_mask and __builtin_ia32_permvarsi256_mask and use the avx2 unmasked versions and a select instead. llvm-svn: 326022	2018-02-24 06:46:42 +00:00
Artem Belevich	df38f155ec	[CUDA] Added missing functions. Initial commit missed sincos(float), llabs() and few atomics that we used to pull in from device_functions.hpp, which we no longer include. Differential Revision: https://reviews.llvm.org/D43602 llvm-svn: 325814	2018-02-22 18:40:52 +00:00
Artem Belevich	4dbea99137	[CUDA] Added missing __threadfence_system() function for CUDA9. llvm-svn: 325626	2018-02-20 21:25:30 +00:00
Craig Topper	0a70c3c7af	[X86] Remove mask from 512 bit pmulhrsw/pmulhw/pmulhuw builtins. We now use a vselect node in IR around an unmasked builtin. This makes it consistent with the 128 and 256 bit versions. llvm-svn: 325560	2018-02-20 07:28:18 +00:00
Ekaterina Romanova	f28751849e	[DOXYGEN] There was a request in the review D41507 to change the notation for hex numbers in doxygen documentation from <...>h to 0x<...>. Both of these notations were used in x86 intrinsics documentation. I promised to change them to 0x<...> for consistency. Differential Revision: https://reviews.llvm.org/D41888 llvm-svn: 325312	2018-02-16 03:11:35 +00:00
Artem Belevich	fbc56a904f	[CUDA] Added partial support for CUDA-9.1 Clang can use CUDA-9.1 now, though new APIs are not implemented yet. The major change is that headers in CUDA-9.1 went through substantial changes that started in CUDA-9.0 which required substantial changes in the cuda compatibility headers provided by clang. There are two major issues: * CUDA SDK no longer provides declarations for libdevice functions. * A lot of device-side functions have become nvcc's builtins and CUDA headers no longer contain their implementations. This patch changes the way CUDA headers are handled if we compile with CUDA 9.x. Both 9.0 and 9.1 are affected. * Clang provides its own declarations of libdevice functions. * For CUDA-9.x clang now provides implementation of device-side 'standard library' functions using libdevice. This patch should not affect compilation with CUDA-8. There may be some observable differences for CUDA-9.0, though they are not expected to affect functionality. Tested: CUDA test-suite tests for all supported combinations of: CUDA: 7.0,7.5,8.0,9.0,9.1 GPU: sm_20, sm_35, sm_60, sm_70 Differential Revision: https://reviews.llvm.org/D42513 llvm-svn: 323713	2018-01-30 00:00:12 +00:00
Hiroshi Inoue	1019f8a98e	[NFC] fix trivial typos in comments "to to" -> "to" llvm-svn: 323627	2018-01-29 05:15:18 +00:00
Craig Topper	8cdb94901d	[X86] Add rdpid command line option and intrinsics. Summary: This patch adds -mrdpid/-mno-rdpid and the rdpid intrinsic. The corresponding LLVM commit has already been made. Reviewers: RKSimon, spatel, zvi, AndreiGrischenko Reviewed By: RKSimon Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D42272 llvm-svn: 323047	2018-01-20 18:36:52 +00:00
Abderrazek Zaafrani	ce8746d178	[AArch64] Add ARMv8.2-A FP16 scalar intrinsics https://reviews.llvm.org/D41792 llvm-svn: 323006	2018-01-19 23:11:18 +00:00
Douglas Yung	46474dae4d	[DOXYGEN] Fix doxygen and content issues in xmmintrin.h - Fix inaccurate instruction listings. - Fix small issues in _mm_getcsr and _mm_setcsr. - Fix description of NaN handling in comparison intrinsics. - Fix inaccurate description of _mm_movemask_pi8. - Fix inaccurate instruction mappings. - Fix typos. - Clarify wording on some descriptions. - Fix bit ranges in return value. - Fix typo in _mm_move_ms intrinsic instruction since it operates on single-precision values, not double. - This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41523 llvm-svn: 322778	2018-01-17 22:53:15 +00:00
Craig Topper	f517f1a516	[X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or Summary: kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them as by operation on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this pattern and turn it into a concat_vector operation. I think it makes more sense to keep the IR implementation closer to vector operations on vXi1. Given that we expect these builtins to be used around other builtins that operate on k-registers which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1 leaving only the vector operations. Reviewers: RKSimon, spatel, zvi, jina.nahias Reviewed By: RKSimon Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D42016 llvm-svn: 322461	2018-01-14 19:23:50 +00:00
Sven van Haastregt	774355e321	[OpenCL] Reorder the CLK_sRGBx/sRGBA defines, NFC Swap them so that all channel order defines are ordered according to their values. llvm-svn: 322278	2018-01-11 14:05:38 +00:00
Douglas Yung	7ff91421b4	[DOXYGEN] Fix doxygen and content issues in avxintrin.h - Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed". - Fix a few typos and errors found during review. - Restore new line endings. This patch was made by Craig Flores llvm-svn: 322027	2018-01-08 21:21:17 +00:00
Douglas Yung	4c549c31bb	[DOXYGEN] Fix doxygen and content issues in smmintrin.h - Fix formatting issue due to hyphenated terms at line breaks. - Fix typo This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41520 llvm-svn: 321671	2018-01-02 20:45:29 +00:00
Douglas Yung	df1e9ef156	[DOXYGEN] Fix doxygen and content issues in pmmintrin.h - Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed". This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41518 llvm-svn: 321670	2018-01-02 20:42:53 +00:00
Douglas Yung	0686df106c	[DOXYGEN] Fix doxygen and content issues in emmintrin.h - Fixed inaccurate instruction mappings for various intrinsics. - Fixed description of NaN handling in comparison intrinsics. - Unify description of _mm_store_pd1 to match _mm_store1_pd. - Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed". - Fix typos. - Add missing italics command (\a) for params and fixed some parameter spellings. This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41516 llvm-svn: 321669	2018-01-02 20:39:29 +00:00
Coby Tayree	a09663a5c1	[x86][icelake][vbmi2] added vbmi2 feature recognition added intrinsics support for vbmi2 instructions _mm[128,256,512]_mask[z]_compress_epi[16,32] _mm[128,256,512]_mask_compressstoreu_epi[16,32] _mm[128,256,512]_mask[z]_expand_epi[16,32] _mm[128,256,512]_mask[z]_expandloadu_epi[16,32] _mm[128,256,512]_mask[z]_sh[l,r]di_epi[16,32,64] _mm[128,256,512]_mask_sh[l,r]dv_epi[16,32,64] matching a similar work on the backend (D40206) Differential Revision: https://reviews.llvm.org/D41557 llvm-svn: 321487	2017-12-27 11:25:07 +00:00
Coby Tayree	3d9c88cfec	[x86][icelake][vnni] added vnni feature recognition added intrinsics support for VNNI instructions _mm256_mask_dpbusd_epi32 _mm256_maskz_dpbusd_epi32 _mm256_dpbusd_epi32 _mm256_mask_dpbusds_epi32 _mm256_maskz_dpbusds_epi32 _mm256_dpbusds_epi32 _mm256_mask_dpwssd_epi32 _mm256_maskz_dpwssd_epi32 _mm256_dpwssd_epi32 _mm256_mask_dpwssds_epi32 _mm256_maskz_dpwssds_epi32 _mm256_dpwssds_epi32 _mm128_mask_dpbusd_epi32 _mm128_maskz_dpbusd_epi32 _mm128_dpbusd_epi32 _mm128_mask_dpbusds_epi32 _mm128_maskz_dpbusds_epi32 _mm128_dpbusds_epi32 _mm128_mask_dpwssd_epi32 _mm128_maskz_dpwssd_epi32 _mm128_dpwssd_epi32 _mm128_mask_dpwssds_epi32 _mm128_maskz_dpwssds_epi32 _mm128_dpwssds_epi32 _mm512_mask_dpbusd_epi32 _mm512_maskz_dpbusd_epi32 _mm512_dpbusd_epi32 _mm512_mask_dpbusds_epi32 _mm512_maskz_dpbusds_epi32 _mm512_dpbusds_epi32 _mm512_mask_dpwssd_epi32 _mm512_maskz_dpwssd_epi32 _mm512_dpwssd_epi32 _mm512_mask_dpwssds_epi32 _mm512_maskz_dpwssds_epi32 _mm512_dpwssds_epi32 matching a similar work on the backend (D40208) Differential Revision: https://reviews.llvm.org/D41558 llvm-svn: 321484	2017-12-27 10:37:51 +00:00
Coby Tayree	2268576fa0	[x86][icelake][bitalg] added bitalg feature recognition added intrinsics support for bitalg instructions _mm512_popcnt_epi16 _mm512_mask_popcnt_epi16 _mm512_maskz_popcnt_epi16 _mm512_popcnt_epi8 _mm512_mask_popcnt_epi8 _mm512_maskz_popcnt_epi8 _mm512_mask_bitshuffle_epi64_mask _mm512_bitshuffle_epi64_mask _mm256_popcnt_epi16 _mm256_mask_popcnt_epi16 _mm256_maskz_popcnt_epi16 _mm128_popcnt_epi16 _mm128_mask_popcnt_epi16 _mm128_maskz_popcnt_epi16 _mm256_popcnt_epi8 _mm256_mask_popcnt_epi8 _mm256_maskz_popcnt_epi8 _mm128_popcnt_epi8 _mm128_mask_popcnt_epi8 _mm128_maskz_popcnt_epi8 _mm256_mask_bitshuffle_epi32_mask _mm256_bitshuffle_epi32_mask _mm128_mask_bitshuffle_epi16_mask _mm128_bitshuffle_epi16_mask matching a similar work on the backend (D40222) Differential Revision: https://reviews.llvm.org/D41564 llvm-svn: 321483	2017-12-27 10:01:00 +00:00
Coby Tayree	cf96c876c6	[x86][icelake][vpclmulqdq] added vpclmulqdq feature recognition added intrinsics support for vpclmulqdq instructions _mm256_clmulepi64_epi128 _mm512_clmulepi64_epi128 matching a similar work on the backend (D40101) Differential Revision: https://reviews.llvm.org/D41573 llvm-svn: 321480	2017-12-27 09:00:31 +00:00
Coby Tayree	f4811ebc39	[x86][icelake][gfni] added gfni feature recognition added intrinsics support for gfni instructions _mm_gf2p8affineinv_epi64_epi8 _mm_mask_gf2p8affineinv_epi64_epi8 _mm_maskz_gf2p8affineinv_epi64_epi8 _mm256_gf2p8affineinv_epi64_epi8 _mm256_mask_gf2p8affineinv_epi64_epi8 _mm256_maskz_gf2p8affineinv_epi64_epi8 _mm512_gf2p8affineinv_epi64_epi8 _mm512_mask_gf2p8affineinv_epi64_epi8 _mm512_maskz_gf2p8affineinv_epi64_epi8 _mm_gf2p8affine_epi64_epi8 _mm_mask_gf2p8affine_epi64_epi8 _mm_maskz_gf2p8affine_epi64_epi8 _mm256_gf2p8affine_epi64_epi8 _mm256_mask_gf2p8affine_epi64_epi8 _mm256_maskz_gf2p8affine_epi64_epi8 _mm512_gf2p8affine_epi64_epi8 _mm512_mask_gf2p8affine_epi64_epi8 _mm512_maskz_gf2p8affine_epi64_epi8 _mm_gf2p8mul_epi8 _mm_mask_gf2p8mul_epi8 _mm_maskz_gf2p8mul_epi8 _mm256_gf2p8mul_epi8 _mm256_mask_gf2p8mul_epi8 _mm256_maskz_gf2p8mul_epi8 _mm512_gf2p8mul_epi8 _mm512_mask_gf2p8mul_epi8 _mm512_maskz_gf2p8mul_epi8 matching a similar work on the backend (D40373) Differential Revision: https://reviews.llvm.org/D41582 llvm-svn: 321477	2017-12-27 08:37:47 +00:00
Coby Tayree	a1e5f0c339	[x86][icelake][vaes] added vaes feature recognition added intrinsics support for vaes instructions, matching a similar work on the backend (D40078) _mm256_aesenc_epi128 _mm512_aesenc_epi128 _mm256_aesenclast_epi128 _mm512_aesenclast_epi128 _mm256_aesdec_epi128 _mm512_aesdec_epi128 _mm256_aesdeclast_epi128 _mm512_aesdeclast_epi128 llvm-svn: 321474	2017-12-27 08:16:54 +00:00
Artem Belevich	3cebc738b6	[CUDA] More fixes for __shfl_* intrinsics. * __shfl_{up,down}* uses unsigned int for the third parameter. * added [unsigned] long overloads for non-sync shuffles. Differential Revision: https://reviews.llvm.org/D41521 llvm-svn: 321326	2017-12-21 23:52:09 +00:00
Craig Topper	170de4b4ba	[X86] Allow _mm_prefetch (both the header implementation and the builtin) to accept bit 2 which is supposed to indicate the prefetched addresses will be written to Add the appropriate _MM_HINT_ET0/ET1 defines to match gcc. llvm-svn: 321325	2017-12-21 23:50:22 +00:00
Craig Topper	54b3f718e4	[X86] Add more CPUID bits to cpuid.h to match gcc and support icelake features. llvm-svn: 321129	2017-12-20 00:46:09 +00:00
Craig Topper	798f2c037c	[X86] Add the two files I forgot to commit in r320915. llvm-svn: 320916	2017-12-16 06:10:24 +00:00
Craig Topper	b846d1ff76	[X86] Add builtins and tests for 128 and 256 bit vpopcntdq. llvm-svn: 320915	2017-12-16 06:02:31 +00:00
Stephan Bergmann	feed26ff07	In stdbool.h, define bool, false, true only in gnu++98 GCC has meanwhile corrected that with the similar <https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=216679> "C++11 explicitly forbids macros for bool, true and false." Differential Revision: https://reviews.llvm.org/D40167 llvm-svn: 320135	2017-12-08 08:28:08 +00:00
Artem Belevich	a659d2590e	[NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang. Differential Revision: https://reviews.llvm.org/D40872 llvm-svn: 319909	2017-12-06 17:50:05 +00:00

1311 Commits