Commit Graph

1311 Commits

Author SHA1 Message Date
Craig Topper 934f86a848 [X86] Fix some inconsistent formatting in the first line of our intrinsics headers.
Some were too long and some were too short.

llvm-svn: 331559
2018-05-04 21:45:25 +00:00
Volodymyr Sapsai 2d77119f72 Revert "Emit an error when mixing <stdatomic.h> and <atomic>"
It reverts r331378 as it caused test failures

    ThreadSanitizer-x86_64 :: Darwin/gcd-groups-destructor.mm
    ThreadSanitizer-x86_64 :: Darwin/libcxx-shared-ptr-stress.mm
    ThreadSanitizer-x86_64 :: Darwin/xpc-race.mm

Only clang part of the change is reverted, libc++ part remains as is because it
emits error less aggressively.

llvm-svn: 331392
2018-05-02 19:52:07 +00:00
Volodymyr Sapsai c0a278aada Emit an error when mixing <stdatomic.h> and <atomic>
Atomics in C and C++ are incompatible at the moment and mixing the
headers can result in confusing error messages.

Emit an error explicitly telling about the incompatibility. Introduce
the macro `__ALLOW_STDC_ATOMICS_IN_CXX__` that allows to choose in C++
between C atomics and C++ atomics.

rdar://problem/27435938

Reviewers: rsmith, EricWF, mclow.lists

Reviewed By: mclow.lists

Subscribers: jkorous-apple, christof, bumblebritches57, JonChesterfield, smeenai, cfe-commits

Differential Revision: https://reviews.llvm.org/D45470

llvm-svn: 331378
2018-05-02 17:50:43 +00:00
Gabor Buella a51e0c2243 [X86] directstore and movdir64b intrinsics
Reviewers: spatel, craig.topper, RKSimon

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45984

llvm-svn: 331249
2018-05-01 10:05:42 +00:00
Craig Topper e95bde33df [X86] Add support for _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 intrinsics to match icc.
On AVX512F targets we'll produce an emulated sequence using 3 pmuludqs with shifts and adds. On AVX512DQ we'll use vpmulld.

Fixes PR37140.

llvm-svn: 330923
2018-04-26 05:38:39 +00:00
Artem Belevich 3cce307799 [CUDA] Enable CUDA compilation with CUDA-9.2
Differential Revision: https://reviews.llvm.org/D45827

llvm-svn: 330753
2018-04-24 18:23:19 +00:00
Craig Topper 5f1d10e26e [X86] Add recently added intrinsic headers to the module map.
llvm-svn: 330744
2018-04-24 17:40:49 +00:00
Craig Topper bd16b11255 [X86] Consistently use double underscore at the beginning of the include guards in our intrinsic headers.
Most files used double underscore, but a few used single. This converges them all to double.

llvm-svn: 330743
2018-04-24 17:40:47 +00:00
Craig Topper ce281a41b5 [X86] Remove '#ifdef __x86_64__' around mask_set1_epi64 intrinsics.
The unmasked versions already didn't have this restrction. I don't think gcc or icc limit these to 64-bit mode so we shouldn't either.

llvm-svn: 330681
2018-04-24 03:36:08 +00:00
Gabor Buella eba6c42e66 [X86] WaitPKG intrinsics
Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45254

llvm-svn: 330463
2018-04-20 18:44:33 +00:00
Artem Belevich 5832eb4cfd [CUDA] added missing __ldg(const signed char *)
Differential Revision: https://reviews.llvm.org/D45780

llvm-svn: 330280
2018-04-18 18:33:43 +00:00
Gabor Buella b220dd2b6c [X86] Introduce cldemote intrinsic
Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45257

llvm-svn: 329993
2018-04-13 07:37:24 +00:00
Gabor Buella e708a09e21 [X86] Introduce wbinvd intrinsic
A previously missing intrinsic for an old instruction.

Reviewers: craig.topper, echristo

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45311

llvm-svn: 329937
2018-04-12 18:42:02 +00:00
Gabor Buella a052016ef2 [x86] wbnoinvd intrinsic
The WBNOINVD instruction writes back all modified
cache lines in the processor’s internal cache to main memory
but does not invalidate (flush) the internal caches.

Reviewers: craig.topper, zvi, ashlykov

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D43817

llvm-svn: 329848
2018-04-11 20:09:09 +00:00
Craig Topper dcdac965f1 [X86] Fix typo in intrinsic header file __mask16->__mmask16 from r329775.
llvm-svn: 329777
2018-04-11 05:17:14 +00:00
Craig Topper 2575454fe9 [X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked intrinsic and a select.
This makes it consistent with the 128/256-bit functions.

Someday maybe we'll have all the masking moved to selects.

llvm-svn: 329775
2018-04-11 04:55:10 +00:00
Alexander Kornienko 2a8c18d991 Fix typos in clang
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:

  archtype
  cas
  classs
  checkk
  compres
  definit
  frome
  iff
  inteval
  ith
  lod
  methode
  nd
  optin
  ot
  pres
  statics
  te
  thru

Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)

Differential revision: https://reviews.llvm.org/D44188

llvm-svn: 329399
2018-04-06 15:14:32 +00:00
Douglas Yung 17d2ef90e0 [DOXYGEN] Fix doxygen and content issues in mmintrin.h
- Fix instruction mappings/listings for various intrinsics

This patch was made by Craig Flores

Differential Revision: https://reviews.llvm.org/D41517

llvm-svn: 327090
2018-03-09 00:38:51 +00:00
Craig Topper 260ed8647a [X86] Fix typo in cpuid.h, bit_AVX51SER->bit_AVX512ER.
llvm-svn: 326807
2018-03-06 16:06:44 +00:00
Alexander Ivchenko 9d3b45301f [x86][CET] Introduce _get_ssp, _inc_ssp intrinsics
Summary:
The _get_ssp intrinsic can be used to retrieve the
shadow stack pointer, independent of the current arch -- in
contract with the rdsspd and the rdsspq intrinsics.
Also, this intrinsic returns zero on CPUs which don't
support CET. The rdssp[d|q] instruction is decoded as nop,
essentially just returning the input operand, which is zero.
Example result of compilation:

```
xorl    %eax, %eax
movl    %eax, %ecx
rdsspq  %rcx         # NOP when CET is not supported
movq    %rcx, %rax   # return zero
```

Reviewers: craig.topper

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D43814

llvm-svn: 326689
2018-03-05 11:30:28 +00:00
Craig Topper 21f66a3f6b [X86] Remove some masked cvt builtins that can be replaced with legacy sse/avx buiiltins and a select.
llvm-svn: 326039
2018-02-24 18:55:13 +00:00
Craig Topper 5dc6ca8e5b [X86] Remove __builtin_ia32_permvarsf256_mask and __builtin_ia32_permvarsi256_mask and use the avx2 unmasked versions and a select instead.
llvm-svn: 326022
2018-02-24 06:46:42 +00:00
Artem Belevich df38f155ec [CUDA] Added missing functions.
Initial commit missed sincos(float), llabs() and few atomics that we
used to pull in from device_functions.hpp, which we no longer include.

Differential Revision: https://reviews.llvm.org/D43602

llvm-svn: 325814
2018-02-22 18:40:52 +00:00
Artem Belevich 4dbea99137 [CUDA] Added missing __threadfence_system() function for CUDA9.
llvm-svn: 325626
2018-02-20 21:25:30 +00:00
Craig Topper 0a70c3c7af [X86] Remove mask from 512 bit pmulhrsw/pmulhw/pmulhuw builtins.
We now use a vselect node in IR around an unmasked builtin. This makes it consistent with the 128 and 256 bit versions.

llvm-svn: 325560
2018-02-20 07:28:18 +00:00
Ekaterina Romanova f28751849e [DOXYGEN] There was a request in the review D41507 to change the notation for hex numbers in doxygen documentation from <...>h to 0x<...>. Both of these notations were used in x86 intrinsics documentation. I promised to change them to 0x<...> for consistency.
Differential Revision: https://reviews.llvm.org/D41888

llvm-svn: 325312
2018-02-16 03:11:35 +00:00
Artem Belevich fbc56a904f [CUDA] Added partial support for CUDA-9.1
Clang can use CUDA-9.1 now, though new APIs (are not implemented yet.

The major change is that headers in CUDA-9.1 went through substantial
changes that started in CUDA-9.0 which required substantial changes
in the cuda compatibility headers provided by clang.

There are two major issues:
* CUDA SDK no longer provides declarations for libdevice functions.
* A lot of device-side functions have become nvcc's builtins and
  CUDA headers no longer contain their implementations.

This patch changes the way CUDA headers are handled if we compile
with CUDA 9.x. Both 9.0 and 9.1 are affected.

* Clang provides its own declarations of libdevice functions.
* For CUDA-9.x clang now provides implementation of device-side
  'standard library' functions using libdevice.

This patch should not affect compilation with CUDA-8. There may be
some observable differences for CUDA-9.0, though they are not expected
to affect functionality.

Tested: CUDA test-suite tests for all supported combinations of:
        CUDA: 7.0,7.5,8.0,9.0,9.1
        GPU: sm_20, sm_35, sm_60, sm_70

Differential Revision: https://reviews.llvm.org/D42513

llvm-svn: 323713
2018-01-30 00:00:12 +00:00
Hiroshi Inoue 1019f8a98e [NFC] fix trivial typos in comments
"to to" -> "to"

llvm-svn: 323627
2018-01-29 05:15:18 +00:00
Craig Topper 8cdb94901d [X86] Add rdpid command line option and intrinsics.
Summary: This patch adds -mrdpid/-mno-rdpid and the rdpid intrinsic. The corresponding LLVM commit has already been made.

Reviewers: RKSimon, spatel, zvi, AndreiGrischenko

Reviewed By: RKSimon

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D42272

llvm-svn: 323047
2018-01-20 18:36:52 +00:00
Abderrazek Zaafrani ce8746d178 [AArch64] Add ARMv8.2-A FP16 scalar intrinsics
https://reviews.llvm.org/D41792

llvm-svn: 323006
2018-01-19 23:11:18 +00:00
Douglas Yung 46474dae4d [DOXYGEN] Fix doxygen and content issues in xmmintrin.h
- Fix inaccurate instruction listings.
- Fix small issues in _mm_getcsr and _mm_setcsr.
- Fix description of NaN handling in comparison intrinsics.
- Fix inaccurate description of _mm_movemask_pi8.
- Fix inaccurate instruction mappings.
- Fix typos.
- Clarify wording on some descriptions.
- Fix bit ranges in return value.
- Fix typo in _mm_move_ms intrinsic instruction since it operates on singe-precision values, not double.
- This patch was made by Craig Flores

Differential Revision: https://reviews.llvm.org/D41523

llvm-svn: 322778
2018-01-17 22:53:15 +00:00
Craig Topper f517f1a516 [X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or
Summary:
kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them as by operation on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this patter and turn it into a concat_vector operation.

I think it makes more sense to keep the IR implementation closer to vector operations on vXi1. Given that we expect these builtins to be used around other builtins that operate on k-registers which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1 leaving only the vector operations.

Reviewers: RKSimon, spatel, zvi, jina.nahias

Reviewed By: RKSimon

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D42016

llvm-svn: 322461
2018-01-14 19:23:50 +00:00
Sven van Haastregt 774355e321 [OpenCL] Reorder the CLK_sRGBx/sRGBA defines, NFC
Swap them so that all channel order defines are ordered according to
their values.

llvm-svn: 322278
2018-01-11 14:05:38 +00:00
Douglas Yung 7ff91421b4 [DOXYGEN] Fix doxygen and content issues in avxintrin.h
- Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed".
- Fix a few typos and errors found during review.
- Restore new line endings.

This patch was made by Craig Flores

llvm-svn: 322027
2018-01-08 21:21:17 +00:00
Douglas Yung 4c549c31bb [DOXYGEN] Fix doxygen and content issues in smmintrin.h
- Fix formatting issue due to hyphenated terms at line breaks.
- Fix typo

This patch was made by Craig Flores

Differential Revision: https://reviews.llvm.org/D41520

llvm-svn: 321671
2018-01-02 20:45:29 +00:00
Douglas Yung df1e9ef156 [DOXYGEN] Fix doxygen and content issues in pmmintrin.h
- Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed".

This patch was made by Craig Flores

Differential Revision: https://reviews.llvm.org/D41518

llvm-svn: 321670
2018-01-02 20:42:53 +00:00
Douglas Yung 0686df106c [DOXYGEN] Fix doxygen and content issues in emmintrin.h
- Fixed innaccurate instruction mappings for various intrinsics.
- Fixed description of NaN handling in comparison intrinsics.
- Unify description of _mm_store_pd1 to match _mm_store1_pd.
- Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed".
- Fix typos.
- Add missing italics command (\a) for params and fixed some parameter spellings.

This patch was made by Craig Flores

Differential Revision: https://reviews.llvm.org/D41516

llvm-svn: 321669
2018-01-02 20:39:29 +00:00
Coby Tayree a09663a5c1 [x86][icelake][vbmi2]
added vbmi2 feature recognition
added intrinsics support for vbmi2 instructions
_mm[128,256,512]_mask[z]_compress_epi[16,32]
_mm[128,256,512]_mask_compressstoreu_epi[16,32]
_mm[128,256,512]_mask[z]_expand_epi[16,32]
_mm[128,256,512]_mask[z]_expandloadu_epi[16,32]
_mm[128,256,512]_mask[z]_sh[l,r]di_epi[16,32,64]
_mm[128,256,512]_mask_sh[l,r]dv_epi[16,32,64]
matching a similar work on the backend (D40206)
Differential Revision: https://reviews.llvm.org/D41557

llvm-svn: 321487
2017-12-27 11:25:07 +00:00
Coby Tayree 3d9c88cfec [x86][icelake][vnni]
added vnni feature recognition
added intrinsics support for VNNI instructions
_mm256_mask_dpbusd_epi32
_mm256_maskz_dpbusd_epi32
_mm256_dpbusd_epi32
_mm256_mask_dpbusds_epi32
_mm256_maskz_dpbusds_epi32
_mm256_dpbusds_epi32
_mm256_mask_dpwssd_epi32
_mm256_maskz_dpwssd_epi32
_mm256_dpwssd_epi32
_mm256_mask_dpwssds_epi32
_mm256_maskz_dpwssds_epi32
_mm256_dpwssds_epi32
_mm128_mask_dpbusd_epi32
_mm128_maskz_dpbusd_epi32
_mm128_dpbusd_epi32
_mm128_mask_dpbusds_epi32
_mm128_maskz_dpbusds_epi32
_mm128_dpbusds_epi32
_mm128_mask_dpwssd_epi32
_mm128_maskz_dpwssd_epi32
_mm128_dpwssd_epi32
_mm128_mask_dpwssds_epi32
_mm128_maskz_dpwssds_epi32
_mm128_dpwssds_epi32
_mm512_mask_dpbusd_epi32
_mm512_maskz_dpbusd_epi32
_mm512_dpbusd_epi32
_mm512_mask_dpbusds_epi32
_mm512_maskz_dpbusds_epi32
_mm512_dpbusds_epi32
_mm512_mask_dpwssd_epi32
_mm512_maskz_dpwssd_epi32
_mm512_dpwssd_epi32
_mm512_mask_dpwssds_epi32
_mm512_maskz_dpwssds_epi32
_mm512_dpwssds_epi32
matching a similar work on the backend (D40208)
Differential Revision: https://reviews.llvm.org/D41558

llvm-svn: 321484
2017-12-27 10:37:51 +00:00
Coby Tayree 2268576fa0 [x86][icelake][bitalg]
added bitalg feature recognition
added intrinsics support for bitalg instructions
_mm512_popcnt_epi16
_mm512_mask_popcnt_epi16
_mm512_maskz_popcnt_epi16
_mm512_popcnt_epi8
_mm512_mask_popcnt_epi8
_mm512_maskz_popcnt_epi8
_mm512_mask_bitshuffle_epi64_mask
_mm512_bitshuffle_epi64_mask
_mm256_popcnt_epi16
_mm256_mask_popcnt_epi16
_mm256_maskz_popcnt_epi16
_mm128_popcnt_epi16
_mm128_mask_popcnt_epi16
_mm128_maskz_popcnt_epi16
_mm256_popcnt_epi8
_mm256_mask_popcnt_epi8
_mm256_maskz_popcnt_epi8
_mm128_popcnt_epi8
_mm128_mask_popcnt_epi8
_mm128_maskz_popcnt_epi8
_mm256_mask_bitshuffle_epi32_mask
_mm256_bitshuffle_epi32_mask
_mm128_mask_bitshuffle_epi16_mask
_mm128_bitshuffle_epi16_mask
matching a similar work on the backend (D40222)
Differential Revision: https://reviews.llvm.org/D41564

llvm-svn: 321483
2017-12-27 10:01:00 +00:00
Coby Tayree cf96c876c6 [x86][icelake][vpclmulqdq]
added vpclmulqdq feature recognition
added intrinsics support for vpclmulqdq instructions
  _mm256_clmulepi64_epi128
  _mm512_clmulepi64_epi128
matching a similar work on the backend (D40101)
Differential Revision: https://reviews.llvm.org/D41573

llvm-svn: 321480
2017-12-27 09:00:31 +00:00
Coby Tayree f4811ebc39 [x86][icelake][gfni]
added gfni feature recognition
added intrinsics support for gfni instructions
  _mm_gf2p8affineinv_epi64_epi8
  _mm_mask_gf2p8affineinv_epi64_epi8
  _mm_maskz_gf2p8affineinv_epi64_epi8
  _mm256_gf2p8affineinv_epi64_epi8
  _mm256_mask_gf2p8affineinv_epi64_epi8
  _mm256_maskz_gf2p8affineinv_epi64_epi8
  _mm512_gf2p8affineinv_epi64_epi8
  _mm512_mask_gf2p8affineinv_epi64_epi8
  _mm512_maskz_gf2p8affineinv_epi64_epi8
  _mm_gf2p8affine_epi64_epi8
  _mm_mask_gf2p8affine_epi64_epi8
  _mm_maskz_gf2p8affine_epi64_epi8
  _mm256_gf2p8affine_epi64_epi8
  _mm256_mask_gf2p8affine_epi64_epi8
  _mm256_maskz_gf2p8affine_epi64_epi8
  _mm512_gf2p8affine_epi64_epi8
  _mm512_mask_gf2p8affine_epi64_epi8
  _mm512_maskz_gf2p8affine_epi64_epi8
  _mm_gf2p8mul_epi8
  _mm_mask_gf2p8mul_epi8
  _mm_maskz_gf2p8mul_epi8
  _mm256_gf2p8mul_epi8
  _mm256_mask_gf2p8mul_epi8
  _mm256_maskz_gf2p8mul_epi8
  _mm512_gf2p8mul_epi8
  _mm512_mask_gf2p8mul_epi8
  _mm512_maskz_gf2p8mul_epi8
matching a similar work on the backend (D40373)
Differential Revision: https://reviews.llvm.org/D41582

llvm-svn: 321477
2017-12-27 08:37:47 +00:00
Coby Tayree a1e5f0c339 [x86][icelake][vaes]
added vaes feature recognition
added intrinsics support for vaes instructions, matching a similar work on the backend (D40078)
  _mm256_aesenc_epi128
  _mm512_aesenc_epi128
  _mm256_aesenclast_epi128
  _mm512_aesenclast_epi128
  _mm256_aesdec_epi128
  _mm512_aesdec_epi128
  _mm256_aesdeclast_epi128
  _mm512_aesdeclast_epi128

llvm-svn: 321474
2017-12-27 08:16:54 +00:00
Artem Belevich 3cebc738b6 [CUDA] More fixes for __shfl_* intrinsics.
* __shfl_{up,down}* uses unsigned int for the third parameter.
* added [unsigned] long overloads for non-sync shuffles.

Differential Revision: https://reviews.llvm.org/D41521

llvm-svn: 321326
2017-12-21 23:52:09 +00:00
Craig Topper 170de4b4ba [X86] Allow _mm_prefetch (both the header implementation and the builtin) to accept bit 2 which is supposed to indicate the prefetched addresses will be written to
Add the appropriate _MM_HINT_ET0/ET1 defines to match gcc.

llvm-svn: 321325
2017-12-21 23:50:22 +00:00
Craig Topper 54b3f718e4 [X86] Add more CPUID bits to cpuid.h to match gcc and support icelake features.
llvm-svn: 321129
2017-12-20 00:46:09 +00:00
Craig Topper 798f2c037c [X86] Add the two files I forgot to commit in r320915.
llvm-svn: 320916
2017-12-16 06:10:24 +00:00
Craig Topper b846d1ff76 [X86] Add builtins and tests for 128 and 256 bit vpopcntdq.
llvm-svn: 320915
2017-12-16 06:02:31 +00:00
Stephan Bergmann feed26ff07 In stdbool.h, define bool, false, true only in gnu++98
GCC has meanwhile corrected that with the similar
<https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=216679> "C++11
explicitly forbids macros for bool, true and false."

Differential Revision: https://reviews.llvm.org/D40167

llvm-svn: 320135
2017-12-08 08:28:08 +00:00
Artem Belevich a659d2590e [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang.
Differential Revision: https://reviews.llvm.org/D40872

llvm-svn: 319909
2017-12-06 17:50:05 +00:00