Commit Graph

980 Commits

Author SHA1 Message Date
Michael Zuckerman bf05a4589e [Clang][AVX512] completing missing intrinsics for [vpabs] instruction set
Differential Revision: http://reviews.llvm.org/D20069

llvm-svn: 269680
2016-05-16 18:57:24 +00:00
Nico Weber 379a1952b3 [ms] Reintroduce feature guards in intrinsic headers in Microsoft mode
Visual Studio's C++ standard library headers include intrin.h, so the intrinsic
headers get included a lot more often in Microsoft mode than elsewhere. The
AVX512 intrinsics are a lot of code (0.7 MB, causing 30% compile time overhead
for small programs including e.g. <string> and 6% compile time overhead for
larger projects like e.g. v8). Since multiversioning can't be relied on in
Microsoft mode (cl.exe doesn't support it), having faster compiles seems like
the much better tradeoff until we have a better intrinsic story going forward
(which we'll need for e.g. PR19898).

Actually using intrinsics on Windows already requires the right /arch:
settings, so this patch should have no big behavior change.

See also thread "The intrinsics headers (especially avx512) are too big. What
to do about it?" on cfe-dev.

http://reviews.llvm.org/D20291

llvm-svn: 269675
2016-05-16 18:14:07 +00:00
Michael Zuckerman cb85677471 [Clang][AVX512] completing missing intrinsics [vsqrt|vrsqrt|vrcp14 ].
Differential Revision: http://reviews.llvm.org/D20068

llvm-svn: 269649
2016-05-16 11:42:01 +00:00
Craig Topper 1aa231e3aa [X86] Add typecasts to remove most assumptions about what __m128i/__m256i is defined as. Add similar typecasts for the fp types as well.
llvm-svn: 269632
2016-05-16 06:38:42 +00:00
Craig Topper 9c6c85f1ad [AVX512] Add typecasts to some intrinsics to avoid doing operations on the __m512/__m512i/__m512d types.
llvm-svn: 269631
2016-05-16 06:38:36 +00:00
Craig Topper 91f23d900f [X86] Remove bad cast from the 'int' return type of __builtin_ia32_kortestchi to '__mask16' before return in an 'int' intrinsic.
llvm-svn: 269621
2016-05-16 01:09:16 +00:00
Craig Topper 7d00d2031d [AVX512] Fix bad typecasts on return value for 512-bit integer byte/word compare builtins.
llvm-svn: 269620
2016-05-16 00:51:06 +00:00
Craig Topper dca1f230ae [AVX512] Add intrinsics for 512-bit insertf32x8/insertf32x4/inserti32x4.
llvm-svn: 269617
2016-05-15 21:26:20 +00:00
Craig Topper 79d05c9b3d [AVX512] Mark some integer builtin arguments that go to immediates in final instructions as an ICE.
llvm-svn: 269613
2016-05-15 20:10:06 +00:00
Craig Topper 9864c59c89 [AVX512] Move unary negations to the left side of typecasts to specific vector type. The __m128/__m256/__m512 types should be treated more opaquely and not have any operations performed on them.
llvm-svn: 269612
2016-05-15 20:10:03 +00:00
Craig Topper f32e2fbe0e [AVX512] Use the correct mask type in an intrinsic.
llvm-svn: 269611
2016-05-15 20:10:00 +00:00
Craig Topper b81d430d3a [AVX512] Fix an intrinsic that was passing -2 as a mask instead of -1.
llvm-svn: 269610
2016-05-15 20:09:58 +00:00
Craig Topper 4537ea74eb [X86] Change most 'void' pointers in builtin type lists to more correct types. Fix some unaligned load/store intrinsics to use a less aligned type in their pointer casts.
llvm-svn: 269552
2016-05-14 06:03:13 +00:00
Michael Zuckerman 13d3c002df [clang][AVX512] completing missing set intrinsics
Differential Revision: http://reviews.llvm.org/D20099

llvm-svn: 269172
2016-05-11 11:41:29 +00:00
Michael Zuckerman 5e2c6b6200 [clang][AVX512] completing missing intrinsics for [vpermt2d|vptestm] instruction set.
Differential Revision: http://reviews.llvm.org/D20096

llvm-svn: 269170
2016-05-11 11:21:18 +00:00
Michael Zuckerman e9e8e573e3 [Clang][AVX512] completing missing intrinsics [load/store]
Differential Revision: http://reviews.llvm.org/D20063

llvm-svn: 269056
2016-05-10 13:13:54 +00:00
Michael Zuckerman de860e5585 [Clang][AVX512] completing missing intrinsics [vmin/vmax]{sd|sq|uq|ud}.
Differential Revision: http://reviews.llvm.org/D20064

llvm-svn: 269042
2016-05-10 11:34:19 +00:00
Michael Zuckerman 2564d2f5fe [Clang][AVX512] completing missing intrinsics [vextractf].
Differential Revision: http://reviews.llvm.org/D20061

llvm-svn: 269037
2016-05-10 10:14:50 +00:00
Michael Zuckerman 7360d8a9cc [Clang][AVX512] completing missing intrinsics [roundscale, ceil, floor]
Differential Revision: http://reviews.llvm.org/D20070

llvm-svn: 269022
2016-05-10 07:30:58 +00:00
Michael Zuckerman f9be3bb1d5 [clang][AVX512] completing missing intrinsics [vmin/vmax].
Differential Revision: http://reviews.llvm.org/D20062

llvm-svn: 268910
2016-05-09 12:38:49 +00:00
Michael Zuckerman f15447537f [Clang][AVX512] completing missing intrinsics [CVT]
Differential Revision: http://reviews.llvm.org/D20056

llvm-svn: 268903
2016-05-09 10:32:51 +00:00
Michael Zuckerman e6f7389b5a [Clang][Builtin][AVX512] Adding intrinsics fot cvt{u}si2s{d|s} cvt{sd|ss}2{ss|sd} instruction set
Differential Revision: http://reviews.llvm.org/D19765

llvm-svn: 268481
2016-05-04 08:55:11 +00:00
Michael Zuckerman c66770313a [clang][AVX512][BuiltIn] Adding intrinsics for cast{pd|ps|si}128_{pd|ps|si}512 and castsi256_si512 instruction set
Differential Revision: http://reviews.llvm.org/D19858

llvm-svn: 268387
2016-05-03 14:26:52 +00:00
Michael Zuckerman e871785eb6 [Clang][avx512][Builtin] Adding intrinsics for cvtw2mask{128|256|512} instruction set
Differential Revision: http://reviews.llvm.org/D19766

llvm-svn: 268385
2016-05-03 14:12:23 +00:00
Michael Zuckerman 8bfb7776e4 [Clang][AVX512][Builtin] Adding intrinsics for vcvt{ph|ps}2{ps|ph} instruction set
Differential Revision: http://reviews.llvm.org/D19767

llvm-svn: 268376
2016-05-03 12:45:04 +00:00
Michael Zuckerman 138fc5b5a8 [Clang][AVX512][Builtin] Adding intrinsics for vcvttpd2udq instruction set
Differential Revision: http://reviews.llvm.org/D19768

llvm-svn: 268373
2016-05-03 11:05:24 +00:00
Michael Zuckerman 708e759b86 [Clang][AVX512][BUILTIN] Adding intrinsics for compressstore{df|di|sf|si} instruction set.
Differential Revision: http://reviews.llvm.org/D19808

llvm-svn: 268372
2016-05-03 10:42:46 +00:00
Michael Zuckerman 5f0e96e56a [CLANG][AVX512][BUILTIN]movap{d|s}{128|256|512}
Differential Revision: http://reviews.llvm.org/D17818

llvm-svn: 268230
2016-05-02 14:02:01 +00:00
Michael Zuckerman d6e68ce75f [Clang][AVX512][BuiltIn] Adding intrinsics for cvtps2pd instruction set
Differential Revision: http://reviews.llvm.org/D19774

llvm-svn: 268217
2016-05-02 09:42:31 +00:00
Michael Zuckerman 6a0e0871db [Clang][avx512][builtin] Adding intrinsics for vexpand{d|q|ps|pd} instrctuon set
Differential Revision: http://reviews.llvm.org/D19467

llvm-svn: 268214
2016-05-02 08:36:41 +00:00
Michael Zuckerman c62f27e3f4 [Clang][BuiltIn][avx512] Adding intrinsics for vpshufd instruction set
Differential Revision: http://reviews.llvm.org/D19580

llvm-svn: 268213
2016-05-02 07:35:27 +00:00
Michael Zuckerman ac1e519944 [clang][Builtin][AVX512] Adding intrinsics for vmovshdup and vmovsldup instruction set
Differential Revision: http://reviews.llvm.org/D19595

llvm-svn: 268196
2016-05-01 14:43:43 +00:00
Michael Zuckerman 0b9d105a16 [clang][BuiltIn][AVX512]Adding intrinsics for cmp{ss|sd} instruction set.
Differential Revision: http://reviews.llvm.org/D19601

llvm-svn: 268028
2016-04-29 11:01:16 +00:00
Michael Zuckerman 41f5a37707 [Clang][AVX512][Builtin] Adding intrinsics for compress instruction set
Differential Revision: http://reviews.llvm.org/D19599

llvm-svn: 268013
2016-04-29 08:52:02 +00:00
Michael Zuckerman de8d3753d3 [clang][AVX512][Builtin] Adding intrinsics for the SAD instruction set.
Differential Revision: http://reviews.llvm.org/D19591

llvm-svn: 267942
2016-04-28 21:21:08 +00:00
Michael Zuckerman 533e065bdc [Clang][BuiltIn][AVX512] Adding intrinsics fot align{d|q} and palignr instruction set
Differential Revision: http://reviews.llvm.org/D19588

llvm-svn: 267876
2016-04-28 12:47:30 +00:00
Michael Zuckerman 514f05543f [Clang][Builtin][AVX512] Adding intrisnics for the vpconflict{q|d} instruction set
Differential Revision: http://reviews.llvm.org/D19525

llvm-svn: 267728
2016-04-27 15:35:13 +00:00
Michael Zuckerman 8c2900f44d [Clang][BuiltIn][AVX512] Adding intrinsics without mask for VBROADCAST and VPBROADCAST instruction set .
Differential Revision: http://reviews.llvm.org/D19196

llvm-svn: 267696
2016-04-27 11:43:14 +00:00
Michael Zuckerman 7c85a8cb46 [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Differential Revision: http://reviews.llvm.org/D19529

llvm-svn: 267690
2016-04-27 10:44:15 +00:00
Ekaterina Romanova a2d72377a1 Updated doxygen comments for intrinsics.
(1) Removed \code.. \endcode tags around the instruction name. This matches the doxygen format for all other intrinsics.
(2) Did a better formatting for the comments (to fit into 80 columns more compactly).

llvm-svn: 267676
2016-04-27 07:14:02 +00:00
Michael Zuckerman fa508e8b6d [Clang][Builtin][AVX512]Adding k-register logic intrinsics KAND, KANDN, KOR, KORTEST, KXNOR, KXOR, KUNPACK instruction set.
Differential Revision: http://reviews.llvm.org/D19466

llvm-svn: 267425
2016-04-25 16:42:29 +00:00
Michael Zuckerman edc82fe3ef [Clang][Builtin][AVX512]Adding intrinsics for vfpclass{sd|ss} vfpclass{pd|ps} instruction set
Differential Revision: http://reviews.llvm.org/D19476

llvm-svn: 267414
2016-04-25 14:48:23 +00:00
Michael Zuckerman fcf32c2f00 [Clang][AVX512][BUILTIN] Adding intrinsics for VSCATTERPF{1|0}{DPS|QPS|DPD|QPD} instruction set
Differential Revision: http://reviews.llvm.org/D19313

llvm-svn: 267398
2016-04-25 13:01:40 +00:00
Michael Zuckerman 8938e836c4 [Clang][AVX512][BuiltIn] Adding support to intrinsics of VPERMD and VPERMW instruction set
Differential Revision: http://reviews.llvm.org/D19195

llvm-svn: 267380
2016-04-25 05:32:35 +00:00
Michael Zuckerman 743d68c3cb [clang][AVX512][Builtin] adding intrinsics for vf{n}madd{ss|sd} and vf{n}sub{ss|sd} instruction set
Differential Revision: http://reviews.llvm.org/D19320

llvm-svn: 267135
2016-04-22 10:56:24 +00:00
Michael Zuckerman a1ceca20b6 [Clang][AVX512][BUILTIN] Adding scalar intrinsics for rsqrt14 ,rcp14, getexp and getmant instruction set
Differential Revision: http://reviews.llvm.org/D19326

llvm-svn: 267129
2016-04-22 10:06:10 +00:00
Artem Belevich c34a519407 [CUDA] removed unneeded __nvvm_reflect_anchor()
Since r265060 LLVM infers correct __nvvm_reflect attributes, so
explicit declaration of __nvvm_reflect() is no longer needed.

Differential Revision: http://reviews.llvm.org/D19074

llvm-svn: 267062
2016-04-21 21:40:27 +00:00
Michael Zuckerman 4fa96af4db [Clang][AVX512][BuiltIn] Adding intrinsics of VGATHER{DPS|DPD} , VPGATHER{QD|QQ|DD|DQ} and VGATHERPF{0|1}{DPS|QPS|DPD|QPD} instruction set .
Differential Revision: http://reviews.llvm.org/D19224

llvm-svn: 266983
2016-04-21 12:47:27 +00:00
Richard Smith e0fa4c83b2 [modules] Make the tweak to avoid circular inclusion of emmintrin.h and
xmmintrin.h a bit more directed. If for whatever reason modules are enabled but
we textually include one of these headers, don't deploy the special case for
modules. To make this work cleanly, extend __building_module to be defined
even when modules is disabled.

llvm-svn: 266945
2016-04-21 01:46:37 +00:00
Michael Zuckerman 6fa512cecf [Clang][Builtin][AVX512] Adding intrinsics for VGETMANT{PD|PS} and VGETEXP{PD|PS} instruction set
Differential Revision: http://reviews.llvm.org/D19197

llvm-svn: 266763
2016-04-19 17:10:29 +00:00
Michael Zuckerman ef2979af50 [Clang][AVX512][BUILTIN] Adding intrinsics support to VEXTRACT{I|F} and VINSERT{I|F} instruction set
Differential Revision: http://reviews.llvm.org/D19097

llvm-svn: 266745
2016-04-19 15:18:23 +00:00
Richard Smith 20d4701b3d [modules] Don't expose *intrin.h headers that cannot be included standalone as
separate modules. These cause build breakage with -fmodules-local-submodule-visibility.

llvm-svn: 266501
2016-04-16 00:46:26 +00:00
Michael Zuckerman 0a3508a8d3 [Clang][AVX512][BUILTIN] Adding support for intrinsics of vpmov{d|q}{b|w|d}{128|256|512} instruction set
Differential Revision: http://reviews.llvm.org/D19055

llvm-svn: 266280
2016-04-14 07:56:51 +00:00
Michael Zuckerman d871531687 [Clang][AVX512][Builtin] Adding intrinsics of vpmovus{d|q}{b|w|d}{128|256|512} instruction set
Differential Revision: http://reviews.llvm.org/D19050

llvm-svn: 266278
2016-04-14 06:48:09 +00:00
Michael Zuckerman e1680617b0 [Clang][AVX512][Builtin] Adding support to intrinsics of pmovs{d|q}{b|w|d}{128|256|512} instruction set
Differential Revision: http://reviews.llvm.org/D19023

llvm-svn: 266202
2016-04-13 15:02:04 +00:00
Michael Zuckerman c2b6128a8f [Clang][AVX512][Builtin] Adding support for VBROADCAST and VPBROADCASTB/W/D/Q instruction set
Differential Revision: http://reviews.llvm.org/D19012

llvm-svn: 266195
2016-04-13 12:58:01 +00:00
Michael Zuckerman 074edd7c1e [Clang][AVX512][Builtin] Adding supporting to intrinsics of cvt{b|d|q}2mask{128|256|512} and cvtmask2{b|d|q}{128|256|512} instruction set.
Differential Revision: http://reviews.llvm.org/D19009

llvm-svn: 266188
2016-04-13 10:49:37 +00:00
Chuang-Yu Cheng 8eac7ae9ad [PPC64][VSX] Add a couple of new data types for vec_vsx_ld and vec_vsx_st intrinsics and fix incorrect testcases with minor refactoring
New added data types:
  vector double vec_vsx_ld (int, const double *);
  vector float vec_vsx_ld (int, const float *);
  vector bool short vec_vsx_ld (int, const vector bool short *);
  vector bool int vec_vsx_ld (int, const vector bool int *);
  vector signed int vec_vsx_ld (int, const signed int *);
  vector unsigned int vec_vsx_ld (int, const unsigned int *);

  void vec_vsx_st (vector double, int, double *);
  void vec_vsx_st (vector float, int, float *);
  void vec_vsx_st (vector bool short, int, vector bool short *);
  void vec_vsx_st (vector bool short, int, signed short *);
  void vec_vsx_st (vector bool short, int, unsigned short *);
  void vec_vsx_st (vector bool int, int, vector bool int *);
  void vec_vsx_st (vector bool int, int, signed int *);
  void vec_vsx_st (vector bool int, int, unsigned int *);

Also fix testcases which use non-vector argument version of vec_vsx_ld or
vec_vsx_st, but pass incorrect parameter.

llvm-svn: 266166
2016-04-13 05:16:31 +00:00
Eric Christopher d5c75eed44 Add a couple of missing vsx load and store intrinsics.
Patch by Jing Yu!

llvm-svn: 266122
2016-04-12 21:08:54 +00:00
Michael Zuckerman 04fb3bc682 [Clang][BuiltIn][avx512] Adding avx512 (shuf,sqrt{ss|sd},rsqrt ) builtin to clang
llvm-svn: 266048
2016-04-12 07:59:39 +00:00
Michael Zuckerman 81f468c859 [Clang][AVX512][BuiltIn] Adding avx512 ( psll{d|q}512,psllv{16si|8di},psra{d|q}512,psrav{16si|8di},pternlog{d|q}{128|256|512} ) builtin to clang
Differential Revision: http://reviews.llvm.org/D18926

llvm-svn: 265964
2016-04-11 17:04:21 +00:00
Michael Zuckerman 6b5f4d8ad1 [CLANG] [AVX512] [BUILTIN] Adding PSRA{Q|D|QI|DI}{128|256|512} builtin
Differential Revision: http://reviews.llvm.org/D17693

llvm-svn: 265952
2016-04-11 15:46:39 +00:00
Michael Zuckerman 1af947a7b3 [Clang][AVX512][BuiltIn] Adding avx512 ( punpck{h|l}{dq|qdq}{128|256|512},rndscale{ss|sd}, {scalef{ss|sd|pd512|ps512} ) builtin to clang
Differential Revision: http://reviews.llvm.org/D18929

llvm-svn: 265935
2016-04-11 12:32:31 +00:00
Michael Zuckerman 07525091e6 [Clang][AVX512][BuiltIn] Adding avx512 ( ptest{n}m{b|w}{128|256|512} ) builtin to clang
Differential Revision: http://reviews.llvm.org/D18924

llvm-svn: 265928
2016-04-11 10:22:07 +00:00
Michael Zuckerman d8d2f62107 [Clang][AVX512][BuiltIn] Adding avx512 ( vperm{i|t}2var, vpermil{var}{ps|pd}{256|512} ) builtin to clang.
Differential Revision: http://reviews.llvm.org/D18933

llvm-svn: 265915
2016-04-11 07:15:34 +00:00
Michael Zuckerman 8d16199b7b [Clang][AVX512][BuiltIn] Adding avx512 ( vcvt ) builtin to clang
Differential Revision: http://reviews.llvm.org/D18932

llvm-svn: 265904
2016-04-10 17:24:03 +00:00
Michael Zuckerman cdd54c83d8 Adding avx512 (unpck{h|l}{pd|ps}, rcp14{pd|ps}{128|256},vplzcnt{d|q} ) builtin to clang
Differential Revision: http://reviews.llvm.org/D18931

llvm-svn: 265896
2016-04-10 12:54:23 +00:00
Michael Zuckerman fa7ccc5bcf [Clang][AVX512][BuiltIn] Adding avx512 ( store ) builtin to clang
Differential Revision: http://reviews.llvm.org/D18925

llvm-svn: 265895
2016-04-10 10:51:04 +00:00
Ekaterina Romanova f2ed62027d Add doxygen comments to emmintrin.h's intrinsics. Only around 25% of the intrinsics in this file are documented now. The patches for the rest of the intrisics in this file will be send out later.
The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Paul Robinson.

llvm-svn: 265844
2016-04-08 20:45:48 +00:00
Justin Lebar 25c36fd61b [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9.
Summary:
See comments in patch; we were assuming that some stdlib math functions
would be defined in namespace std, when in fact the spec says they
should be defined in the global namespace.  libstdc++4.9 became more
conforming and broke us.

This new implementation seems to cover the known knowns.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: http://reviews.llvm.org/D18882

llvm-svn: 265751
2016-04-07 23:55:53 +00:00
Michael Zuckerman 5ae71243c2 Fixing duplicate declaration "_mm256 _mm_set_epi32" in revision 262177
Differential Revision: http://reviews.llvm.org/D17685

llvm-svn: 265677
2016-04-07 14:44:08 +00:00
Yunzhong Gao c293a2688d Add copyright notice to the modulemap file.
The module.modulemap file in the lib/Headers directory was missing the LLVM
copyright notice. This patch adds the copyright notice just like the rest of
the files in this directory.

Differential Revision: http://reviews.llvm.org/D18709

llvm-svn: 265325
2016-04-04 18:46:09 +00:00
Justin Lebar cb28f15fbc [CUDA] Fix typo in __clang_cuda_runtime_wrapper.h.
We're #including the wrong file!

llvm-svn: 265083
2016-04-01 00:25:42 +00:00
Justin Lebar 0cda764430 [CUDA] Add math forward declares to CUDA header wrapper.
Summary:
This is necessary for a future patch which will make all constexpr
functions implicitly host+device.  cmath may declare constexpr
functions, but these we do *not* want to be host+device.  The forward
declares added in this patch prevent this (because the rule will be,
constexpr functions become implicitly host+device unless they're
preceeded by a decl with __device__).

Reviewers: tra

Subscribers: cfe-commits, rnk, rsmith

Differential Revision: http://reviews.llvm.org/D18539

llvm-svn: 264963
2016-03-30 23:30:14 +00:00
Justin Lebar 50e5f184d8 [CUDA] Add missing #undef __DEVICE__ to CUDA shim header.
llvm-svn: 264742
2016-03-29 16:24:23 +00:00
Michael Zuckerman def78750b7 [CLANG][avx512][BUILTIN] Adding fixupimm{pd|ps|sd|ss}
getexp{sd|ss} getmant{sd|ss} kunpck{di|si} loada{pd|ps} loaddqu{di|hi|qi|si} max{sd|ss} min{sd|ss} kmov16 builtins to clang


Differential Revision: http://reviews.llvm.org/D18215

llvm-svn: 264574
2016-03-28 12:23:09 +00:00
Justin Lebar 334535132f [CUDA] Don't define __NVCC__.
Summary:
We decided this makes life too difficult for code authors.  For example,
people may want to detect NVCC and disable variadic templates, which
NVCC does not support, but which we do.

Since people are going to have to change compiler flags *anyway* in
order to compile with clang, if they really want the old behavior, they
can pass -D__NVCC__.

Tested with tensorflow and thrust, no apparent problems.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18417

llvm-svn: 264205
2016-03-23 22:42:27 +00:00
John Thompson debce24c90 D18325: Added mm_malloc module export.
llvm-svn: 264092
2016-03-22 20:57:51 +00:00
Daniel Jasper be50836514 Make functions in altivec.h be __inline__. As they are all also marked
__always_inline__, this has likely been meant from the start.

Review: http://reviews.llvm.org/D18015
llvm-svn: 263302
2016-03-11 22:13:28 +00:00
Ekaterina Romanova 13f189da86 Add doxygen comments to avxintrin.h's intrinsics.
Only around 25% of the intrinsics in this file are documented here. The patches for the other half will be sent out later.

The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.

llvm-svn: 263175
2016-03-11 00:05:54 +00:00
Ekaterina Romanova e2961f71d2 Add doxygen comments to xmmintrin.h's intrinsics.
Only half of the intrinsics in this file is documented here. The patch for the other half will be sent out later.

The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.

llvm-svn: 263098
2016-03-10 09:37:04 +00:00
Kit Barton fbab158767 [PPC] FE support for generating VSX [negated] absolute value instructions
Includes new built-in, conversion of built-in to target-independent intrinsic
and update in the header file. Tests are also updated. There is a second part in
the backend for which I will post a separate code-review. BACKEND PART SHOULD BE
COMMITTED FIRST.

Phabricator: http://reviews.llvm.org/D17816
llvm-svn: 263051
2016-03-09 19:28:31 +00:00
Michael Zuckerman 10d6f9ac04 Fixing wrong header title name.
Differential Revision: http://reviews.llvm.org/D17917

llvm-svn: 263007
2016-03-09 11:26:45 +00:00
Ekaterina Romanova c8976d58fe Add doxygen comments to bmiintrin.h's intrinsics.
The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.

llvm-svn: 262895
2016-03-08 01:36:59 +00:00
Michael Zuckerman e71d59fc4f [CLANG][AVX512][BUILTIN] Add builtin vcomi{ss|sd}
Differential Revision: http://reviews.llvm.org/D17919

llvm-svn: 262847
2016-03-07 19:15:00 +00:00
Michael Zuckerman 9f33848f04 [CLANG][AVX512][BUILTIN] Adding new feature flag headed files and new BUILTIN vpermi2varq{i|t}{128|256|512}{mask|maskz}
Differential Revision: http://reviews.llvm.org/D17917

llvm-svn: 262834
2016-03-07 17:04:11 +00:00
Michael Zuckerman 0190c65571 [CLANG][AVX512][BUILTIN] Adding new feature flag header file and new builtin vpmadd52{h|l}uq{128|256|512}{mask|maskz}
Differential Revision: http://reviews.llvm.org/D17915

llvm-svn: 262820
2016-03-07 09:55:55 +00:00
Michael Zuckerman 912be16a0e [CLANG][AVX512][BUILTIN] Adding vpmultishiftqb{128|256|512}
Differential Revision: http://reviews.llvm.org/D17914

llvm-svn: 262817
2016-03-07 08:29:10 +00:00
Michael Zuckerman 0d67e4b5d6 [CLANG][AVX512][BUILTIN] movddup{128|256|512}
Differential Revision: http://reviews.llvm.org/D17826

llvm-svn: 262617
2016-03-03 13:43:05 +00:00
Michael Zuckerman 1ad03e7f01 [CLANG][AVX512][BUILTIN] movdqu{qi|hi} {128|256|512}
Differential Revision: http://reviews.llvm.org/D17814

llvm-svn: 262609
2016-03-03 11:34:52 +00:00
Michael Zuckerman ffbb67a8e2 [CLANG][AVX512][BUILTIN] movdqa{32|64}{load|store|}{128|256|512}
Differential Revision: http://reviews.llvm.org/D17812

llvm-svn: 262598
2016-03-03 09:26:01 +00:00
Michael Zuckerman abbe34bce6 [Clang][AVX512][BUILTIN] Adding PSRL{W|WI}{128|256|512}
Differential Revision: http://reviews.llvm.org/D17754

llvm-svn: 262593
2016-03-03 08:55:20 +00:00
Ekaterina Romanova 4711441e52 This patch adds doxygen comments for all the intrinsincs in the header file tmmintrin.h.
The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.

llvm-svn: 262565
2016-03-03 00:20:11 +00:00
Michael Zuckerman 3df95e711f [CLANG] [AVX512] [BUILTIN] Adding PSRA{W|WI}{128|256|512}.
Differential Revision: http://reviews.llvm.org/D17706

llvm-svn: 262481
2016-03-02 12:06:06 +00:00
Michael Zuckerman d15c95a793 [CLANG] [AVX512] [BUILTIN] Adding PSRAV
Differential Revision: http://reviews.llvm.org/D17699

llvm-svn: 262471
2016-03-02 09:05:46 +00:00
Ekaterina Romanova c207006bbb This patch adds doxygen comments for the intrinsincs in the header file popcntintrin.h.
The doxygen comments are automatically generated based on Sony's intrinsics documentation.

Differential Revision: http://reviews.llvm.org/D17550 

llvm-svn: 262385
2016-03-01 20:04:57 +00:00
Kit Barton 2b36b15834 [PPC64][VSX] Add short, char, and bool data type for vec_vsx_ld and vec_vsx_st intrinsics
Issue: https://llvm.org/bugs/show_bug.cgi?id=26720

Fix compile error when building ffmpeg for PowerPC64LE because of some
vec_vsx_ld/vec_vsx_st intrinsics are not supported by current clang.

New added intrinsics:

(vector) {signed|unsigned} {short|char} vec_vsx_ld: (total: 8)
bool vec_vsx_ld: (total: 1)
(vector) {signed|unsigned} {short|char} vec_vsx_st: (total: 8)
bool vec_vsx_st: (total: 1)
Total: 18 intrinsics

Phabricator: http://reviews.llvm.org/D17637
llvm-svn: 262359
2016-03-01 18:11:28 +00:00
Michael Zuckerman d176d744af [CLANG][AVX512][BUILTIN] Adding PSRL{DI|QI}{128|256|512} builtin
Differential Revision: http://reviews.llvm.org/D17714

llvm-svn: 262355
2016-03-01 17:49:03 +00:00
Michael Zuckerman 0165e7669c [CLANG][AVX512][BUILTIN] Adding PSRLV builtin
Differential Revision: http://reviews.llvm.org/D17718

llvm-svn: 262326
2016-03-01 13:03:45 +00:00
Michael Zuckerman 1ac360cca4 [CLANG] [AVX512] [BUILTIN] Adding PSRA{Q|D|QI|DI}{128|256|512} builtin
Differential Revision: http://reviews.llvm.org/D17693

llvm-svn: 262321
2016-03-01 11:38:16 +00:00
Logan Chien 3267ca225d Add ARM EHABI-related constants to unwind.h.
Adds a number of constants, defined in the ARM EHABI spec, to the Clang
lib/Headers/unwind.h header. This is prerequisite for landing
http://reviews.llvm.org/D15781, as previously discussed there.

Patch by Timon Van Overveldt.

llvm-svn: 262178
2016-02-28 15:01:42 +00:00
Michael Zuckerman 431b0e18b4 [CLANG] [AVX512] [BUILTIN] Adding PSLL{V|W|Wi}{128|256|512} builtin
Differential Revision: http://reviews.llvm.org/D17685

llvm-svn: 262177
2016-02-28 07:39:34 +00:00
Chris Bieneman 2c6c01a4fc [CMake] Fixing install-clang-headers dependencies to depend on generating the headers.
llvm-svn: 261911
2016-02-25 18:39:19 +00:00
Justin Lebar d7a35492ad [CUDA] Add conversion operators for threadIdx, blockIdx, gridDim, and blockDim to uint3 and dim3.
Summary:
This lets you write, e.g.

  uint3 a = threadIdx;
  uint3 b = blockIdx;
  dim3 c = gridDim;
  dim3 d = blockDim;

which is legal in nvcc, but was not legal in clang.

The fact that e.g. the type of threadIdx is not actually uint3 is still
observable, but now you have to try to observe it.

Reviewers: tra

Subscribers: echristo, cfe-commits

Differential Revision: http://reviews.llvm.org/D17561

llvm-svn: 261777
2016-02-24 21:49:33 +00:00
Justin Lebar c8dae5378b [CUDA] Add hack so code which includes "curand.h" doesn't break.
Summary:
curand.h includes curand_mtgp32_kernel.h.  In host mode, this header
redefines threadIdx and blockDim, giving them their "proper" types of
uint3 and dim3, respectively.

clang has its own plan for these variables -- their types are magic
builtin classes.  So these redefinitions are incompatible.

As a hack, we force-include the offending CUDA header and use #defines
to get the right types for threadIdx and blockDim.

Reviewers: tra

Subscribers: echristo, cfe-commits

Differential Revision: http://reviews.llvm.org/D17562

llvm-svn: 261776
2016-02-24 21:49:31 +00:00
Michael Zuckerman 6c317515e4 [CLANG] [AVX512] [BUILTIN] Adding PSHUF{L|H}W{128|256|512} builtin to clang .
Differential Revision: http://reviews.llvm.org/D17539

llvm-svn: 261755
2016-02-24 17:39:35 +00:00
Michael Zuckerman e98cc7477f [CLANG] [AVX512] [BUILTIN] Adding prorv{d|q}{128|256|512} builtin to clang
Differential Revision: http://reviews.llvm.org/D17512

llvm-svn: 261641
2016-02-23 15:59:47 +00:00
Michael Zuckerman 4924c7a2b5 [CLANG] [AVX512] [BUILTIN] Adding pro{lv|r}{d|q}{128|256|512} builtin to clang
Adding closer to the end of macro }->}) 

Differential Revision: http://reviews.llvm.org/D17506

llvm-svn: 261638
2016-02-23 14:23:53 +00:00
Michael Zuckerman 0231f1649b [CLANG] [AVX512] [BUILTIN] Adding pro{lv|r}{d|q}{128|256|512} builtin to clang
Differential Revision: http://reviews.llvm.org/D17506

llvm-svn: 261635
2016-02-23 13:41:13 +00:00
Michael Zuckerman 477e0a326b [CLANG] [AVX512] [BUILTIN] Adding prol{d|q|w}{128|256|512} builtin to clang .
Fixing problem with the lib/include/avx512vlintrin.h file. 
Adding one more _ to the prefix of _extension__ -> __extension__.

Differential Revision: http://reviews.llvm.org/D16985

llvm-svn: 261518
2016-02-22 09:42:57 +00:00
Michael Zuckerman 38a2727764 [CLANG] [AVX512] [BUILTIN] Adding prol{d|q|w}{128|256|512} builtin to clang .
Differential Revision: http://reviews.llvm.org/D16985

llvm-svn: 261516
2016-02-22 09:05:41 +00:00
Michael Zuckerman 7a33dce4ef [CLANG] [AVX512] [BUILTIN] Adding pmovzx{b|d|w}{w|d|q}{128|256|512} builtin to clang
Differential Revision: http://reviews.llvm.org/D16961

llvm-svn: 261471
2016-02-21 14:00:11 +00:00
David Majnemer 7a0d7d6be9 Remove a duplicate declaration specifier from _ReadBarrier
This fixes PR26675.

llvm-svn: 261388
2016-02-20 00:57:00 +00:00
Michael Zuckerman 7cdb72f7ea [CLANG] [AVX512] [BUILTIN] Adding pmovsx{b|d|w}{w|d|q}{128|256|512} builtin to clang
Differential Revision: http://reviews.llvm.org/D16955

llvm-svn: 261196
2016-02-18 09:09:34 +00:00
Artem Belevich 7f522b7876 Added missing '__'.
llvm-svn: 260719
2016-02-12 20:26:43 +00:00
Eric Christopher 39a84d0b9b Update functions in clang supplied headers to use the compiler reserved
namespace for arguments.

llvm-svn: 260647
2016-02-12 02:22:53 +00:00
Richard Smith 66a7385e27 <float.h>: do not define DECIMAL_DIG in -std=c89 mode; this macro was added in C99.
Patch by Jorge Teixeira!

llvm-svn: 260639
2016-02-12 01:15:33 +00:00
Eric Christopher 0466c7ce23 Use __ before argument names in provided headers.
llvm-svn: 260631
2016-02-12 00:32:23 +00:00
Richard Smith b473e1e473 In C11, provide macros FLT_DECIMAL_DIG, DBL_DECIMAL_DIG, and LDBL_DECIMAL_DIG in <float.h>.
Patch by Jorge Teixeira!

llvm-svn: 260577
2016-02-11 19:57:37 +00:00
Ekaterina Romanova a61946d551 This patch adds doxygen comments for all the intrinsincs in the header file f16cintrin.h. The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D17021

llvm-svn: 260333
2016-02-10 00:12:24 +00:00
Ekaterina Romanova d416747803 This patch adds doxygen comments for all the intrinsincs in the header file pmmintrin.h. The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D16913

llvm-svn: 260160
2016-02-08 22:35:09 +00:00
Igor Breger 9c2a0bfa13 AVX512: Change builtin function name for scalar intrinsics. Add "mask" to function name to reflect the function behavior.
Differential Revision: http://reviews.llvm.org/D16957

llvm-svn: 260088
2016-02-08 12:36:48 +00:00
Artem Belevich 2aad2b3500 [CUDA] Bug 26497 : Remove wrappers for variants provided by CUDA headers.
... and pull global-scope ones into std namespace with using-declaration.

Differential Revision: http://reviews.llvm.org/D16932

llvm-svn: 259944
2016-02-05 22:54:05 +00:00
Artem Belevich 7b660e2604 [CUDA] added declarations for device-side system calls
...and std:: wrappers for free/malloc.

llvm-svn: 259690
2016-02-03 20:53:58 +00:00
Ekaterina Romanova 0e19cf2dd8 This patch adds doxygen comments for the intrinsincs in the header file __wmmintrin_aes.h.
The doxygen comments are automatically generated based on Sony's intrinsics document.

Differential Revision: http://reviews.llvm.org/D16562

llvm-svn: 259275
2016-01-29 23:59:00 +00:00
Ekaterina Romanova deec50a3d2 This patch adds doxygen comments for the intrinsincs in the header file __wmmintrin_pclmul.h. The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D15999

llvm-svn: 259239
2016-01-29 20:37:14 +00:00
Artem Belevich c5f41a34e5 [CUDA] Implemented device-side support functions in <cmath>.
CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In few
cases we use clang builtins.

Tested out-of tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing

Differential Revision: http://reviews.llvm.org/D16593

llvm-svn: 258880
2016-01-26 23:37:29 +00:00
Chris Bieneman 2bf68c6c1c Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

    "This is the way [autoconf] ends
    Not with a bang but a whimper."
    -T.S. Eliot

Reviewers: chandlerc, grosbach, bob.wilson, echristo

Subscribers: klimek, cfe-commits

Differential Revision: http://reviews.llvm.org/D16472

llvm-svn: 258862
2016-01-26 21:30:40 +00:00
Justin Lebar 3039a593db [CUDA] Make printf work.
Summary:
The code in CGCUDACall is largely based on a patch written by Eli
Bendersky:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html

That patch implemented an LLVM pass lowering printf to vprintf; this
one does something similar, but in Clang codegen.

Reviewers: echristo

Subscribers: cfe-commits, jhen, tra, majnemer

Differential Revision: http://reviews.llvm.org/D16372

llvm-svn: 258642
2016-01-23 21:28:14 +00:00
Ekaterina Romanova 08d1f2431d 2 missing intrinsics _cvtss_sh and _mm_cvtps_ph were added to the intrinsics header f16intrin.h
Differential Revision: http://reviews.llvm.org/D16177

llvm-svn: 258492
2016-01-22 06:50:50 +00:00
Adam Nemet e708747129 [AVX512] Fix typo in r226298
Hal noticed that the double/float got mixed up on the parameters for
these.

llvm-svn: 258108
2016-01-19 02:02:25 +00:00
Kyle Butt 436ff85b63 [PPC] Add long long/double support for vec_cts, vec_ctu and vec_ctf
Add long long/double support for vec_cts, vec_ctu and vec_ctf.

Similar to this change in GCC:
https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02653.html

Patch by Tim Shen.

llvm-svn: 257135
2016-01-08 02:00:48 +00:00
David Majnemer 30f9bfd574 Reimplement __readeflags and __writeeflags on top of intrinsics
Lean on LLVM to provide this functionality now that it provides the
necessary intrinsics.

llvm-svn: 256686
2016-01-01 06:50:08 +00:00
Asaf Badouh a9d1e18f48 [X86][PKU] add clang intrinsic for {RD|WR}PKRU
Differential Revision: http://reviews.llvm.org/D15837

llvm-svn: 256672
2015-12-31 14:14:07 +00:00
Eric Christopher 7f7d9bea6f Fix up comment in header.
llvm-svn: 256508
2015-12-28 19:07:46 +00:00
Michael Kuperstein 591278c08d [X86] Add missing m64/int64 conversions
Define the 64-bit equivalents of _m_to_int and _m_from_int.

Differential Revision: http://reviews.llvm.org/D15572

llvm-svn: 256122
2015-12-20 12:37:18 +00:00
Michael Kuperstein beae026738 [X86] Add signed aliases for popcnt intrinsics
The Intel manual documents both an unsigned form (_mm_popcnt_u32)
and a signed form (_popcnt32) of the intrinsic. Add the missing signed form.

Differential Revision: http://reviews.llvm.org/D15568

llvm-svn: 256121
2015-12-20 12:35:35 +00:00
Artem Belevich 8e9ba042a6 [CUDA] runtime wrapper header tweaks
* Pull in host-only implementations of few CUDA-specific math functions.
* #nclude <cmath> early to prevent its inclusion from CUDA headers after
  they've messed with __THROW macro.

llvm-svn: 255933
2015-12-17 22:25:22 +00:00
Artem Belevich 7fda3c9ff3 [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h
Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to compiler which leads to
compiler including real cuda_runtime.h from there instead
of the wrapper we need.

Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.

Differential Revision: http://reviews.llvm.org/D15534

llvm-svn: 255802
2015-12-16 18:51:59 +00:00
Asaf Badouh 5e4248b4e0 [x86][avx512] more changes in intrinsics to be align with gcc format
Differential Revision: http://reviews.llvm.org/D15328

llvm-svn: 255012
2015-12-08 12:34:38 +00:00
Asaf Badouh 3e5111e313 [avx512] rename gcc intrinsics to be align with gcc format
rename the gcc intrinsics suffix : _mask ->_round

Differential Revision: http://reviews.llvm.org/D15284

llvm-svn: 254906
2015-12-07 13:14:22 +00:00
Paul Robinson 941bc91518 Move _mm256_cvtps_ph and _mm256_cvtph_ps to immintrin.h.
This more closely matches their locations as described by Intel
documentation, and lets us remove a pair of redundant typedefs.

Differential Revision: http://reviews.llvm.org/D15127

llvm-svn: 254528
2015-12-02 18:41:52 +00:00
Craig Topper 5ec97a7b9b [X86] Improve codegen for AVX2 gather with an all 1s mask.
Use undefined instead of setzero as the pass through input since its going to be fully overwritten. Use cmpeq of two zero vectors to produce the all 1s vector. Casting -1 to a double and vectorizing causes a constant load of a -1.0 floating point value.

llvm-svn: 254389
2015-12-01 07:12:59 +00:00
Craig Topper e20b8c68ed [X86] _mm256_permutevar8x32_ps should take an integer vector for its shuffle index input.
llvm-svn: 254270
2015-11-29 22:53:32 +00:00
Craig Topper 3a71f35a67 [X86] Remove temporary variables from intrinsic macros. NFC
llvm-svn: 254247
2015-11-29 06:50:33 +00:00
Argyrios Kyrtzidis dcb5653516 [CMake] Add a specific 'install-clang-headers' target.
llvm-svn: 253636
2015-11-20 02:24:03 +00:00
Artem Belevich c29db84419 [CUDA] Added a wrapper header for inclusion of stock CUDA headers.
Header files that come with CUDA are assuming split host/device
compilation and are not usable by clang out of the box.
With a bit of preprocessor magic it's possible to twist them
into something clang can use.

This wrapper always includes CUDA headers exactly the same way during
host and device compilation passes and produces identical preprocessed
content during host and device side compilation for sm_35 GPUs. Device
compilation passes for older GPUs will see a smaller subset of device
functions supported by particular GPU.

The wrapper assumes specific contents of CUDA header files and works
only with CUDA 7.0 and 7.5.

Differential Revision: http://reviews.llvm.org/D13171

llvm-svn: 253388
2015-11-17 22:28:52 +00:00
Hans Wennborg 1acf955a6a bmiintrin.h: Allow using the tzcnt intrinsics for non-BMI targets
The tzcnt intrinsics are used non non-BMI targets by code (e.g. ffmpeg)
that uses it as a potentially faster BSF.

The TZCNT instruction is special in that it's encoded in a
backward-compatible way and behaves as BSF on non-BMI targets.

Differential Revision: http://reviews.llvm.org/D14748

llvm-svn: 253358
2015-11-17 18:46:48 +00:00
Oliver Stannard 7aa90f5735 [ARM,AArch64] Fix __rev16l and __rev16ll intrinsics
These two intrinsics are defined in arm_acle.h.

__rev16l needs to rotate by 16 bits, bit it was actually rotating by 2 bits.
For AArch64, where long is 64 bits, this would still be wrong.

__rev16ll was incorrect, it reversed the bytes in each 32-bit word, rather than
each 16-bit halfword. The correct implementation is to apply __rev16 to the top
and bottom words of the 64-bit value.

For AArch32 targets, these get compiled down to the hardware rev16 instruction
at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
32-bit rev16 instructions, because there is not currently a pattern for the
64-bit rev16 instruction.

Differential Revision: http://reviews.llvm.org/D14609

llvm-svn: 253211
2015-11-16 14:58:50 +00:00
Craig Topper fb79b5f273 [X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause.
llvm-svn: 252712
2015-11-11 08:13:33 +00:00