llvm-project

Commit Graph

Author	SHA1	Message	Date
Coby Tayree	3d9c88cfec	[x86][icelake][vnni] added vnni feature recognition added intrinsics support for VNNI instructions _mm256_mask_dpbusd_epi32 _mm256_maskz_dpbusd_epi32 _mm256_dpbusd_epi32 _mm256_mask_dpbusds_epi32 _mm256_maskz_dpbusds_epi32 _mm256_dpbusds_epi32 _mm256_mask_dpwssd_epi32 _mm256_maskz_dpwssd_epi32 _mm256_dpwssd_epi32 _mm256_mask_dpwssds_epi32 _mm256_maskz_dpwssds_epi32 _mm256_dpwssds_epi32 _mm128_mask_dpbusd_epi32 _mm128_maskz_dpbusd_epi32 _mm128_dpbusd_epi32 _mm128_mask_dpbusds_epi32 _mm128_maskz_dpbusds_epi32 _mm128_dpbusds_epi32 _mm128_mask_dpwssd_epi32 _mm128_maskz_dpwssd_epi32 _mm128_dpwssd_epi32 _mm128_mask_dpwssds_epi32 _mm128_maskz_dpwssds_epi32 _mm128_dpwssds_epi32 _mm512_mask_dpbusd_epi32 _mm512_maskz_dpbusd_epi32 _mm512_dpbusd_epi32 _mm512_mask_dpbusds_epi32 _mm512_maskz_dpbusds_epi32 _mm512_dpbusds_epi32 _mm512_mask_dpwssd_epi32 _mm512_maskz_dpwssd_epi32 _mm512_dpwssd_epi32 _mm512_mask_dpwssds_epi32 _mm512_maskz_dpwssds_epi32 _mm512_dpwssds_epi32 matching a similar work on the backend (D40208) Differential Revision: https://reviews.llvm.org/D41558 llvm-svn: 321484	2017-12-27 10:37:51 +00:00
Coby Tayree	2268576fa0	[x86][icelake][bitalg] added bitalg feature recognition added intrinsics support for bitalg instructions _mm512_popcnt_epi16 _mm512_mask_popcnt_epi16 _mm512_maskz_popcnt_epi16 _mm512_popcnt_epi8 _mm512_mask_popcnt_epi8 _mm512_maskz_popcnt_epi8 _mm512_mask_bitshuffle_epi64_mask _mm512_bitshuffle_epi64_mask _mm256_popcnt_epi16 _mm256_mask_popcnt_epi16 _mm256_maskz_popcnt_epi16 _mm128_popcnt_epi16 _mm128_mask_popcnt_epi16 _mm128_maskz_popcnt_epi16 _mm256_popcnt_epi8 _mm256_mask_popcnt_epi8 _mm256_maskz_popcnt_epi8 _mm128_popcnt_epi8 _mm128_mask_popcnt_epi8 _mm128_maskz_popcnt_epi8 _mm256_mask_bitshuffle_epi32_mask _mm256_bitshuffle_epi32_mask _mm128_mask_bitshuffle_epi16_mask _mm128_bitshuffle_epi16_mask matching a similar work on the backend (D40222) Differential Revision: https://reviews.llvm.org/D41564 llvm-svn: 321483	2017-12-27 10:01:00 +00:00
Coby Tayree	cf96c876c6	[x86][icelake][vpclmulqdq] added vpclmulqdq feature recognition added intrinsics support for vpclmulqdq instructions _mm256_clmulepi64_epi128 _mm512_clmulepi64_epi128 matching a similar work on the backend (D40101) Differential Revision: https://reviews.llvm.org/D41573 llvm-svn: 321480	2017-12-27 09:00:31 +00:00
Coby Tayree	f4811ebc39	[x86][icelake][gfni] added gfni feature recognition added intrinsics support for gfni instructions _mm_gf2p8affineinv_epi64_epi8 _mm_mask_gf2p8affineinv_epi64_epi8 _mm_maskz_gf2p8affineinv_epi64_epi8 _mm256_gf2p8affineinv_epi64_epi8 _mm256_mask_gf2p8affineinv_epi64_epi8 _mm256_maskz_gf2p8affineinv_epi64_epi8 _mm512_gf2p8affineinv_epi64_epi8 _mm512_mask_gf2p8affineinv_epi64_epi8 _mm512_maskz_gf2p8affineinv_epi64_epi8 _mm_gf2p8affine_epi64_epi8 _mm_mask_gf2p8affine_epi64_epi8 _mm_maskz_gf2p8affine_epi64_epi8 _mm256_gf2p8affine_epi64_epi8 _mm256_mask_gf2p8affine_epi64_epi8 _mm256_maskz_gf2p8affine_epi64_epi8 _mm512_gf2p8affine_epi64_epi8 _mm512_mask_gf2p8affine_epi64_epi8 _mm512_maskz_gf2p8affine_epi64_epi8 _mm_gf2p8mul_epi8 _mm_mask_gf2p8mul_epi8 _mm_maskz_gf2p8mul_epi8 _mm256_gf2p8mul_epi8 _mm256_mask_gf2p8mul_epi8 _mm256_maskz_gf2p8mul_epi8 _mm512_gf2p8mul_epi8 _mm512_mask_gf2p8mul_epi8 _mm512_maskz_gf2p8mul_epi8 matching a similar work on the backend (D40373) Differential Revision: https://reviews.llvm.org/D41582 llvm-svn: 321477	2017-12-27 08:37:47 +00:00
Coby Tayree	a1e5f0c339	[x86][icelake][vaes] added vaes feature recognition added intrinsics support for vaes instructions, matching a similar work on the backend (D40078) _mm256_aesenc_epi128 _mm512_aesenc_epi128 _mm256_aesenclast_epi128 _mm512_aesenclast_epi128 _mm256_aesdec_epi128 _mm512_aesdec_epi128 _mm256_aesdeclast_epi128 _mm512_aesdeclast_epi128 llvm-svn: 321474	2017-12-27 08:16:54 +00:00
Craig Topper	b846d1ff76	[X86] Add builtins and tests for 128 and 256 bit vpopcntdq. llvm-svn: 320915	2017-12-16 06:02:31 +00:00
Shoaib Meenai	669cae1f28	[clang] Use add_llvm_install_targets Use this function to create the install targets rather than doing so manually, which gains us the `-stripped` install targets to perform stripped installations. Differential Revision: https://reviews.llvm.org/D40675 llvm-svn: 319489	2017-11-30 22:35:02 +00:00
Oren Ben Simhon	fec21ec0c6	Control-Flow Enforcement Technology - Shadow Stack and Indirect Branch Tracking support (Clang side) Shadow stack solution introduces a new stack for return addresses only. The stack has a Shadow Stack Pointer (SSP) that points to the last address to which we expect to return. If we return to a different address an exception is triggered. This patch includes shadow stack intrinsics as well as the corresponding CET header. It includes CET clang flags for shadow stack and Indirect Branch Tracking. For more information, please see the following: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf Differential Revision: https://reviews.llvm.org/D40224 Change-Id: I79ad0925a028bbc94c8ecad75f6daa2f214171f1 llvm-svn: 318995	2017-11-26 12:34:54 +00:00
Craig Topper	89cd7533f7	[X86] Add CLWB intrinsic. clang part Reviewers: RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D38781 llvm-svn: 315607	2017-10-12 18:57:15 +00:00
Mandeep Singh Grang	79249e1be7	[clang] Add ARM64 support to armintr.h for MSVC compatibility Summary: This fixes compiling with headers from the Windows SDK for ARM64. Reviewers: compnerd, ruiu, mstorsjo Reviewed By: compnerd, mstorsjo Subscribers: mgorny, aemerson, javed.absar, kristof.beyls, llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D35862 llvm-svn: 309081	2017-07-26 05:29:40 +00:00
Oren Ben Simhon	140c1fb9ec	[X86] Adding avx512_vpopcntdq feature set and its intrinsics AVX512_VPOPCNTDQ is a new feature set that was published by Intel. The patch represents the Clang side of the addition of six intrinsics for two new machine instructions (vpopcntd and vpopcntq). It also includes the addition of the new feature set. Differential Revision: https://reviews.llvm.org/D33170 llvm-svn: 303857	2017-05-25 13:44:11 +00:00
Simon Pilgrim	3511348dbb	[X86][LWP] Add clang support for LWP instructions. This patch adds support for the the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Differential Revision: https://reviews.llvm.org/D32770 llvm-svn: 302418	2017-05-08 12:09:45 +00:00
Craig Topper	4574226c3f	[X86] Clzero flag addition and inclusion under znver1 1. Adds the command line flag for clzero. 2. Includes the clzero flag under znver1. 3. Defines the macro for clzero. 4. Adds a new file which has the intrinsic definition for clzero instruction. Patch by Ganesh Gopalasubramanian with some additional tests from me. Differential revision: https://reviews.llvm.org/D29386 llvm-svn: 294559	2017-02-09 06:10:14 +00:00
Justin Lebar	ebeeab87a1	[CUDA] Move device placement new definitions into a wrapper header. Previously, these were always included -- after this change, you have to #include <new>, which is consistent with how things ought to work. llvm-svn: 285251	2016-10-26 22:13:26 +00:00
Justin Lebar	49ec14692a	[CUDA] Re-land support for <complex> (r283683 and r283680). These were reverted in r283753 and r283747. The first patch added a header to the root 'Headers' install directory, instead of into 'Headers/cuda_wrappers'. This was fixed in the second patch, but by then the damage was done: The bad header stayed in the 'Headers' directory, continuing to break the build. We reverted both patches in an attempt to fix things, but that still didn't get rid of the header, so the Windows boostrap build remained broken. It's probably worth fixing up our cmake logic to remove things from the install dirs, but in the meantime, re-land these patches, since we believe they no longer have this bug. llvm-svn: 283907	2016-10-11 17:36:03 +00:00
Nico Weber	21b9c7a6dc	Revert r283683 because r283680 got reverted. llvm-svn: 283753	2016-10-10 14:20:35 +00:00
Nico Weber	67dd74ef89	Revert r283680. Breaks bootstrap builds on (at least) Windows: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\lib\Support\Allocator.cpp:14: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/Allocator.h:24: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/ADT/SmallVector.h:20: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/MathExtras.h:19: D:\buildslave\clang-x64-ninja-win7\stage1.install\bin\..\lib\clang\4.0.0\include\algorithm(63,8) : error: unknown type name '__device__' inline __device__ const __T & llvm-svn: 283747	2016-10-10 14:10:00 +00:00
Justin Lebar	3b593f56fc	[CUDA] Don't install cuda_wrappers/{algorithm,complex} into the main include dir. This is obviously wrong -- if we do this, then all compiles will pick up these wrappers, which is not what we want. llvm-svn: 283683	2016-10-09 00:27:39 +00:00
Justin Lebar	d3c5d2a4de	[CUDA] Support <complex> and std::min/max on the device. Summary: We do this by wrapping <complex> and <algorithm>. Tests are in the test-suite. Reviewers: tra Subscribers: jhen, beanz, cfe-commits, mgorny Differential Revision: https://reviews.llvm.org/D24979 llvm-svn: 283680	2016-10-08 22:16:12 +00:00
Justin Lebar	2dfbe9a3b4	[CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h. Summary: This matches the idiom we use for our other CUDA wrapper headers. Reviewers: tra Subscribers: beanz, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D24978 llvm-svn: 283679	2016-10-08 22:16:08 +00:00
Simon Dardis	3d9c763816	[mips] MSA intrinsics header file This patch adds the msa.h header file containing the shorter names for the MSA instrinsics, e.g. msa_sll_b for builtin_msa_sll_b. Reviewers: vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24674 llvm-svn: 281975	2016-09-20 15:07:36 +00:00
Saleem Abdulrasool	afdef205d8	Headers: Add ARM support to intrin.h for MSVC compatibility This fixes compiling with headers from the Windows SDK for ARM, where the YieldProcessor function (in winnt.h) refers to _ARM_BARRIER_ISHST. The actual MSVC armintr.h contains a lot more definitions, but this is enough to build code that uses the Windows SDK but doesn't use ARM intrinsics directly. An alternative would to just keep the addition to intrin.h (to include armintr.h), but not actually ship armintr.h, instead having clang's intrin.h include armintr.h from MSVC's include directory. (That one works fine with clang, at least for building code that uses the Windows SDK.) Patch by Martin Storsjö! llvm-svn: 277928	2016-08-06 17:58:24 +00:00
Michael Zuckerman	b920665493	[Clang][Feature] Adding CLFLUSHOPT feature and intrinsic to clang Differential Revision: http://reviews.llvm.org/D21792 llvm-svn: 274559	2016-07-05 15:56:03 +00:00
Hans Wennborg	f8b91f8336	s/Intrin.h/intrin.h/, trying to fix the build after r272701 llvm-svn: 272702	2016-06-14 20:14:24 +00:00
Yaxun Liu	e8f49b9db7	[OpenCL] Add the default header file opencl-c.h for OpenCL C language OpenCL has large number of "builtin" functions ("builtin" in the sense of OpenCL spec) which are defined in header files. To compile OpenCL kernels using these builtin functions, a header file is needed. This header file is based on the Khronos implementation (https://github.com/KhronosGroup/SPIR/blob/spirv-1.0/lib/Headers/opencl.h) with heavy refactoring. Re-commit after fixing failures on ppc64/systemz etc. Differential Revision: http://reviews.llvm.org/D18369 llvm-svn: 271197	2016-05-30 02:22:28 +00:00
Yaxun Liu	898eb39bfc	Revert r271136 [OpenCL] Add the default header file opencl-c.h for OpenCL C language due to build failure on ppc64/hexagon/systemz. llvm-svn: 271144	2016-05-28 19:50:40 +00:00
Yaxun Liu	e54d7c44d0	[OpenCL] Add the default header file opencl-c.h for OpenCL C language OpenCL has large number of "builtin" functions ("builtin" in the sense of OpenCL spec) which are defined in header files. To compile OpenCL kernels using these builtin functions, a header file is needed. This header file is based on the Khronos implementation (https://github.com/KhronosGroup/SPIR/blob/spirv-1.0/lib/Headers/opencl.h) with heavy refactoring. Differential Revision: http://reviews.llvm.org/D18369 llvm-svn: 271136	2016-05-28 19:09:01 +00:00
Richard Smith	b391930bbf	Re-alphabetize this file list. llvm-svn: 270170	2016-05-20 01:07:10 +00:00
Justin Lebar	2e4ecfdebe	[CUDA] Implement __ldg using intrinsics. Summary: Previously it was implemented as inline asm in the CUDA headers. This change allows us to use the [addr+imm] addressing mode when executing ld.global.nc instructions. This translates into a 1.3x speedup on some benchmarks that call this instruction from within an unrolled loop. Reviewers: tra, rsmith Subscribers: jhen, cfe-commits, jholewinski Differential Revision: http://reviews.llvm.org/D19990 llvm-svn: 270150	2016-05-19 22:49:13 +00:00
Ashutosh Nema	51c9dd0081	Add new intrinsic support for MONITORX and MWAITX instructions Summary: MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT pair while adding a timer function, such that another termination of the MWAITX instruction occurs when the timer expires. The presence of the MONITORX and MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29. The MONITORX and MWAITX instructions are intercepted by the same bits that intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be monitored. MWAITX instruction causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is "0F 01 FB". These opcode information is used in adding tests for the disassembler. These instructions are enabled for AMD's bdver4 architecture. Patch by Ganesh Gopalasubramanian! Reviewers: echristo, craig.topper Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits Differential Revision: http://reviews.llvm.org/D19796 llvm-svn: 269907	2016-05-18 11:56:23 +00:00
Michael Zuckerman	8c2900f44d	[Clang][BuiltIn][AVX512] Adding intrinsics without mask for VBROADCAST and VPBROADCAST instruction set . Differential Revision: http://reviews.llvm.org/D19196 llvm-svn: 267696	2016-04-27 11:43:14 +00:00
Michael Zuckerman	4fa96af4db	[Clang][AVX512][BuiltIn] Adding intrinsics of VGATHER{DPS\|DPD} , VPGATHER{QD\|QQ\|DD\|DQ} and VGATHERPF{0\|1}{DPS\|QPS\|DPD\|QPD} instruction set . Differential Revision: http://reviews.llvm.org/D19224 llvm-svn: 266983	2016-04-21 12:47:27 +00:00
Justin Lebar	0cda764430	[CUDA] Add math forward declares to CUDA header wrapper. Summary: This is necessary for a future patch which will make all constexpr functions implicitly host+device. cmath may declare constexpr functions, but these we do not want to be host+device. The forward declares added in this patch prevent this (because the rule will be, constexpr functions become implicitly host+device unless they're preceeded by a decl with __device__). Reviewers: tra Subscribers: cfe-commits, rnk, rsmith Differential Revision: http://reviews.llvm.org/D18539 llvm-svn: 264963	2016-03-30 23:30:14 +00:00
Michael Zuckerman	9f33848f04	[CLANG][AVX512][BUILTIN] Adding new feature flag headed files and new BUILTIN vpermi2varq{i\|t}{128\|256\|512}{mask\|maskz} Differential Revision: http://reviews.llvm.org/D17917 llvm-svn: 262834	2016-03-07 17:04:11 +00:00
Michael Zuckerman	0190c65571	[CLANG][AVX512][BUILTIN] Adding new feature flag header file and new builtin vpmadd52{h\|l}uq{128\|256\|512}{mask\|maskz} Differential Revision: http://reviews.llvm.org/D17915 llvm-svn: 262820	2016-03-07 09:55:55 +00:00
Chris Bieneman	2c6c01a4fc	[CMake] Fixing install-clang-headers dependencies to depend on generating the headers. llvm-svn: 261911	2016-02-25 18:39:19 +00:00
Artem Belevich	c5f41a34e5	[CUDA] Implemented device-side support functions in <cmath>. CUDA expects math functions in std:: namespace to work on device side. In order to make it work with clang without allowing device-side code generation for functions w/o appropriate target attributes, this patch provides device-side implementations for <cmath> functions. Most of them call global-scope math functions provided by CUDA headers. In few cases we use clang builtins. Tested out-of tree by compiling and running thrust's unit_tests. https://github.com/thrust/thrust/tree/master/testing Differential Revision: http://reviews.llvm.org/D16593 llvm-svn: 258880	2016-01-26 23:37:29 +00:00
Asaf Badouh	a9d1e18f48	[X86][PKU] add clang intrinsic for {RD\|WR}PKRU Differential Revision: http://reviews.llvm.org/D15837 llvm-svn: 256672	2015-12-31 14:14:07 +00:00
Artem Belevich	7fda3c9ff3	[CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h Currently it's easy to break CUDA compilation by passing "-isystem /path/to/cuda/include" to compiler which leads to compiler including real cuda_runtime.h from there instead of the wrapper we need. Renaming the wrapper ensures that we can include the wrapper regardless of user-specified include paths and files. Differential Revision: http://reviews.llvm.org/D15534 llvm-svn: 255802	2015-12-16 18:51:59 +00:00
Argyrios Kyrtzidis	dcb5653516	[CMake] Add a specific 'install-clang-headers' target. llvm-svn: 253636	2015-11-20 02:24:03 +00:00
Artem Belevich	c29db84419	[CUDA] Added a wrapper header for inclusion of stock CUDA headers. Header files that come with CUDA are assuming split host/device compilation and are not usable by clang out of the box. With a bit of preprocessor magic it's possible to twist them into something clang can use. This wrapper always includes CUDA headers exactly the same way during host and device compilation passes and produces identical preprocessed content during host and device side compilation for sm_35 GPUs. Device compilation passes for older GPUs will see a smaller subset of device functions supported by particular GPU. The wrapper assumes specific contents of CUDA header files and works only with CUDA 7.0 and 7.5. Differential Revision: http://reviews.llvm.org/D13171 llvm-svn: 253388	2015-11-17 22:28:52 +00:00
Amjad Aboud	2b9b8a5921	[X86] Add XSAVE intrinsic family Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13014 llvm-svn: 250158	2015-10-13 12:29:35 +00:00
Ulrich Weigand	ca25643a05	[SystemZ] Add support for vecintrin.h vector built-in functions This patch adds support for the System Z vector built-in functions. The API-defined header file has the name vecintrin.h. The user-level functions are defined in the same style as the clang version of altivec.h, making heavy use of the __overloadable__ and __always_inline__ attributes. Where possible the functions expand to generic operations rather than specific built-in functions, in the hope that that form can be optimised better. Where a built-in routine is specified to require an immediate integer argument, the __enable_if__ attribute is used to verify the argument is in fact constant and in the appropriate range. Based on a patch by Richard Sandiford. llvm-svn: 243643	2015-07-30 14:10:43 +00:00
Michael Kuperstein	a3c7b74208	[X86] Add FXSR intrinsics Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64) These were previously declared in Intrin.h for MSVC compatibility, but now that we have them implemented, these declarations can be removed. llvm-svn: 241053	2015-06-30 09:45:38 +00:00
Asaf Badouh	a45b7cab7b	[x86][AVX512CD] Add conflict and lzcnt intrinsics in their 512bit versions include tests review http://reviews.llvm.org/D10795 llvm-svn: 240941	2015-06-29 12:51:53 +00:00
Nico Weber	ac64b97771	Add new file from r240741 to CMakeLists.txt. llvm-svn: 240743	2015-06-26 00:19:32 +00:00
Eric Christopher	3d920eed5d	Move xtest to its own file to match the gcc header organization. llvm-svn: 239926	2015-06-17 18:42:07 +00:00
Elena Demikhovsky	e7d4c2e229	AVX-512: Added AVX-512 intrinsics and tests by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236218	2015-04-30 09:24:29 +00:00
Artem Belevich	4e192df778	[cuda] Added support for CUDA built-in variables. Added cuda_builtin_vars.h which implements built-in CUDA variables using __declattr(property). Fields of built-in variables (except for warpSize) are implemented using __declattr(property) which replaces read/write of a member field with a call to a getter/setter member function, in this case with appropriate NVPTX builtin. Added a test case to check diagnostics on attempt to construct or improperly access a built-in variable. Differential Revision: http://reviews.llvm.org/D9064 llvm-svn: 235448	2015-04-21 22:14:13 +00:00
Artem Belevich	a050112bba	Revert r235398 "[cuda] Added support for CUDA built-in variables." r235398 was causing buildbot break due to missing Makefile changes. llvm-svn: 235401	2015-04-21 18:36:42 +00:00

1 2 3

120 Commits