llvm-project

Author	SHA1	Message	Date
Amy Kwan	efa57f9a7a	[PowerPC] Implement Vector Expand Mask builtins in LLVM/Clang This patch implements the vec_expandm function prototypes in altivec.h in order to utilize the vector expand with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82727	2020-09-06 17:13:21 -05:00
Nemanja Ivanovic	2d652949be	[PowerPC] Provide vec_cmpne on pre-Power9 architectures in altivec.h These overloads are listed in appendix A of the ELFv2 ABI specification without a requirement for ISA 3.0. So these need to be available on all Altivec-capable architectures. The implementation in altivec.h erroneously had them guarded for Power9 due to the availability of the VCMPNE[BHW] instructions. However these need to be implemented in terms of the VCMPEQ[BHW] instructions on older architectures. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47423	2020-09-04 21:48:38 -04:00
Nemanja Ivanovic	54205f0bd2	[PowerPC] Allow const pointers for load builtins in altivec.h The load builtins in altivec.h do not have const in the signature for the pointer parameter. This prevents using them for loading from constant pointers. A notable case for such a use is Eigen. This patch simply adds the missing const. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47408	2020-09-04 13:56:39 -04:00
Albion Fung	5d1fe3f903	[PowerPC] Implemented Vector Multiply Builtins This patch implements the builtins for Vector Multiply Builtins (vmulxxd family of instructions), and adds the appropriate test cases for these builtins. The builtins utilize the vector multiply instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D83955	2020-09-02 14:16:21 -05:00
Richard Smith	0e00a95b4f	Add new warning for compound punctuation tokens that are split across macro expansions or split by whitespace. For example: #define FOO(x) (x) FOO({}); ... forms a statement-expression after macro expansion. This warning applies to '({' and '})' delimiting statement-expressions, '[[' and ']]' delimiting attributes, and '::*' introducing a pointer-to-member. The warning for forming these compound tokens across macro expansions (or across files!) is enabled by default; the warning for whitespace within the tokens is not, but is included in -Wall. Differential Revision: https://reviews.llvm.org/D86751	2020-08-28 13:35:50 -07:00
Albion Fung	331dcc43ea	[PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins This patch implements the builtins for Vector Load with Zero and Signed Extend Builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D82502#inline-797941	2020-08-28 11:28:58 -05:00
Amy Kwan	76b0f99ea8	[PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang This patch implements the function prototypes vec_mulh and vec_dive in order to utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended (vdive[s|u][w|d]) instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82609	2020-08-26 23:14:34 -05:00
Simon Pilgrim	a1dc3d241b	[X86] Enable constexpr on ROTL/ROTR intrinsics (PR31446) This enables constexpr rotate intrinsics defined in ia32intrin.h, including the MS specific builtins.	2020-08-23 16:11:58 +01:00
Simon Pilgrim	f8e0e5db48	[X86] Enable constexpr on _cast fp<-> uint intrinsics (PR31446) As suggested by @rsmith on PR47267, by replacing the __builtin_memcpy bitcast pattern with __builtin_bit_cast we can use _castf32_u32, _castu32_f32, _castf64_u64 and _castu64_f64 inside constant expressions (constexpr). Although __builtin_bit_cast was added for c++20 it works on all clang c/c++ modes. Differential Revision: https://reviews.llvm.org/D86398	2020-08-23 10:27:46 +01:00
Simon Pilgrim	42b993d97d	[X86] ia32intrin.h - pull out common attributes used in cast helpers into define. NFCI.	2020-08-22 15:25:15 +01:00
Simon Pilgrim	9ffc412e1a	[X86] Enable constexpr on BITSCAN intrinsics (PR31446) This enables constexpr BSF/BSR intrinsics defined in ia32intrin.h	2020-08-21 11:44:20 +01:00
Simon Pilgrim	c8e6bf0a65	[X86] Enable constexpr on BSWAP intrinsics (PR31446) This enables constexpr BSWAP intrinsics defined in ia32intrin.h	2020-08-21 10:55:15 +01:00
Simon Pilgrim	c6863a4ab8	[X86] Enable constexpr on POPCNT intrinsics (PR31446) Followup to D86229, this enables constexpr on the alternative POPCNT intrinsics (which fall back to generic code) defined in ia32intrin.h	2020-08-21 10:20:37 +01:00
Simon Pilgrim	33bb80bc7a	[X86] ia32intrin.h - pull out common attributes into defines. NFCI. Matches what we do in most other x86 headers	2020-08-21 10:03:28 +01:00
Simon Pilgrim	cff0db0876	[X86] Enable constexpr on POPCNT intrinsics (PR31446) This is a first step patch to enable constexpr support and testing for a large number of x86 intrinsics. All I've done here is provide a DEFAULT_FN_ATTRS_CONSTEXPR variant to our existing DEFAULT_FN_ATTRS tag approach that adds constexpr on c++ builds. The clang cuda headers do something similar. I've started with POPCNT mainly as it's tiny and its intrinsics are wrappers to generic __builtin_* intrinsics which already act as constexpr. Differential Revision: https://reviews.llvm.org/D86229	2020-08-20 21:38:04 +01:00
Amy Kwan	c7ec3a7e33	[PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang This patch implements the vec_extractm function prototypes in altivec.h in order to utilize the vector extract with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82675	2020-08-17 21:14:17 -05:00
Albion Fung	3136cbe29e	[PowerPC] Implement Vector Shift Builtins This patch implements the builtins for the vector shifts (shl, srl, sra), and adds the appropriate test cases for these builtins. The builtins utilize the vector shift instructions introduced within ISA 3.1. Differential Revision: https://reviews.llvm.org/D83338	2020-08-12 18:26:58 -05:00
Yaxun (Sam) Liu	ac3e720dc1	Make clang HIP headers compatible with C++98 Automation to detect compiler features, such as CMake's target_compile_features, would attempt to detect compiler features by explicitly using language flags. This change ensures that the HIP headers would still work with C++98. Patch by Siu Chi Chan Differential Revision: https://reviews.llvm.org/D85471 Change-Id: I304e964b18a525b0fde55efd841da74b6c4dc8ed	2020-08-07 13:50:22 -04:00
biplmish	cce1b0e891	[PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D84622	2020-08-07 01:02:29 -05:00
Thomas Lively	f496950001	[WebAssembly] Fix types in wasm_simd128.h and add tests `47f7174ffa` changed the types used in the Wasm SIMD builtin functions, but not all of their uses in wasm_simd128.h were updated. This commit fixes wasm_simd128.h and adds tests to make sure similar problems do not pass uncaught in the future. Differential Revision: https://reviews.llvm.org/D85347	2020-08-05 14:00:01 -07:00
Artem Belevich	7d057efddc	[CUDA] Work around a bug in rint/nearbyint caused by a broken implementation provided by CUDA. Normally math functions are forwarded to __nv_* counterparts provided by CUDA's libdevice bitcode. However, __nv_rint()/__nv_nearbyint() functions there have a bug -- they use round() which rounds up instead of rounding towards the nearest integer, so we end up with rint(2.5f) producing 3.0 instead of expected 2.0. The broken bitcode is not actually used by NVCC itself, which has both a work-around in CUDA headers and, in recent versions, uses correct implementations in NVCC's built-ins. This patch implements equivalent workaround and directs rint/nearbyint to __builtin_* variants that produce correct results. Differential Revision: https://reviews.llvm.org/D85236	2020-08-05 13:13:48 -07:00
Dan Gohman	47f7174ffa	[WebAssembly] Use "signed char" instead of "char" in SIMD intrinsics. This allows people to use `int8_t` instead of `char`, -funsigned-char, and generally decouples SIMD from the specialness of `char`. And it makes intrinsics like `__builtin_wasm_add_saturate_s_i8x16` and `__builtin_wasm_add_saturate_u_i8x16` use signed and unsigned element types, respectively. Differential Revision: https://reviews.llvm.org/D85074	2020-08-04 12:48:40 -07:00
Amy Kwan	c4e5743232	[PowerPC] Implement low-order Vector Modulus Builtins, and add Vector Multiply/Divide/Modulus Builtins Tests Power10 introduces new instructions for vector multiply, divide and modulus. These instructions can be exploited by the builtin functions: vec_mul, vec_div, and vec_mod, respectively. This patch adds the function prototype vec_mod, as vec_mul and vec_div have been previously implemented in altivec.h. This patch also adds the following front end tests: vec_mul for v2i64; vec_div for v4i32 and v2i64; vec_mod for v4i32 and v2i64. Differential Revision: https://reviews.llvm.org/D82576	2020-07-31 10:58:07 -05:00
Thomas Lively	11bb7eef41	[WebAssembly] Remove intrinsics for SIMD widening ops Instead, pattern match extends of extract_subvectors to generate widening operations. Since extract_subvector is not a legal node, this is implemented via a custom combine that recognizes extract_subvector nodes before they are legalized. The combine produces custom ISD nodes that are later pattern matched directly, just like the intrinsic was. Also removes the clang builtins for these operations since the instructions can now be generated from portable code sequences. Differential Revision: https://reviews.llvm.org/D84556	2020-07-28 18:25:55 -07:00
Amy Kwan	74790a5dde	[PowerPC] Implement Truncate and Store VSX Vector Builtins This patch implements the `vec_xst_trunc` function in altivec.h in order to utilize the Store VSX Vector Rightmost [byte | half | word | doubleword] Indexed instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82467	2020-07-24 19:22:39 -05:00
Jon Chesterfield	679158e662	Make hip math headers easier to use from C Summary: Make hip math headers easier to use from C Motivation is a step towards using the hip math headers to implement math.h for openmp, which needs to work with C as well as C++. NFC for C++ code. Reviewers: yaxunl, jdoerfert Reviewed By: yaxunl Subscribers: sstefan1, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D84476	2020-07-24 20:50:46 +01:00
Yaxun (Sam) Liu	ce04d4e39c	Fix pow and ldexp in HIP header	2020-07-21 17:39:46 -04:00
Amy Kwan	62f5ba624b	[PowerPC][Power10] Implement Test LSB by Byte Builtins in LLVM/Clang This patch implements builtins for the Test LSB by Byte instruction introduced in Power10. Differential Revision: https://reviews.llvm.org/D82431	2020-07-13 22:47:47 -05:00
Johannes Doerfert	b5667d00e0	[OpenMP][CUDA] Fix std::complex in GPU regions The old way worked to some degree for C++-mode but in C mode we actually tried to introduce variants of macros (e.g., isinf). To make both modes work reliably we get rid of those extra variants and directly use NVIDIA intrinsics in the complex implementation. While this has to be revisited as we add other GPU targets which want to reuse the code, it should be fine for now. Reviewed By: tra, JonChesterfield, yaxunl Differential Revision: https://reviews.llvm.org/D83591	2020-07-11 00:40:05 -05:00
Johannes Doerfert	7f1e6fcff9	[OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in wrapper headers Due to recent changes we cannot use OpenMP in CUDA files anymore (PR45533) as the math handling of CUDA is different when _OPENMP is defined. We actually want this different behavior only if we are offloading with OpenMP to NVIDIA, thus generating NVPTX. With this patch we do not interfere with the CUDA math handling except if we are in NVPTX offloading mode, as indicated by the presence of __OPENMP_NVPTX__. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D78155	2020-07-10 18:53:34 -05:00
Johannes Doerfert	e3e47e8035	[OpenMP] Make complex soft-float functions on the GPU weak definitions To avoid linkage errors we have to ensure the linkage allows multiple definitions of these compiler inserted functions. Since they are on the cold path of complex computations, we want to avoid `inline`. Instead, we opt for `weak` and `noinline` for now.	2020-07-09 01:06:55 -05:00
Johannes Doerfert	d999cbc988	[OpenMP] Initial support for std::complex in target regions This simply follows the scheme we have for other wrappers. It resolves the current link problem, e.g., `__muldc3 not found`, when std::complex operations are used on a device. This will not allow complex math function calls to work properly, e.g., sin, but that is more complex (pun intended) anyway. Reviewed By: tra, JonChesterfield Differential Revision: https://reviews.llvm.org/D80897	2020-07-08 17:33:59 -05:00
Craig Topper	82206e7fb4	[X86] Enabled a bunch of 64-bit Interlocked* function intrinsics on 32-bit Windows to match recent MSVC This enables _InterlockedAnd64/_InterlockedOr64/_InterlockedXor64/_InterlockedDecrement64/_InterlockedIncrement64/_InterlockedExchange64/_InterlockedExchangeAdd64/_InterlockedExchangeSub64 on 32-bit Windows. The backend already knows how to expand these to a loop using cmpxchg8b on 32-bit targets. Fixes PR46595 Differential Revision: https://reviews.llvm.org/D83254	2020-07-08 10:39:56 -07:00
Xiang1 Zhang	939d8309db	[X86-64] Support Intel AMX Intrinsic INTEL ADVANCED MATRIX EXTENSIONS (AMX). AMX is a new programming paradigm: it has a set of 2-dimensional registers (TILES) representing sub-arrays from a larger 2-dimensional memory image, and it operates on TILES. These intrinsics take TMM register numbers directly as their parameters. Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D83111	2020-07-07 10:13:40 +08:00
Biplob Mishra	0c6b6e28e7	[PowerPC] Implement Vector Splat Immediate Builtins in Clang Implements builtins for the following prototypes: vector signed int vec_splati (const signed int); vector float vec_splati (const float); vector double vec_splatid (const float); vector signed int vec_splati_ins (vector signed int, const unsigned int, const signed int); vector unsigned int vec_splati_ins (vector unsigned int, const unsigned int, const unsigned int); vector float vec_splati_ins (vector float, const unsigned int, const float); Differential Revision: https://reviews.llvm.org/D82520	2020-07-06 20:29:33 -05:00
Lei Huang	e359ab1eca	[PowerPC][NFC] Fix indentation	2020-07-03 16:47:24 -05:00
Biplob Mishra	0939e04e41	[PowerPC] Implement Vector Insert Builtins in LLVM/Clang Implements vec_insertl() and vec_inserth(). Differential Revision: https://reviews.llvm.org/D82365	2020-07-03 15:30:41 -05:00
Biplob Mishra	ca464639a1	[PowerPC] Implement Vector Blend Builtins in LLVM/Clang Implements vec_blendv() Differential Revision: https://reviews.llvm.org/D82774	2020-07-02 16:52:52 -05:00
Biplob Mishra	286073484f	[PowerPC]Implement Vector Permute Extended Builtin Implements vector permute builtin: vec_permx() Differential Revision: https://reviews.llvm.org/D82869	2020-07-02 14:53:18 -05:00
Biplob Mishra	88874f0746	[PowerPC]Implement Vector Shift Double Bit Immediate Builtins Implement Vector Shift Double Bit Immediate Builtins in LLVM/Clang. * vec_sldb (); * vec_srdb (); Differential Revision: https://reviews.llvm.org/D82440	2020-07-01 20:34:53 -05:00
Amy Kwan	e0c02dc980	[PowerPC][Power10] Implement centrifuge, vector gather every nth bit, vector evaluate Builtins in LLVM/Clang This patch implements builtins for the following prototypes: unsigned long long __builtin_cfuged (unsigned long long, unsigned long long); vector unsigned long long vec_cfuge (vector unsigned long long, vector unsigned long long); unsigned long long vec_gnb (vector unsigned __int128, const unsigned int); vector unsigned char vec_ternarylogic (vector unsigned char, vector unsigned char, vector unsigned char, const unsigned int); vector unsigned short vec_ternarylogic (vector unsigned short, vector unsigned short, vector unsigned short, const unsigned int); vector unsigned int vec_ternarylogic (vector unsigned int, vector unsigned int, vector unsigned int, const unsigned int); vector unsigned long long vec_ternarylogic (vector unsigned long long, vector unsigned long long, vector unsigned long long, const unsigned int); vector unsigned __int128 vec_ternarylogic (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128, const unsigned int); Differential Revision: https://reviews.llvm.org/D80970	2020-06-25 21:34:41 -05:00
Amy Kwan	d82f26cc4b	[PowerPC][Power10] Implement Count Leading/Trailing Zeroes Builtins under bit Mask in LLVM/Clang This patch implements builtins for the following prototypes: unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long) unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long) vector unsigned long long vec_cntlzm (vector unsigned long long, vector unsigned long long) vector unsigned long long vec_cnttzm (vector unsigned long long, vector unsigned long long) Differential Revision: https://reviews.llvm.org/D80941	2020-06-24 16:03:45 -05:00
Amy Kwan	19df9e2959	[PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang This patch implements builtins for the following prototypes for the VSX Permute Control Vector Generate with Mask Instructions: vector unsigned char vec_genpcvm (vector unsigned char, const int); vector unsigned short vec_genpcvm (vector unsigned short, const int); vector unsigned int vec_genpcvm (vector unsigned int, const int); vector unsigned long long vec_genpcvm (vector unsigned long long, const int); Differential Revision: https://reviews.llvm.org/D81774	2020-06-22 21:09:34 -05:00
Amy Kwan	cc95635b1b	[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang This patch implements builtins for the following prototypes: ``` vector signed char vec_clrl (vector signed char a, unsigned int n); vector unsigned char vec_clrl (vector unsigned char a, unsigned int n); vector signed char vec_clrr (vector signed char a, unsigned int n); vector signed char vec_clrr (vector unsigned char a, unsigned int n); ``` Differential Revision: https://reviews.llvm.org/D81707	2020-06-20 18:29:16 -05:00
Amy Kwan	c45c161130	[PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang This patch implements builtins for the following prototypes: vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long); vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b); unsigned long long __builtin_pdepd (unsigned long long, unsigned long long); unsigned long long __builtin_pextd (unsigned long long, unsigned long long); Revision Depends on D80758 Differential Revision: https://reviews.llvm.org/D80935	2020-06-18 16:23:56 -05:00
Thomas Lively	d2c394e74f	[WebAssembly] Add intrinsic for i64x2.mul Summary: This instruction was implemented in `3181273be7`, but that commit did not add an intrinsic for it. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81757	2020-06-12 14:08:18 -07:00
Yaxun (Sam) Liu	af00eb25f8	Fix __clang_cuda_math_forward_declares.h Recent change from `#if !defined(__CUDA__)` to `#if !__CUDA__` caused regression on ROCm 3.5 since there is `#define __CUDA__` before inclusion of the header file, which causes `#if !__CUDA__` to be invalid. Change `#if !__CUDA__` back to `#if !defined(__CUDA__)` for backward compatibility.	2020-06-10 23:47:13 -04:00
Yaxun (Sam) Liu	4615abc11f	Rename arg name in __clang_hip_math.h __sptr is a keyword for ms-extension. Change it from __sptr to __sinptr.	2020-06-08 14:13:32 -04:00
Yaxun (Sam) Liu	8422bc9efc	recommit "[HIP] Add default header and include path" recommit `11d06b9511` with fix for lit tests.	2020-06-06 14:21:22 -04:00
Nico Weber	2920348063	Revert "recommit "[HIP] Add default header and include path"" This reverts commit `1fa43e0b34`. Still breaks tests on several bots, see https://reviews.llvm.org/D81176	2020-06-05 21:50:04 -04:00
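
The rounding difference described in commit 7d057efddc (rint(2.5f) expected to give 2.0, not 3.0) can be reproduced on the host with the standard C++ math library. The following is only a minimal sketch: the helper names round_ties_even and round_half_away are hypothetical, it assumes the default floating-point rounding mode (round to nearest, ties to even), and it does not use CUDA or libdevice at all.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: rounds to the nearest integer, with ties going to the
// even neighbour. This relies on the default rounding mode being in effect.
inline float round_ties_even(float x) { return std::rint(x); }

// Hypothetical helper: rounds halfway cases away from zero, which is what
// round() does and what the commit message reports for the broken path.
inline float round_half_away(float x) { return std::round(x); }

// Small self-check showing where the two functions disagree.
inline void check_rounding() {
    assert(round_ties_even(2.5f) == 2.0f);  // tie goes to the even value
    assert(round_half_away(2.5f) == 3.0f);  // tie goes away from zero
    assert(round_ties_even(3.5f) == 4.0f);  // both agree when "up" is even
}
```

The two functions differ only on exact halfway inputs whose nearest even neighbour lies toward zero, which is why a test value such as 2.5f exposes the problem while 3.5f does not.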

1683 Commits
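
Commit 82206e7fb4 notes that the backend expands the 64-bit Interlocked* operations to a loop using cmpxchg8b on 32-bit targets. The general shape of such a compare-and-swap loop can be sketched in portable C++ with std::atomic. This is an illustration under assumptions, not the code the backend emits: fetch_and_64 is a hypothetical name, and whether the compiler lowers it to cmpxchg8b depends on the target.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper illustrating an atomic 64-bit AND built from a
// compare-and-swap retry loop. Returns the value seen before the update.
inline std::int64_t fetch_and_64(std::atomic<std::int64_t>& target,
                                 std::int64_t mask) {
    std::int64_t observed = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `observed` with the current
    // value, so each retry recomputes `observed & mask` from fresh data.
    while (!target.compare_exchange_weak(observed, observed & mask)) {
    }
    return observed;
}

// Small self-check of the helper on a single thread.
inline void check_fetch_and_64() {
    std::atomic<std::int64_t> value{0xFF};
    assert(fetch_and_64(value, 0x0F) == 0xFF);
    assert(value.load() == 0x0F);
}
```

The same retry pattern covers the other read-modify-write operations in the list (or, xor, add, increment) by changing only the expression that computes the new value.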