llvm-project

Commit Graph

Author	SHA1	Message	Date
Artem Belevich	9a75c06cd9	[CUDA] Work around compatibility issue with libstdc++ 11.1.0 libstdc++ redeclares __failed_assertion multiple times and that results in the function declared with conflicting set of attributes when we include <complex> with __host__ __device__ attributes force-applied to all functions. In order to work around the issue, we rename __failed_assertion within the region with forced attributes. See https://bugs.llvm.org/show_bug.cgi?id=50383 for the details. Differential Revision: https://reviews.llvm.org/D102936	2021-05-24 11:07:09 -07:00
Yaxun (Sam) Liu	91dfd68e90	[NFC][HIP] fix comments in __clang_hip_cmath.h	2021-05-21 17:44:18 -04:00
Nemanja Ivanovic	7cd2833311	[PowerPC] Add vec_vupkhpx and vec_vupklpx for XL compatibility These are old names for these functions that XL still supports.	2021-05-14 08:02:00 -05:00
Aaron En Ye Shi	a249ffa421	[HIP] Clean up llvm intrinsics using __asm Instead of using inline asm, use clang builtins for llvm intrinsics. Differential Revision: https://reviews.llvm.org/D102427	2021-05-13 18:55:51 +00:00
Nemanja Ivanovic	39e4676ca7	[PowerPC] Provide doubleword vector predicate form comparisons on Power7 There are two reasons this shouldn't be restricted to Power8 and up: 1. For XL compatibility 2. Because clang will expand comparison operators to these intrinsics* *Without this patch, the following causes a selection error: int test(vector signed long a, vector signed long b) { return a < b; } This patch provides the handling for the intrinsics in the back end and removes the Power8 guards from the predicate functions (vec_{all\|any}_{eq\|ne\|gt\|ge\|lt\|le}).	2021-05-13 04:56:56 -05:00
Anastasia Stulova	58d18dde5c	[OpenCL] Remove pragma requirement from Arm dot extension. This removed the pointless need for extension pragma since it doesn't disable anything properly and it doesn't need to enable anything that is not possible to disable. The change doesn't break existing kernels since it allows to compile more cases i.e. without pragma statements but the pragma continues to be accepted. Differential Revision: https://reviews.llvm.org/D100985	2021-05-12 16:25:33 +01:00
Thomas Lively	1e9c39a3f9	[WebAssembly] Use functions instead of macros for const SIMD intrinsics To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018	2021-05-07 11:50:19 -07:00
Thomas Lively	b198b9b897	[WebAssembly] Fix argument types in SIMD narrowing intrinsics The builtins were updated to take signed parameters in `627a526955`, but the intrinsics that use those builtins were not updated as well. The intrinsic test did not catch this sign mismatch because it is only reported as an error under -fno-lax-vector-conversions. This commit fixes the type mismatch and adds -fno-lax-vector-conversions to the test to catch similar problems in the future. Differential Revision: https://reviews.llvm.org/D101979	2021-05-06 10:07:45 -07:00
Nemanja Ivanovic	1faf3b195e	[PowerPC] Re-commit `ed87f512bb` This was reverted in `3761b9a234` just as I was about to commit the fix. This patch inlcudes the necessary fix.	2021-05-06 09:50:12 -05:00
Nico Weber	3761b9a234	Revert "[PowerPC] Provide some P8-specific altivec overloads for P7" This reverts commit `ed87f512bb`. Breaks check-clang, see e.g. https://lab.llvm.org/buildbot/#/builders/139/builds/3818	2021-05-06 10:01:16 -04:00
Nemanja Ivanovic	ed87f512bb	[PowerPC] Provide some P8-specific altivec overloads for P7 This adds additional support for XL compatibility. There are a number of functions in altivec.h that produce a single instruction (or a very short sequence) for Power8 but can be done on Power7 without scalarization. XL provides these implementations. This patch adds the following overloads for doubleword vectors: vec_add vec_cmpeq vec_cmpgt vec_cmpge vec_cmplt vec_cmple vec_sl vec_sr vec_sra	2021-05-06 08:37:36 -05:00
Johannes Doerfert	5d8d994dfb	[OpenMP] Make sure classes work on the device as they do on the host We do provide `operator delete(void*)` in `<new>` but it should be available by default. This is mostly boilerplate to test it and the unconditional include of `<new>` in the header we always in include on the device. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100620	2021-05-06 02:10:30 -05:00
Thomas Lively	81fce29d6e	[WebAssembly] Add SIMD const_splat intrinsics These intrinsics do not correspond to their own underlying instruction, but are a convenience for the common case of materializing a constant vector that has the same value in each lane. Differential Revision: https://reviews.llvm.org/D101885	2021-05-05 13:46:45 -07:00
Thomas Lively	602f318cfd	[WebAssembly] Fix constness of pointer params to load intrinsics Update the SIMD builtin load functions to take pointers to const data and update the intrinsics themselves to not cast away constness. Differential Revision: https://reviews.llvm.org/D101884	2021-05-05 13:16:56 -07:00
Nemanja Ivanovic	bfd60b36f8	[PowerPC] Add floating point overloads for vec_sldw These are added for compatibility with XLC.	2021-04-30 20:29:03 -05:00
Nemanja Ivanovic	c3da07d216	[PowerPC] Provide fastmath sqrt and div functions in altivec.h This adds the long overdue implementations of these functions that have been part of the ABI document and are now part of the "Power Vector Intrinsic Programming Reference" (PVIPR). The approach is to add new builtins and to emit code with the fast flag regardless of whether fastmath was specified on the command line. Differential revision: https://reviews.llvm.org/D101209	2021-04-30 19:17:48 -05:00
Anastasia Stulova	3ec82e5195	[OpenCL] Prevent adding vendor extensions for all targets Removed extension begin/end pragma as it has no effect and it is added unconditionally for all targets. Differential Revision: https://reviews.llvm.org/D92244	2021-04-30 14:42:51 +01:00
Wang, Pengfei	e0c7db7d8c	[MS] Preserve base register %rbx around cpuid This patch copies implementation from cpuid.h, which preserve base register %rbx around cpuid. It fixes PR50133. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101338	2021-04-30 10:16:25 +08:00
Luo, Yuanke	d6c6db2fea	[X86][AMX] Add description for AMX new interface. Differential Revision: https://reviews.llvm.org/D101059	2021-04-27 16:05:11 +08:00
Thomas Lively	502f54049d	[WebAssembly] Finalize wasm_simd128.h intrinsics Adds new intrinsics for instructions that are in the final SIMD spec but did not previously have intrinsics. Also updates the names of existing intrinsics to reflect the final names of the underlying instructions in the spec. Keeps the old names as deprecated functions to ease the transition to the new names. Differential Revision: https://reviews.llvm.org/D101112	2021-04-23 13:37:27 -07:00
Nemanja Ivanovic	19b29b1ed1	[PowerPC] Provide XL-compatible builtins in altivec.h There are some interfaces in altivec.h that are not compatible between Clang and XL (although Clang is compatible with GCC). Currently, we have found 3 but there may be others. Clang/GCC signatures: vector double vec_ctf(vector signed long long) vector double vec_ctf(vector unsigned long long) vector signed long long vec_cts(vector double) vector unsigned long long vec_ctu(vector double) XL signatures: vector float vec_ctf(vector signed long long) vector float vec_ctf(vector unsigned long long) vector signed int vec_cts(vector double) vector unsigned int vec_ctu(vector double) This patch provides the XL behaviour under the __XL_COMPAT_ALTIVEC__ macro for users that rely on XL behaviour. Differential revision: https://reviews.llvm.org/D101130	2021-04-23 15:13:46 -05:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Wang, Pengfei	e8bce83996	[X86] Enable compilation of user interrupt handlers. Add __uintr_frame structure and use UIRET instruction for functions with x86 interrupt calling convention when UINTR is present. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D99708	2021-04-23 11:43:57 +08:00
Yaxun (Sam) Liu	8baba6890d	[HIP] Support overloaded math functions for hipRTC Remove the dependence on standard C++ header for overloaded math functions in HIP header since standard C++ header is not available for hipRTC. Reviewed by: Artem Belevich, Justin Lebar Differential Revision: https://reviews.llvm.org/D100794	2021-04-22 19:06:51 -04:00
Nemanja Ivanovic	7a5641d651	[PowerPC] Add missing casts for vec_xlds and vec_load_splats The previous commits just missed some pointer casts and ended up producing warnings.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	1cc1d9db28	[PowerPC] Add vec_vclz as an alias for vec_cntlz in altivec.h Another addition for compatibility with XLC. The functions have the same overloads so just add it as a preprocessor define.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	e43963db24	[PowerPC] Add vec_load_splats to altivec.h Add these overloads for compatibility with XLC. This is a word load-and-splat.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	a0e6189712	[PowerPC] Add vec_xlds to altivec.h Add these overloads for compatibility with XLC. This is a doubleword load-and-splat.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	a1d325af67	[PowerPC] Add vec_roundz as alias for vec_trunc in altivec.h Add the overloads for compatibility with XLC.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	1550c47c18	[PowerPC] Add vec_roundp as alias for vec_ceil Add the overloads for compatibility with XLC.	2021-04-22 10:30:59 -05:00
Nemanja Ivanovic	51692c6c63	[PowerPC] Add missing VSX guard for vec_roundm with vector double The guard was missed in the previous commit.	2021-04-22 10:30:59 -05:00
Nemanja Ivanovic	3a46667059	[PowerPC] Add vec_roundm as alias for vec_floor in altivec.h Add the overloads for compatibility with XLC.	2021-04-22 10:30:59 -05:00
Nemanja Ivanovic	3bcd0ece43	[PowerPC] Add vec_roundc as alias for vec_rint in altivec.h For compatibility with XLC, add these overloads.	2021-04-22 05:31:38 -05:00
Liu, Chen3	72e4bf12ee	[X86] Support some missing intrinsics Support for _mm512_i32logather_pd, _mm512_mask_i32logather_pd, _mm512_i32logather_epi64, _mm512_mask_i32logather_epi64, _mm512_i32loscatter_pd, _mm512_mask_i32loscatter_pd, _mm512_i32loscatter_epi64, _mm512_mask_i32loscatter_epi64. Differential Revision: https://reviews.llvm.org/D100368	2021-04-21 10:50:37 +08:00
Yaxun (Sam) Liu	d8805574c1	[CUDA][HIP] Allow non-ODR use of host var in device Reviewed by: Artem Belevich, Richard Smith Differential Revision: https://reviews.llvm.org/D98193	2021-04-19 14:45:24 -04:00
Yaxun (Sam) Liu	6823af0ca8	[HIP] Support hipRTC in header hipRTC compiles HIP device code at run time. Since the system may not have development tools installed, when a HIP program is compiled through hipRTC, there is no standard C or C++ header available. As such, the HIP headers should not depend on standard C or C++ headers when used with hipRTC. Basically when hipRTC is used, HIP headers only provides definitions of HIP device API functions. This is in line with what nvRTC does. This patch adds support of hipRTC to HIP headers in clang. Basically hipRTC defines a macro __HIPCC_RTC__ when compile HIP code at run time. When this macro is defined, HIP headers do not include standard C/C++ headers. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D100652	2021-04-17 11:34:52 -04:00
Sven van Haastregt	35bc7569f8	[OpenCL] Add as_size/ptrdiff/intptr/uintptr_t operators size_t and friends are built-in scalar data types and s6.4.4.2 of the OpenCL C Specification says the as_type() operator must be available for these data types. Differential Revision: https://reviews.llvm.org/D98959	2021-04-07 10:16:41 +01:00
Yaxun (Sam) Liu	85ff35a952	[HIP] remove overloaded abs in header This function seems to be introduced by accident by `aa2b593f14` Such overloaded abs function did not exist before the refactoring, and does not exist in https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_cmath.h Conceptually it also does not make sense, since it adds something like double abs(int x) { return ::abs((double)x); } It caused regressions in CuPy. Reviewed by: Aaron Enye Shi, Artem Belevich Differential Revision: https://reviews.llvm.org/D99738	2021-04-01 12:23:29 -04:00
Sven van Haastregt	b5995fced4	[OpenCL] Limit popcount to OpenCL 1.2 and above s6.15.3 of the OpenCL C Specification v3.0.6 states that OpenCL 1.2 or newer is required.	2021-03-31 09:54:18 +01:00
Craig Topper	3fb40ce167	[X86] Don't define vpclmulqdq or vaes intrinsics in the headers unless avx512fintrin.h has been included. The intrinsics won't compile unless avx512fintrin.h has declared the 512 bit types.	2021-03-28 11:26:30 -07:00
Zakk Chen	821547cabb	[RISCV][Clang] Update new overloading rules for RVV intrinsics. RVV intrinsics has new overloading rule, please see `82aac7dad4` Changed: 1. Rename `generic` to `overloaded` because the new rule is not using C11 generic. 2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations support overloading api. 3. Add more overloaded tests due to overloading rule changed. Differential Revision: https://reviews.llvm.org/D99189	2021-03-28 09:04:35 -07:00
Nemanja Ivanovic	06411edb9f	[PowerPC][NFC] Provide legacy names for VSX loads and stores Before we unified the names of the builtins across all the compilers, there were a number of synonyms between them. There is code out there that uses XL naming for some of these loads and stores. This just adds those names.	2021-03-25 06:32:40 -05:00
Nemanja Ivanovic	4020932706	[PowerPC] Make altivec.h work with AIX which has no __int128 There are a number of functions in altivec.h that use vector __int128 which isn't supported on AIX. Those functions need to be guarded for targets that don't support the type. Furthermore, the functions that produce quadword instructions without using the type need a builtin. This patch adds the macro guards to altivec.h using the __SIZEOF_INT128__ which is only defined on targets that support the __int128 type.	2021-03-24 00:35:51 -05:00
Nemanja Ivanovic	4146864735	[PowerPC][NFC] Use valid type for offset in altivec.h We currently use signed long long instead of ptrdiff_t for offsets in altivec.h. This has never really presented a problem because all platforms where we use these are 64-bit. However, now that we have 32-bit targets, we need to use a meaningful type.	2021-03-23 08:45:37 -05:00
Nemanja Ivanovic	2f782a796a	[PowerPC] Add more missing overloads to altivec.h Add overloads that perform subtraction on v1i128 that take and produce vector unsigned char to avoid needing to use __int128. The overloads are suffixed with _u128 and are needed for targets where __int128 isn't supported (AIX).	2021-03-23 05:52:36 -05:00
Sven van Haastregt	1c6521a0dd	[OpenCL] Remove mixed signedness atomic_fetch_ from opencl-c.h The OpenCL C specification v3.0.6 s6.15.12.7.5 mentions: For atomic_fetch and modify functions with key = or, xor, and, min and max on atomic type atomic_intptr_t, M is intptr_t, and on atomic type atomic_uintptr_t, M is uintptr_t. Remove the atomic_fetch_* overloads from opencl-c.h that mix intptr_t and uintptr_t in the same declaration. Differential Revision: https://reviews.llvm.org/D98418	2021-03-23 10:20:13 +00:00
Nemanja Ivanovic	54e4654f04	[PowerPC] Add more missing overloads to altivec.h Add overloads that perform addition on v1i128 that take and produce vector unsigned char to avoid needing to use __int128. The overloads are suffixed with _u128 and are needed for targets where __int128 isn't supported (AIX).	2021-03-23 05:09:19 -05:00
Nemanja Ivanovic	10cc5bcd86	[PowerPC] Add more missing overloads to altivec.h Add vec_permi as a synonym for vec_xxpermdi (but only for doubleword vectors).	2021-03-22 23:09:41 -05:00
Nemanja Ivanovic	b5e96e0ad6	[PowerPC] Add more missing overloads to altivec.h Add vec_gbb as a synonym for vec_vgbbd but for doubleword vectors.	2021-03-22 22:25:28 -05:00
Nemanja Ivanovic	d8e574c8e6	[PowerPC] Add more missing overloads to altivec.h Add vec_cvf as a synonym for vec_doublee/vec_floate.	2021-03-22 22:08:43 -05:00
Nemanja Ivanovic	bef2cb9062	[PowerPC] Add more missing overloads to altivec.h Add vec_ctd which is similar to vec_ctf except the return type is vector double rather than vector float.	2021-03-22 20:23:07 -05:00
Thomas Lively	cbab2cd6bf	[WebAssembly] Remove experimental instructions from wasm_simd128.h These experimental builtin functions and the feature macro they were gated behind have been removed. Reviewed By: aheejin Differential Revision: https://reviews.llvm.org/D98907	2021-03-18 17:13:50 -07:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Bing1 Yu	320b72e9cd	[X86][AMX] Rename amx-bf16 intrinsic according to correct naming convention __tile_tdpbf16ps should be renamed with __tile_dpbf16ps Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D98685	2021-03-17 11:22:52 +08:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Nemanja Ivanovic	b5fae4b9b2	[PowerPC] Add more missing overloads to altivec.h We are missing more predicate forms for 'vector double' and some tests. This adds the missing overloads and completes the set of test cases for them.	2021-03-12 10:51:57 -06:00
Zakk Chen	d6a0560bf2	[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics. Demonstrate how to generate vadd/vfadd intrinsic functions 1. add -gen-riscv-vector-builtins for clang builtins. 2. add -gen-riscv-vector-builtin-codegen for clang codegen. 3. add -gen-riscv-vector-header for riscv_vector.h. It also generates ifdef directives with extension checking, base on D94403. 4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h. Generate overloading version Header for generic api. https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface 5. update tblgen doc for riscv related options. riscv_vector.td also defines some unused type transformers for vadd, because I think it could demonstrate how tranfer type work and we need them for the whole intrinsic functions implementation in the future. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D95016	2021-03-10 18:43:43 -08:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
Nemanja Ivanovic	f4ad7a1a15	[PowerPC] Add missing double precision vec_all overloads to altivec.h We somehow missed vec_all_nlt, vec_all_nle and vec_all_numeric overloads for double precision vectors when VSX is enabled.	2021-03-05 18:42:12 -06:00
Nemanja Ivanovic	1ff93618e5	[PowerPC] Add missing overloads of vec_promote to altivec.h The VSX-only overloads (for 8-byte element vectors) are missing. Add the missing overloads and convert element numbering to modulo arithmetic to match GCC and XLC.	2021-03-01 21:40:30 -06:00
Nemanja Ivanovic	38a34e207f	[PowerPC] Use modulo arithmetic for vec_extract in altivec.h These interfaces are not covered in the ELFv2 ABI but are rather implemented to emulate those available in GCC/XLC. However, the ones in the other compilers are documented to perform modulo arithmetic on the element number. This patch just brings clang inline with the other compilers at -O0 (with optimization, clang already does the right thing).	2021-03-01 19:49:26 -06:00
Artem Belevich	32e0645276	[CUDA] Remove `noreturn` attribute from __assertfail(). `noreturn` complicates control flow and tends to trigger a known bug in ptxas if the assert is used within loops in sufficiently complicated code. https://bugs.llvm.org/show_bug.cgi?id=27738 Differential Revision: https://reviews.llvm.org/D97708	2021-03-01 13:59:22 -08:00
Liu, Chen3	4bc7c8631a	[X86] Support amx-bf16 intrinsic. Adding support for intrinsics of AMX-BF16. This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong predicate. Differential Revision: https://reviews.llvm.org/D97358	2021-02-25 09:06:48 +08:00
Sven van Haastregt	612d0ef173	[OpenCL] Move remaining defines to opencl-c-base.h Move any remaining preprocessor defines from `opencl-c.h` to `opencl-c-base.h`, such that they are shared with `-fdeclare-opencl-builtins` too. In particular, move: - the `as_type` and `as_typen` definitions, and - the `kernel_exec` and `__kernel_exec` definitions. Also clang-format the changes. Differential Revision: https://reviews.llvm.org/D96948	2021-02-23 10:18:14 +00:00
Liu, Chen3	f8b9035aae	[X86] Support amx-int8 intrinsic. Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD. Differential Revision: https://reviews.llvm.org/D97259	2021-02-23 17:08:05 +08:00
Sven van Haastregt	5a4a01460f	[OpenCL] Move printf declaration to opencl-c-base.h Supporting `printf` with `-fdeclare-opencl-builtins` would require special handling (for e.g. varargs and format attributes) for just this one function. Instead, move the `printf` declaration to the shared base header. Differential Revision: https://reviews.llvm.org/D96789	2021-02-18 11:27:19 +00:00
Wang, Pengfei	61da20575d	[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) This is a follow up of D92940. We have successfully converted fadd/fmul _mm_reduce_* intrinsics to llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax too, i.e. llvm.reduction + nnan flag. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93179	2021-02-15 08:52:06 +08:00
Jonas Paulsson	b3ac5b84cd	[SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst. vec_xl() and vec_xst() should not emit alignment hints since they take a scalar pointer and also add a byte offset if passed. This patch uses memcpy to achieve the desired result. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96471	2021-02-12 18:26:36 -06:00
Wang, Pengfei	dd2460ed5d	[X86] Always assign reassoc flag for intrinsics reduce_add/mul_ps/pd. Intrinsics reduce_add/mul_ps/pd have assumption that the elements in the vector are reassociable. So we need to always assign the reassoc flag when we call _mm_reduce_* intrinsics. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D96231	2021-02-09 21:14:06 +08:00
Anton Zabaznov	d88c55ab95	[OpenCL] Add macro definitions of OpenCL C 3.0 features This patch adds possibility to define OpenCL C 3.0 feature macros via command line option or target setting. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D95776	2021-02-05 18:42:25 +03:00
Yaxun (Sam) Liu	0211877a07	[HIP] Add __managed__ macro to header	2021-02-04 16:22:42 -05:00
Wolfgang Pieb	231a82a150	[X86] Correct some cross references in avxintrin.h.	2021-01-25 18:49:28 -08:00
Wolfgang Pieb	350395d82f	[x86] Fix trivial typo in emmintrin.h	2021-01-25 17:28:05 -08:00
Michael Liao	7b5d7c7b0a	[hip] Fix `<complex>` compilation on Windows with VS2019. Differential Revision: https://reviews.llvm.org/D95075	2021-01-20 16:43:44 -05:00
Luo, Yuanke	7e1d2224b4	[X86][AMX] Fix the typo. The dpbsud should be dpbssd. Differential Revision: https://reviews.llvm.org/D94943	2021-01-19 16:57:34 +08:00
Aaron En Ye Shi	be40c12040	[HIP] Add signbit(long double) decl An _MSC_VER version of signbit(long double) is required for MSVC headers. Fixes: SWDEV-256409 Differential Revision: https://reviews.llvm.org/D93062	2021-01-14 18:23:37 +00:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Esme-Yi	ffa67873a3	[PowerPC] Add variants of 64-bit vector types for vec_sel. Summary: This patch added variants of vec_sel and fixed bugzilla 46770. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D94162	2021-01-11 03:52:16 +00:00
Michael Liao	f78d6af731	[hip] Enable HIP compilation with `<complex`> on MSVC. - MSVC has different `<complex>` implementation which calls into functions declared in `<ymath.h>`. Provide their device-side implementation to enable `<complex>` compilation on HIP Windows. Differential Revision: https://reviews.llvm.org/D93638	2021-01-07 17:41:28 -05:00
Luo, Yuanke	08665b1805	Support tilezero intrinsic and c interface for AMX. Differential Revision: https://reviews.llvm.org/D92837	2020-12-31 13:24:57 +08:00
Michael Liao	bb8d20d9f3	[cuda][hip] Fix typoes in header wrappers.	2020-12-21 13:02:47 -05:00
Simon Pilgrim	4855a1004d	[X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well. Differential Revision: https://reviews.llvm.org/D92940	2020-12-13 15:37:35 +00:00
Anastasia Stulova	a84599f177	[OpenCL] Implement extended subgroups fully in headers. Extended subgroups are library style extensions and therefore they require no changes in the frontend. This commit: 1. Moves extension macro definitions to the internal headers. 2. Removes extension pragmas because they are not needed. Tags: #clang Differential Revision: https://reviews.llvm.org/D92231	2020-12-10 16:40:15 +00:00
Luo, Yuanke	f80b29878b	[X86] AMX programming model. This patch implements amx programming model that discussed in llvm-dev (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html). Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet. This patch implemeted 7 components. 1. The c interface to end user. 2. The AMX intrinsics in LLVM IR. 3. Transform load/store <256 x i32> to AMX intrinsics or split the type into two <128 x i32>. 4. The Lowering from AMX intrinsics to AMX pseudo instruction. 5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx intruction. 6. The register allocation for tile register. 7. Morph AMX pseudo instruction to AMX real instruction. Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0 Differential Revision: https://reviews.llvm.org/D87981	2020-12-10 17:01:54 +08:00
Masoud Ataei	fc750f609d	[PPC] Fixing a typo in altivec.h. Commenting out an unnecessary macro	2020-12-08 19:21:02 +00:00
Artem Belevich	4326792942	[CUDA] Another attempt to fix early inclusion of <new> from libstdc++ Previous patch (`9a465057a6`) did not fix the problem. https://bugs.llvm.org/show_bug.cgi?id=48228 If the <new> is included too early, before CUDA-specific defines are available, just include-next the standard <new> and undo the include guard. CUDA-specific variants of operator new/delete will be declared if/when <new> is used from the CUDA source itself, when all CUDA-related macros are available. Differential Revision: https://reviews.llvm.org/D91807	2020-12-04 12:03:35 -08:00
Martin Storsjö	c17fdca188	[clang] [Headers] Use the corresponding _aligned_free or __mingw_aligned_free in _mm_free Differential Revision: https://reviews.llvm.org/D92570	2020-12-04 11:34:12 +02:00
Aaron En Ye Shi	ba2612ce01	[HIP] cmath demote long double args to double Since there is no ROCm Device Library support for long double, demote them to double, and use the fp64 math functions. Differential Revision: https://reviews.llvm.org/D92130	2020-12-03 23:00:14 +00:00
Reid Kleckner	1e843a987d	[MS] Add more 128bit cmpxchg intrinsics for AArch64 The MSVC STL for requires this on ARM64. Requested in https://llvm.org/pr47099 Depends on D92061 Differential Revision: https://reviews.llvm.org/D92062	2020-11-25 12:07:28 -08:00
Artem Belevich	9a465057a6	[CUDA] Unbreak CUDA compilation with -std=c++20 Standard libc++ headers in stdc++ mode include <new> which picks up cuda_wrappers/new before any of the CUDA macros have been defined. We can not include CUDA headers that early, so the work-around is to define __device__ in the wrapper header itself. Differential Revision: https://reviews.llvm.org/D91807	2020-11-19 10:35:47 -08:00
Sven van Haastregt	f0c690018a	[OpenCL] Stop opencl-c-base.h leaking extension enabling opencl-c.h disables all extensions at its end, but opencl-c-base.h does not, and that causes any inclusion of only opencl-c-base.h to leave some extensions (such as cl_khr_fp16) enabled. This affects the -fdeclare-opencl-builtins option for example. This violates the OpenCL Extension Specification which specifies that "The initial state of the compiler is as if the directive #pragma OPENCL EXTENSION all : disable was issued". Fix by disabling all extensions at the end of opencl-c-base.h and enable extensions inside opencl.h which relied on opencl-c-base.h enabling the cl_khr_fp16/64 extensions. Differential Revision: https://reviews.llvm.org/D91429	2020-11-17 12:07:40 +00:00
Roland McGrath	cf36142d34	[clang] Add missing header guard in <cpuid.h> This header has long lacked a standard multiple inclusion guard like other headers have, for no apparent reason. The GCC header of the same name likewise lacks one up through release 10.1, but trunk GCC (release 11, and perhaps future 10.x) has fixed it (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96238). Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D91226	2020-11-10 19:34:25 -08:00
Qiu Chaofan	979a4d268a	[PowerPC] [Clang] Port SSE4.1-compatible insert intrinsics This patch adds three intrinsics compatible to x86's SSE 4.1 on PowerPC target, with tests: - _mm_insert_epi8 - _mm_insert_epi32 - _mm_insert_epi64 The intrinsics implementation is contributed by Paul Clarke. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D89242	2020-11-10 10:52:13 +08:00
Freddy Ye	5e312e0041	[X86] use macros to split GFNI intrinsics into different kinds Tremont microarchitecture only has GFNI(SSE) version, not AVX and AVX512 version. This patch is to avoid compiling fail on Windows when using -march=tremont to invoke one of GFNI(SSE) intrinsic. Differential Revision: https://reviews.llvm.org/D90822	2020-11-06 16:03:38 +08:00
Albion Fung	1af037f643	[PowerPC] Correct cpsgn's behaviour on PowerPC to match that of the ABI This patch fixes the reversed behaviour exhibited by cpsgn on PPC. It now matches the ABI. Differential Revision: https://reviews.llvm.org/D84962	2020-11-05 15:35:14 -05:00
Aaron En Ye Shi	ca5b31502c	[HIP] Math Headers to use type promotion Similar to libcxx implementation of cmath function overloads, use type promotion templates to determine return types of multi-argument math functions. Fixes: SWDEV-256825 Reviewed By: tra, yaxunl Differential Revision: https://reviews.llvm.org/D90409	2020-11-03 18:40:26 +00:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Joachim Meyer	eaee608448	[OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in complex wrapper headers. This is very similar to `7f1e6fcff9`, just fixing a left-over. With this, it should be possible to use both, -x cuda and -fopenmp in the same invocation, enabling to use both OpenMP, targeting CPU, and CUDA, targeting the GPU. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90415	2020-10-29 23:24:49 +01:00
Johannes Doerfert	17c8251bca	[OpenMP][CUDA][FIX] Use the new `remquo` overload only for OpenMP CUDA buildbots complained about a redefinition when I landed D89971. This is odd and I fail to understand where in the CUDA headers the other definition is supposed to be. For now, given that CUDA doesn't need the overload (AFAIKT), we simply restrict it to the OpenMP mode.	2020-10-27 23:52:59 -05:00
Johannes Doerfert	b1a90e1599	[OpenMP][CUDA] Add missing overload for `remquo(float,float,int*)` Reported by Colleen Bertoni <bertoni@anl.gov> after running the OvO test suite: https://github.com/TApplencourt/OvO/ The template overload is still hidden behind an ifdef for OpenMP. In the future we probably want to remove the ifdef but that requires further testing. Reviewed By: JonChesterfield, tra Differential Revision: https://reviews.llvm.org/D89971	2020-10-27 19:12:51 -05:00

1862 Commits