llvm-project

Author	SHA1	Message	Date
Mirko Brkusanin	5ba931a84a	[Mips] Add intrinsics for 4-byte and 8-byte MSA loads/stores. New intrinsics are implemented for when we need to port SIMD code from other architectures and only load or store portions of MSA registers. The following intrinsics are added, which only load/store element 0 of a vector: v4i32 __builtin_msa_ldrq_w (const void *, imm_n2048_2044); v2i64 __builtin_msa_ldr_d (const void *, imm_n4096_4088); void __builtin_msa_strq_w (v4i32, void *, imm_n2048_2044); void __builtin_msa_str_d (v2i64, void *, imm_n4096_4088); Differential Revision: https://reviews.llvm.org/D73644	2020-02-11 11:47:30 +01:00
Alexey Bataev	fd3437a4f7	[OPENMP][NVPTX]Add NVPTX specific definitions for new/delete operators. Summary: To use new/delete in NVPTX code we need to define them. Implementation copied from CUDA wrappers. Reviewers: hfinkel, jdoerfert Subscribers: mgorny, guansong, kkwli0, caomhin, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73128	2020-02-05 09:57:53 -05:00
Alexey Sotkin	f780e15caf	[OpenCL] Fix support for cl_khr_mipmap_image_writes Text of the extension is available here: https://github.com/KhronosGroup/OpenCL-Docs/blob/master/ext/cl_khr_mipmap_image.asciidoc Patch by Ilya Mashkov Differential Revision: https://reviews.llvm.org/D71460	2020-02-05 14:55:32 +03:00
Artem Belevich	12fefeef20	[CUDA] Assume the latest known CUDA version if we've found an unknown one. This makes clang somewhat forward-compatible with new CUDA releases without having to patch it for every minor release without adding any new function. If an unknown version is found, clang issues a warning (can be disabled with -Wno-cuda-unknown-version) and assumes that it has detected the latest known version. CUDA releases are usually supersets of older ones feature-wise, so it should be sufficient to keep released clang versions working with minor CUDA updates without having to upgrade clang, too. Differential Revision: https://reviews.llvm.org/D73231	2020-01-28 10:11:42 -08:00
Artem Belevich	cc14de88da	[CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>). Wrong argument order resulted in broken shfl ops for 64-bit types.	2020-01-23 13:17:52 -08:00
Craig Topper	16b9410caa	[X86] Cast to __v4hi instead of __m64 in the implementation of _mm_extract_pi16 and _mm_insert_pi16. __m64 is a vector of 1 long long. But the builtins these intrinsics are calling expect a vector of 4 shorts. Fixes PR44589	2020-01-22 16:00:23 -06:00
Ulrich Weigand	cebba7ce39	[SystemZ] Avoid unnecessary conversions in vecintrin.h Use floating-point instead of integer zero constants to avoid creating implicit conversions, which currently cause suboptimal code to be generated with -ffp-exception-behavior=strict. NFC otherwise.	2020-01-16 18:58:14 +01:00
Richard Smith	388eaa1270	Work around PR43337: don't try to use the vec_sel overloads for vector long long, since clang's <altivec.h> doesn't provide it yet!	2020-01-15 13:14:57 -08:00
Warren Ristow	7fcd9e3f70	[X86] Mark various pointer arguments in builtins as const Enabling `-Wcast-qual` identified many casts in various system headers that were dropping the `const` qualifier. Fixing those missing qualifiers pointed out that a few of the definitions of the builtins did not properly identify their arguments as `const` pointers. This commit fixes those builtin definitions, and the system header files so that they no longer drop the qualifier. Differential Revision: https://reviews.llvm.org/D71718	2019-12-19 11:42:11 -08:00
Momchil Velikov	600d123c6f	[ARM][CMSE] Add CMSE header and builtins This is patch C2 as mentioned in RFC http://lists.llvm.org/pipermail/cfe-dev/2019-March/061834.html This adds CMSE builtin functions, and introduces arm_cmse.h header which has useful macros, functions, and data types for end-users of CMSE. Patch by Javed Absar. Differential Revision: https://reviews.llvm.org/D70817	2019-12-12 15:01:14 +00:00
Craig Topper	890c6ef1fb	[X86] Remove forward declaration of _invpcid from intrin.h. Rely on inline version from immintrin.h The forward declaration had a cdecl calling convention, but the inline version did not. This leads to a conflict if the default calling convention is not cdecl. Fix this by just removing the forward declaration. Fixes PR41503	2019-11-25 16:27:39 -08:00
Craig Topper	3cec2a17de	[X86] Fix the implementation of __readcr3/__writecr3 to work in 64-bit mode We need to use a 64-bit type in 64-bit mode so a 64-bit register will get used in the generated assembly. I've also changed the constraints to just use "r" instead of "q". "q" forces the use of only an a/b/c/d register in 32-bit mode, but I see no reason that would matter here. Fixes Nico's note in PR19301 over 4 years ago. Differential Revision: https://reviews.llvm.org/D70101	2019-11-14 13:21:36 -08:00
Nemanja Ivanovic	e0407f5496	[PowerPC][Altivec] Fix offsets for vec_xl and vec_xst As we currently have it implemented in altivec.h, the offsets for these two intrinsics are element offsets. The documentation in the ABI (as well as the implementation in both XL and GCC) states that these should be byte offsets. Differential revision: https://reviews.llvm.org/D63636	2019-11-07 20:58:11 -06:00
Nemanja Ivanovic	070e4027b0	[PowerPC][Altivec] Emit correct builtin for single precision vec_all_ne We currently emit a double precision comparison instruction for this, whereas we need to emit the single precision version. Differential revision: https://reviews.llvm.org/D64024	2019-11-07 20:40:32 -06:00
Eli Friedman	98286b569d	[Headers] Fix compatibility between arm_acle.h and intrin.h Make sure they don't both define __nop. Differential Revision: https://reviews.llvm.org/D69012	2019-10-29 14:52:56 -07:00
vhscampos	f6e11a36c4	[ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLE Summary: Writing support for three ACLE functions: unsigned int __cls(uint32_t x) unsigned int __clsl(unsigned long x) unsigned int __clsll(uint64_t x) CLS stands for "Count number of leading sign bits". In AArch64, these two intrinsics can be translated into the 'cls' instruction directly. In AArch32, on the other hand, this functionality is achieved by implementing it in terms of clz (count number of leading zeros). Reviewers: compnerd Reviewed By: compnerd Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69250	2019-10-28 11:06:58 +00:00
vhscampos	5d35b7d9e1	[ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64 Summary: Adding support for ACLE intrinsics. Patch by Michael Platings. Reviewers: chill, t.p.northover, efriedma Reviewed By: chill Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69297	2019-10-28 10:59:18 +00:00
Greg Bedwell	d4758d4a8d	Fix a spelling mistake in a couple of intrinsic description comments. NFC	2019-10-27 09:42:14 +00:00
Simon Tatham	08074cc965	[clang,ARM] Initial ACLE intrinsics for MVE. This commit sets up the infrastructure for auto-generating <arm_mve.h> and doing clang-side code generation for the builtins it relies on, and demonstrates that it works by implementing a representative sample of the ACLE intrinsics, more or less matching the ones introduced in LLVM IR by D67158,D68699,D68700. Like NEON, that header file will provide a set of vector types like uint16x8_t and C functions with names like vaddq_u32(). Unlike NEON, the ACLE spec for <arm_mve.h> includes a polymorphism system, so that you can write plain vaddq() and disambiguate by the vector types you pass to it. Unlike the corresponding NEON code, I've arranged to make every user-facing ACLE intrinsic into a clang builtin, and implement all the code generation inside clang. So <arm_mve.h> itself contains nothing but typedefs and function declarations, with the latter all using the new `__attribute__((__clang_builtin))` system to arrange that the user-facing function names correspond to the right internal BuiltinIDs. So the new MveEmitter tablegen system specifies the full sequence of IRBuilder operations that each user-facing ACLE intrinsic should translate into. Where possible, the ACLE intrinsics map to standard IR operations such as vector-typed `add` and `fadd`; where no standard representation exists, I call down to the sample IR intrinsics introduced in an earlier commit. Doing it like this means that you get the polymorphism for free just by using __attribute__((overloadable)): the clang overload resolution decides which function declaration is the relevant one, and _then_ its BuiltinID is looked up, so by the time we're doing code generation, that's all been resolved by the standard system. It also means that you get really nice error messages if the user passes the wrong combination of types: clang will show the declarations from the header file and explain why each one doesn't match. (The obvious alternative approach would be to have wrapper functions in <arm_mve.h> which pass their arguments to the underlying builtins. But that doesn't work in the case where one of the arguments has to be a constant integer: the wrapper function can't pass the constantness through. So you'd have to do that case using a macro instead, and then use C11 `_Generic` to handle the polymorphism. Then you have to add horrible workarounds because `_Generic` requires even the untaken branches to type-check successfully, and //then// if the user gets the types wrong, the error message is totally unreadable!) Reviewers: dmgreen, miyuki, ostannard Subscribers: mgorny, javed.absar, kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D67161	2019-10-24 16:33:13 +01:00
Craig Topper	282eff3847	[X86] Always define the tzcnt intrinsics even when _MSC_VER is defined. These intrinsics use llvm.cttz intrinsics so are always available even without the bmi feature. We already don't check for the bmi feature on the intrinsics themselves. But we were blocking the include of the header file with _MSC_VER unless BMI was enabled on the command line. Fixes PR30506. llvm-svn: 374516	2019-10-11 06:07:53 +00:00
Pengfei Wang	1f3a15c397	[x86] Adding support for some missing intrinsics: _castf32_u32, _castf64_u64, _castu32_f32, _castu64_f64 Summary: Adding support for some missing intrinsics: _castf32_u32, _castf64_u64, _castu32_f32, _castu64_f64 Reviewers: craig.topper, LuoYuanke, RKSimon, pengfei Reviewed By: RKSimon Subscribers: llvm-commits Patch by yubing (Bing Yu) Differential Revision: https://reviews.llvm.org/D67212 llvm-svn: 372802	2019-09-25 02:24:05 +00:00
Richard Smith	5b2ba5afa9	Fix reliance on -flax-vector-conversions in AVX intrinsics headers and corresponding tests. llvm-svn: 372063	2019-09-17 03:56:30 +00:00
Richard Smith	a50884abad	Remove reliance on lax vector conversions from altivec.h in VSX mode. llvm-svn: 372061	2019-09-17 03:56:26 +00:00
Richard Smith	aeb279dd88	Remove reliance on lax vector conversions from altivec.h and its test. llvm-svn: 371814	2019-09-13 05:19:12 +00:00
Jinsong Ji	5309189d9b	[PowerPC][Altivec] Fix constant argument for vec_dss Summary: This is similar to vec_ct* in https://reviews.llvm.org/rL304205. The argument must be a constant, otherwise instruction selection will fail. always_inline is not enough for isel to always fold everything away at -O0. The fix is to turn the function into macros in altivec.h. Fixes https://bugs.llvm.org/show_bug.cgi?id=43072 Reviewers: nemanjai, hfinkel, #powerpc, wuzish Reviewed By: #powerpc, wuzish Subscribers: wuzish, kbarton, MaskRay, shchenz, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D66699 llvm-svn: 370902	2019-09-04 14:01:47 +00:00
Artem Belevich	ce94ec661f	[CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+ vote.ballot instruction is gone in recent CUDA versions and vote.sync.ballot can not be used because it needs a thread mask parameter. Fortunately PTX 6.2 (introduced with CUDA-9.2) provides activemask.b32 instruction for this. Differential Revision: https://reviews.llvm.org/D66665 llvm-svn: 370792	2019-09-03 17:31:58 +00:00
Pengfei Wang	dea9cad10e	[x86] Fix bugs of some intrinsic functions in CLANG : _mm512_stream_ps, _mm512_stream_pd, _mm512_stream_si512 Reviewers: craig.topper, pengfei, LuoYuanke, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Patch by Bing Yu (yubing) Differential Revision: https://reviews.llvm.org/D66786 llvm-svn: 370691	2019-09-03 02:06:15 +00:00
Pengfei Wang	caac097fbf	[x86] Adding support for some missing intrinsics: _mm512_cvtsi512_si32 Summary: Adding support for some missing intrinsics: _mm512_cvtsi512_si32 Reviewers: craig.topper, pengfei, LuoYuanke, spatel, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Patch by Bing Yu (yubing) Differential Revision: https://reviews.llvm.org/D66785 llvm-svn: 370297	2019-08-29 06:18:34 +00:00
Yaxun Liu	af478e240b	[OpenCL] Fix declaration of enqueue_marker Differential Revision: https://reviews.llvm.org/D66512 llvm-svn: 369641	2019-08-22 11:18:59 +00:00
Anastasia Stulova	ef58804ebc	[OpenCL] Fix lang mode predefined macros for C++ mode. In C++ mode we should only avoid adding __OPENCL_C_VERSION__, all other predefined macros about the language mode are still valid. This change also fixes the language version check in the headers accordingly. Differential Revision: https://reviews.llvm.org/D65941 llvm-svn: 368552	2019-08-12 10:44:07 +00:00
Raphael Isemann	c4b5b66a05	[clang] Fixed x86 cpuid NSC signature Summary: The signature "Geode by NSC" for NSC vendor is wrong. In lib/Headers/cpuid.h, signature_NSC_edx and signature_NSC_ecx constants are inverted (cpuid signature order is ebx # edx # ecx). Reviewers: teemperor, rsmith, craig.topper Reviewed By: teemperor, craig.topper Subscribers: craig.topper, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D65978 llvm-svn: 368510	2019-08-10 10:14:01 +00:00
Qiu Chaofan	e9efaf3529	[PowerPC] [Clang] Port SSE3, SSSE3 and SSE4 intrinsics to PowerPC Port existing headers which include x86 intrinsics implementation to PowerPC platform (using Altivec), along with tests. Also, tests about including these intrinsic headers are combined. The headers are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D65630 llvm-svn: 368392	2019-08-09 03:39:55 +00:00
Momchil Velikov	a36d31478c	[AArch64] Add support for Transactional Memory Extension (TME) Re-commit r366322 after some fixes TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Differential Revision: https://reviews.llvm.org/D64416 Patch by Javed Absar and Momchil Velikov llvm-svn: 367428	2019-07-31 12:52:17 +00:00
Qiu Chaofan	852d444671	[PowerPC] [Clang] Add platform guards to PPC vector intrinsics headers Move the platform check out of PPC Linux toolchain code and add platform guards to the intrinsic headers, since they are supported currently only on 64-bit PowerPC targets. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D64849 llvm-svn: 367281	2019-07-30 02:18:11 +00:00
Paul Robinson	d2c0eefd5c	[X86] Remove const from some intrinsics that shouldn't have them llvm-svn: 366699	2019-07-22 16:14:09 +00:00
Sven van Haastregt	e9e59ad79f	[OpenCL] Define CLK_NULL_EVENT without cast Defining CLK_NULL_EVENT with a `(void*)` cast has the (unintended?) side-effect that the address space will be fixed (as generic in OpenCL 2.0 mode). The consequence is that any target specific address space for the clk_event_t type will not be applied. It is not clear why the void pointer cast was needed in the first place, and it seems we can do without it. Differential Revision: https://reviews.llvm.org/D63876 llvm-svn: 366546	2019-07-19 09:11:48 +00:00
Qiu Chaofan	03aaef8e72	[PowerPC][Clang] Remove use of malloc in mm_malloc Remove dependency of malloc in implementation of mm_malloc function in PowerPC intrinsics and alignment assumption on glibc. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D64850 llvm-svn: 366406	2019-07-18 06:20:12 +00:00
Momchil Velikov	0e2b74a2b0	Revert [AArch64] Add support for Transactional Memory Extension (TME) This reverts r366322 (git commit `4b8da3a503`) llvm-svn: 366355	2019-07-17 17:43:32 +00:00
Momchil Velikov	4b8da3a503	[AArch64] Add support for Transactional Memory Extension (TME) TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322	2019-07-17 13:23:27 +00:00
Kyrylo Tkachov	eb72138340	[AArch64] Implement __jcvt intrinsic from Armv8.3-A The jcvt intrinsic defined in ACLE [1] is available when ARM_FEATURE_JCVT is defined. This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function. The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used. I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example). make check-all didn't show any new failures. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics Differential Revision: https://reviews.llvm.org/D64495 llvm-svn: 366197	2019-07-16 09:27:39 +00:00
Ulrich Weigand	b98bf60ef7	[SystemZ] Add support for new cpu architecture - arch13 This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10303. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365933	2019-07-12 18:14:51 +00:00
Craig Topper	caf6b71ab2	[X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669	2019-07-10 17:11:29 +00:00
Sven van Haastregt	b502a44110	[OpenCL] Restore ATOMIC_VAR_INIT We accidentally lost the ATOMIC_VAR_INIT and ATOMIC_FLAG_INIT macros in r363794. Also put the `memory_order` typedef back inside a `>= CL2.0` guard. llvm-svn: 364174	2019-06-24 10:06:40 +00:00
Sven van Haastregt	853dfab799	[OpenCL] Remove more duplicates from opencl-c.h Identified the duplicate declarations using sort lib/Headers/opencl-c.h | uniq -c | grep ' 2' llvm-svn: 364173	2019-06-24 10:06:34 +00:00
Anastasia Stulova	999f676d75	[OpenCL][PR41963] Add generic addr space to old atomics in C++ mode Add overloads with generic address space pointer to old atomics. This is currently only added for C++ compilation mode. Differential Revision: https://reviews.llvm.org/D62335 llvm-svn: 364071	2019-06-21 16:19:16 +00:00
Sven van Haastregt	772a7a7680	[OpenCL] Remove duplicate read_image declarations Patch by Pierre Gondois. llvm-svn: 364020	2019-06-21 10:26:10 +00:00
Craig Topper	6d9fb68c53	[X86] Make _mm_mask_cvtps_ph, _mm_maskz_cvtps_ph, _mm256_mask_cvtps_ph, and _mm256_maskz_cvtps_ph aliases for their corresponding cvt_roundps_ph intrinsic. These intrinsics should always take an immediate for the rounding mode. The base instruction comes from before EVEX embedded rounding. The user should always provide the immediate rather than us assuming CUR_DIRECTION. Make the 512-bit versions also explicit aliases instead of copy pasting the code. llvm-svn: 363961	2019-06-20 18:24:29 +00:00
Xing Xue	ab4bcd844a	AIX system headers need stdint.h and inttypes.h to be re-enterable Summary: AIX system headers need stdint.h and inttypes.h to be re-enterable when macro _STD_TYPES_T is defined so that limit macro definitions such as UINT32_MAX can be found. This patch attempts to allow that on AIX. Reviewers: hubert.reinterpretcast, jasonliu, mclow.lists, EricWF Reviewed by: hubert.reinterpretcast, mclow.lists Subscribers: jfb, jsji, christof, cfe-commits, libcxx-commits, llvm-commits Tags: #LLVM, #clang, #libc++ Differential Revision: https://reviews.llvm.org/D59253 llvm-svn: 363939	2019-06-20 15:36:32 +00:00
Craig Topper	24151619a0	[X86] Correct the __min_vector_width__ attribute on a few intrinsics. llvm-svn: 363890	2019-06-19 23:27:04 +00:00
Sven van Haastregt	af1c230e70	[OpenCL] Split type and macro definitions into opencl-c-base.h Using the -fdeclare-opencl-builtins option will require a way to predefine types and macros such as `int4`, `CLK_GLOBAL_MEM_FENCE`, etc. Move these out of opencl-c.h into opencl-c-base.h such that the latter can be shared by -fdeclare-opencl-builtins and -finclude-default-header. This changes the behaviour of -finclude-default-header when -fdeclare-opencl-builtins is specified: instead of including the full header, it will include the header with only the base definitions. Differential revision: https://reviews.llvm.org/D63256 llvm-svn: 363794	2019-06-19 12:48:22 +00:00

1593 Commits