Commit Graph

2010 Commits

Author SHA1 Message Date
Xiang1 Zhang caf5ad5da7 Revert "[x86] Support 3 builtin functions for 32-bits mode"
This reverts commit a69c219a8c.
2022-04-22 09:11:40 +08:00
Xiang1 Zhang a69c219a8c [x86] Support 3 builtin functions for 32-bits mode
_mm_cvtsi128_si64, _mm_cvtsi64_si128, _mm_extract_epi64
2022-04-22 09:06:25 +08:00
Evgeny Mankov c23147106f [clang][CUDA][Windows] Fix compilation error on Windows with `uint32_t __nvvm_get_smem_pointer`
The change fixes https://github.com/llvm/llvm-project/issues/54609 (the second reported issue) by eliminating a compilation error occurring only on Windows while trying to compile any CUDA source file by clang (-x cuda).

[Repro]
clang -x cuda <any_cu_source>

[Error]

__clang_cuda_runtime_wrapper.h:473:
__clang_cuda_intrinsics.h(517,19): error GC871EEFB: unknown type name 'uint32_t'; did you mean 'cuuint32_t'?
__device__ inline uint32_t __nvvm_get_smem_pointer(void *__ptr) {
                          ^
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/include\cuda.h:57:26: note: 'cuuint32_t' declared here
typedef unsigned __int32 cuuint32_t;

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D122897
2022-04-21 00:41:20 +03:00
Pengxuan Zheng 38612fbc89 Reland "[COFF, ARM64] Add __break intrinsic"
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

Reland after fixing the test failure. The failure was due to conflict with a
change (D122983) which was merged right before this patch.

Reviewed By: rnk, mstorsjo

Differential Revision: https://reviews.llvm.org/D124032
2022-04-20 13:01:30 -07:00
Pengxuan Zheng bff8356b19 Revert "[COFF, ARM64] Add __break intrinsic"
This reverts commit 8a9b4fb4aa.
2022-04-20 11:57:49 -07:00
Pengxuan Zheng 8a9b4fb4aa [COFF, ARM64] Add __break intrinsic
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

Reviewed By: rnk, mstorsjo

Differential Revision: https://reviews.llvm.org/D124032
2022-04-20 11:20:26 -07:00
Sven van Haastregt e67b1b0ccf [OpenCL] Add missing __opencl_c_atomic_scope_device guards
Update opencl-c.h after the specification clarification in
https://github.com/KhronosGroup/OpenCL-Docs/pull/775
2022-04-20 11:02:50 +01:00
Qiongsi Wu 2512a875cc [clang] Adding Platform/Architecture Specific Resource Header Installation Targets
The goal of this patch is to improve distribution build's flexibility to include only applicable header files.

Currently, the clang-resource-headers target contains nearly all the files in clang/lib/Headers. Most of these files are platform specific (e.g. immintrin.h is x86 specific). A distribution build will have to either include all the headers for all the platforms, or not include any headers. For example, if a distribution build for powerpc includes the clang-resource-headers target, it will include all the x86 specific headers, even-though the x86 specific headers cannot be used.

This patch breaks up the clang-resource-headers list to a core list and platform specific lists. With the patch, a distribution build can now include the ppc-resource-headers to include the headers applicable to the powerpc platform.

Specifically, one can now have

cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;ppc-resource-headers" ... ../llvm
ninja install-distribution then installs the powerpc headers.

Similarly, one can do

cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;x86-resource-headers" ... ../llvm
to include headers applicable to the x86 platform in a distribution installation.

To implement this behaviour, the patch does two things:
* It breaks up the long files header file list to a core list and platform specific lists.
* It adds numerous platform specific installation targets.

Differential Revision: https://reviews.llvm.org/D123498
2022-04-19 10:10:07 -04:00
Sven van Haastregt f3ee0afc67 [OpenCL] opencl-c.h: Add const to get_image_num_samples
Align with the `-fdeclare-opencl-builtins` option and other
get_image_* builtins which have the const attribute.

Differential Revision: https://reviews.llvm.org/D122728
2022-04-19 10:16:44 +01:00
Sven van Haastregt 77c74fd877 [OpenCL] Remove argument names from math builtins
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the argument name identifiers.

Continues the direction set out in D119560.
2022-04-06 11:43:59 +01:00
Sven van Haastregt de30408b3b [OpenCL] opencl-c.h: remove a/b/c/i/p/n/v arg names
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" any single-letter identifiers.

Continues the direction set out in D119560.
2022-03-29 10:16:27 +01:00
Sven van Haastregt 677d0e7495 [OpenCL] opencl-c.h: remove x/y/z arg names
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "x", "y" and
"z".

Continues the direction set out in D119560.
2022-03-24 13:55:41 +00:00
Qiu Chaofan 895e5b2d80 [NFC] Format and uglify PowerPC intrinsics headers
This change formats PowerPC intrinsics wrapper headers into LLVM style,
and add extra prefix '__' to all variables to prevent conflict with user
code.
2022-03-24 21:14:55 +08:00
Qiu Chaofan 406bde9a15 [PowerPC] [Clang] Add SSE4 and BMI intrinsics implementation
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D119407
2022-03-24 20:03:08 +08:00
Sven van Haastregt 22548032be [OpenCL] opencl-c.h: remove arg names for vload/vstore builtins
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "data" and
"offset".

Continues the direction set out in D119560.
2022-03-23 11:12:50 +00:00
Alan Zhao 8cd8bd4a5c Implement __cpuid and __cpuidex as Clang builtins
https://reviews.llvm.org/D23944 implemented the #pragma intrinsic from
MSVC. This causes the statement #pragma intrinsic(cpuid) to fail [0]
on Clang because cpuid is currently implemented in intrin.h instead
of a Clang builtin. Reimplementing cpuid (as well as it's releated
function, cpuidex) should resolve this.

[0]: https://crbug.com/1279344

Differential revision: https://reviews.llvm.org/D121653
2022-03-18 18:13:52 +01:00
Simon Pilgrim 4e4f839ac2 [X86] Use the unaligned vector typedefs for the lddqu intrinsics pointer arguments (PR20670)
Extension to 4390c721cb - similar to the vanilla load/store intrinsics, _mm_lddqu_si128/_mm256_lddqu_si256 should take an unaligned pointer, but were using the aligned m128i/m256i types which can cause alignment warnings.

The existing sse3-builtins.c and avx-builtins.c tests in llvm-project\clang\test\CodeGen\X86 should cover this.

Differential Revision: https://reviews.llvm.org/D121815
2022-03-17 10:42:29 +00:00
Kazushi (Jam) Marukawa 9df395bb68 [Clang][VE] Add vector mask intrinsics to clang
Add vector mask intrinsics instructions to clang.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D121816
2022-03-17 18:52:28 +09:00
Thomas Lively 7e8913d775 [WebAssembly] Fix names of SIMD instructions containing '_zero'
Fix the instruction names to match the WebAssembly spec:

 - `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
 - `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`

Also rename related things like intrinsics, builtins, and test functions to
match.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D121661
2022-03-16 13:34:57 -07:00
Kazushi (Jam) Marukawa c2f62ab84b [Clang][VE] Add the rest of intrinsics to clang
Add the rest of intrinsics to clang except intrinsics using vector mask
registers.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D121586
2022-03-17 00:17:21 +09:00
Kazushi (Jam) Marukawa b1b4b6f366 [Clang][VE] Add vector load intrinsics
Add vector load intrinsic instructions for VE.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D121049
2022-03-12 09:09:57 +09:00
Timm Bäder e5ccd66801 [clang][sema] Enable first-class bool support for C2x
Implement N2395 for C2x.

This also covers adding "bool", which is part of N2394.

Differential Revision: https://reviews.llvm.org/D120244
2022-03-09 15:04:24 +01:00
Phoebe Wang 4de9a752d6 [X86] Add helper enum for ternary intrinsics
Reviewed By: RKSimon, LuoYuanke

Differential Revision: https://reviews.llvm.org/D120307
2022-03-08 11:19:05 +08:00
Kristina Bessonova 57aaab3b17 [NVPTX] Fix nvvm.match.sync*.i64 intrinsics return type (i64 -> i32)
NVVM IR specification defines them with i32 return type:

  declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
  declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
  ...
  The i32 return value is a 32-bit mask where bit position in mask corresponds
  to thread’s laneid.

as well as PTX ISA:

  9.7.12.8. Parallel Synchronization and Communication Instructions: match.sync

  match.any.sync.type  d, a, membermask;
  match.all.sync.type  d[|p], a, membermask;
  ...
  Destination d is a 32-bit mask where bit position in mask corresponds
  to thread’s laneid.

Additionally, ptxas doesn't accept intructions, produced by NVPTX backend.
After this patch, it compiles with no issues.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120499
2022-03-01 12:26:16 +02:00
Sven van Haastregt b48e3c805c [OpenCL] opencl-c.h: Fix incorrect get_image_width guard
`cl_khr_3d_image_writes` should not guard `read_only image3d_t`.
2022-02-25 11:05:56 +00:00
Sven van Haastregt 88182e2dfd [OpenCL] opencl-c.h: remove arg names for image builtins
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "image",
"image_array", "coord", "sampler", "sample", "gradientX", "gradientY",
"lod", and "color".

Continues the direction set out in D119560.
2022-02-24 11:52:32 +00:00
Sven van Haastregt aa9c2d19d9 [OpenCL] Align subgroup builtin guards
Until now, subgroup builtins are available with `opencl-c.h` when at
least one of `cl_intel_subgroups`, `cl_khr_subgroups`, or
`__opencl_c_subgroups` is defined.  With `-fdeclare-opencl-builtins`,
subgroup builtins are conditionalized on `cl_khr_subgroups` only.

Align `-fdeclare-opencl-builtins` to `opencl-c.h` by introducing the
internal `__opencl_subgroup_builtins` macro.

Differential Revision: https://reviews.llvm.org/D120254
2022-02-23 12:22:09 +00:00
Sven van Haastregt e7e17b30d0 [OpenCL] opencl-c.h: use uint/ulong consistently
Most places already seem to use the short spelling instead of
'unsigned int/long', so perform the following substitutions:

  s/unsigned int /uint /g
  s/unsigned long /ulong /g

This simplifies completeness comparisons against OpenCLBuiltins.td.

Differential Revision: https://reviews.llvm.org/D120032
2022-02-22 10:15:40 +00:00
Sven van Haastregt 52df866615 [OpenCL] opencl-c.h: remove arg names from atomics; NFC
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "success",
"failure", "desired", "value".

Differential Revision: https://reviews.llvm.org/D119560
2022-02-21 11:29:10 +00:00
Dávid Bolvanský 2c91754a13 [Clang] Add attributes alloc_size and alloc_align to mm_malloc
LLVM optimizes source codes with mm_malloc better, especially due to alignment info.

alloc align https://clang.llvm.org/docs/AttributeReference.html#alloc-align
alloc size https://clang.llvm.org/docs/AttributeReference.html#alloc-size

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D117091
2022-02-17 19:59:18 +01:00
Sven van Haastregt 477bc8e8b9 [OpenCL] Guard atomic_double with cl_khr_int64_*
It is necessary to guard atomic_double type according to
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#_footnotedef_54.
Platform that disable cl_khr_int64_base_atomics and
cl_khr_int64_extended_atomics will have compiling errors even if
atomic_double is not used.

Patch by Haonan Yang.

Differential Revision: https://reviews.llvm.org/D119398
2022-02-16 10:07:35 +00:00
Sven van Haastregt 074451bd33 [OpenCL] opencl-c.h: fix atomic_fetch_max with addrspace
Commit 3c7d2f1b67 ("[OpenCL] opencl-c.h: add CL 3.0 non-generic
address space atomics", 2021-07-30) added some atomic_fetch_add/sub
overloads with uintptr_t arguments twice.  Instead, they should have
been atomic_fetch_max overloads with non-generic address spaces.
2022-02-15 12:12:03 +00:00
Aaron Ballman a766545402 Update the diagnostic behavior of [[noreturn]] in C2x
Post-commit review feedback suggested dropping the deprecated
diagnostic for the 'noreturn' macro (the diagnostic from the header
file suffices and the macro diagnostic could be confusing) and to only
issue the deprecated diagnostic for [[_Noreturn]] when the attribute
identifier is either directly written or not from a system macro.

Amends the commit made in 5029dce492.
2022-02-14 14:04:32 -05:00
Aaron Ballman 5029dce492 Implement WG14 N2764 the [[noreturn]] attribute
This adds support for http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2764.pdf,
which was adopted at the Feb 2022 WG14 meeting. That paper adds
[[noreturn]] and [[_Noreturn]] to the list of supported attributes in
C2x. These attributes have the same semantics as the [[noreturn]]
attribute in C++.

The [[_Noreturn]] attribute was added as a deprecated feature so that
translation units which include <stdnoreturn.h> do not get an error on
use of [[noreturn]] because the macro expands to _Noreturn. Users can
use -Wno-deprecated-attributes to silence the diagnostic.

Use of <stdnotreturn.h> or the noreturn macro were both deprecated.
Users can define the _CLANG_DISABLE_CRT_DEPRECATION_WARNINGS macro to
suppress the deprecation diagnostics coming from the header file.
2022-02-14 09:38:26 -05:00
Sven van Haastregt 50f8abb9f4 [OpenCL] Add OpenCL 3.0 atomics to -fdeclare-opencl-builtins
Add the atomic overloads for the `global` and `local` address spaces,
which are new in OpenCL 3.0.  Ensure the preexisting `generic`
overloads are guarded by the generic address space feature macro.

Ensure a subset of the atomic builtins are guarded by the
`__opencl_c_atomic_order_seq_cst` and `__opencl_c_atomic_scope_device`
feature macros, and enable those macros for SPIR/SPIR-V targets in
`opencl-c-base.h`.

Also guard the `cl_ext_float_atomics` builtins with the atomic order
and scope feature macros.

Differential Revision: https://reviews.llvm.org/D119420
2022-02-11 10:14:14 +00:00
Simon Pilgrim 09857a4bd1 [X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions

This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:

__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_adds_epi8
  // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
  return _mm256_adds_epi8(a, b);
}
2022-02-08 15:00:10 +00:00
Simon Pilgrim a59faf272e Revert rG6c174ab2ad0676b295f11f6c3913eff9289fa6b9 "[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat"
Missed some legacy builtin tests that need cleaning up first
2022-02-08 14:45:28 +00:00
Simon Pilgrim 6c174ab2ad [X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions

This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:

__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
  // CHECK-LABEL: test_mm256_adds_epi8
  // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
  return _mm256_adds_epi8(a, b);
}
2022-02-08 14:21:20 +00:00
Sven van Haastregt 9b8a93e3b6 [OpenCL] opencl-c.h: remove arg names from arm_dot; NFC
This simplifies completeness comparisons against OpenCLBuiltins.td.
2022-02-08 13:42:24 +00:00
Sven van Haastregt c15782bcf5 [OpenCL] opencl-c.h: make attribute order consistent; NFC
For most builtins, `__purefn` always comes after `__ovld`, but the
read_image functions did not follow this pattern.
2022-02-07 10:54:55 +00:00
Piotr Kubaj f2f4080c10 [PowerPC] Fix SSE translation on FreeBSD
This patch drops throws specifier in posix_memalign declaration because
that's different between glibc and other libc, and Clang has a hack.

Differential Revision: https://reviews.llvm.org/D117972
2022-02-06 01:20:31 +08:00
tyb0807 51e188d079 [AArch64] Support for memset tagged intrinsic
This introduces a new ACLE intrinsic for memset tagged
(https://github.com/ARM-software/acle/blob/next-release/main/acle.md#memcpy-family-of-operations-intrinsics---mops).

  void *__builtin_arm_mops_memset_tag(void *, int, size_t)

A corresponding LLVM intrinsic is introduced:

  i8* llvm.aarch64.mops.memset.tag(i8*, i8, i64)

The types match llvm.memset but the return type is not void.

This is part 1/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.

Patch by Tomas Matheson

Differential Revision: https://reviews.llvm.org/D117753
2022-01-31 20:49:34 +00:00
Aaron Ballman a6cabd9802 Revert fad7e491a0 with fixes applied
fad7e491a0 was a revert of
86797fdb6f due to build failures. This
hopefully fixes them.
2022-01-29 08:12:16 -05:00
Jan Korous fad7e491a0 Revert "Add BITINT_MAXWIDTH support"
This reverts commit 86797fdb6f.

Differential Revision: https://reviews.llvm.org/D117238
2022-01-28 15:18:49 -08:00
Aaron Ballman 86797fdb6f Add BITINT_MAXWIDTH support
Part of the _BitInt feature in C2x
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf) is a new
macro in limits.h named BITINT_MAXWIDTH that can be used to determine
the maximum width of a bit-precise integer type. This macro must expand
to a value that is at least as large as ULLONG_WIDTH.

This adds an implementation-defined macro named __BITINT_MAXWIDTH__ to
specify that value, which is used by limits.h for the standard macro.

This also limits the maximum bit width to 128 bits because backends do
not currently support all mathematical operations (such as division) on
wider types yet. This maximum is expected to be increased in the future.
2022-01-28 15:04:29 -05:00
David Tenty 27ee91162d [AIX][clang] include_next through clang provided float.h
AIX provides additional definitions in the system libc float.h that we
would like to be available to users, so we need to include_next through,
similar to what is done on some other platforms.

We also adjust the guards for some definitions which are restricted
based on language level to also be provide with the _ALL_SOURCE feature
test macro on AIX, similar to what is done by the platform float.h
header, so we don't run into cases where we don't provide the compiler
macro but still have a different definition from the system.

Differential Revision: https://reviews.llvm.org/D117935
2022-01-28 13:27:10 -05:00
Sven van Haastregt bfd8210f6f [OpenCL] opencl-c.h: refactor named addrspace builtins
The named address space overloads of builtins that take a pointer
argument are conditionalized on the `__opencl_c_generic_address_space`
feature macro (in a `#else` body).  Introduce an internal feature
macro instead, such that their availability can be controlled in a
single place and independently of the generic address space feature
macro.

This commit does not change the available builtins.

Differential Revision: https://reviews.llvm.org/D118158
2022-01-28 10:24:47 +00:00
Anton Zabaznov a5de66c4c5 [OpenCL] Add support of __opencl_c_device_enqueue feature macro.
This feature requires support of __opencl_c_generic_address_space and
__opencl_c_program_scope_global_variables so diagnostics for that is provided as well.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D115640
2022-01-27 14:25:59 +03:00
Sven van Haastregt 35fff208ca [OpenCL] opencl-c.h: add missing read_write image guards
The get_image_num_mip_levels overloads that take a read_write image
parameter were missing the __opencl_c_read_write_images guard.
2022-01-27 10:33:12 +00:00
Sven van Haastregt 91a0b464a8 [OpenCL] Make read_write images optional for -fdeclare-opencl-builtins
Ensure any use of a `read_write` image is guarded behind the
`__opencl_c_read_write_images` feature macro.

Differential Revision: https://reviews.llvm.org/D117899
2022-01-25 11:40:31 +00:00