Commit Graph

1923 Commits

Author SHA1 Message Date
Joachim Meyer 5d689cf2a6 [NFC][CUDA] Fix order of round(f) definition in __clang_cuda_math.h for non-LP64.
This broke ARM builds e.g.: https://lab.llvm.org/buildbot/#/builders/187/builds/212
2021-07-02 21:55:48 +02:00
Brian Cain 28b01c59c9 [hexagon] Add {hvx,}hexagon_{protos,circ_brev...}
Add definitions for Hexagon, Hexagon circular/bit-reverse and HVX
intrinsics.
2021-06-30 22:58:56 -05:00
Xiang1 Zhang 6d234a6908 [X86] Zero some outputs of Kelocker intrinsics in error case
Reviewed By: WangPengfei

Differential Revision: https://reviews.llvm.org/D104766
2021-06-29 13:35:40 +08:00
Nemanja Ivanovic ef906573a1 [PowerPC] Fix vec_add for 64-bit on pre-Power7 subtargets
The shift of the carry was actually incorrect.
2021-06-24 18:42:44 -05:00
Ethan Stewart 5dfdc1812d [OpenMP][AMDGCN] Apply fix for isnan, isinf and isfinite for amdgcn.
This fixes issues with various return types(bool/int) and was already
in place for nvptx headers, adjusted to work for amdgcn. This does
not affect hip as the change is guarded with OPENMP_AMDGCN.
Similar to D85879.

Reviewed By: jdoerfert, JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D104677
2021-06-23 15:26:09 +01:00
Yaxun (Sam) Liu 186f2ac612 [HIP] Add support functions for C++ polymorphic types
Add runtime functions to detect invalid calls to pure or deleted virtual
functions.

Patch by: Siu Chi Chan

Reviewed by: Yaxun Liu

Differential Revision: https://reviews.llvm.org/D104392
2021-06-21 11:41:07 -04:00
Bing1 Yu 56d5c46b49 [X86] Support __tile_stream_loadd intrinsic for new AMX interface
Adding support for __tile_stream_loadd intrinsic.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D103784
2021-06-11 17:28:43 +08:00
Sven van Haastregt d54e7b731e [OpenCL] Add memory_scope_all_devices
Add the `memory_scope_all_devices` enum value, which is restricted to
OpenCL 3.0 or newer and the `__opencl_c_atomic_scope_all_devices`
feature.  Also guard `memory_scope_all_svm_devices` accordingly, which
is already available in OpenCL 2.0.

The `__opencl_c_atomic_scope_all_devices` feature is header-only, so
set its define to 1 in `opencl-c-base.h`.  This is done
unconditionally at the moment, as the mechanism for disabling
header-only options hasn't been decided yet.

This patch only adds a negative test for now.  Ideally adding a CL3.0
run line to atomic-ops.cl should suffice as a positive test, but we
cannot do that yet until (at least) generic address spaces and program
scope variables are supported in OpenCL 3.0 mode.

Differential Revision: https://reviews.llvm.org/D103241
2021-06-08 11:51:12 +01:00
Stuart Brady 9b14670f3c [OpenCL] Add const attribute to ctz() builtins
Reviewed By: svenvh

Differential Revision: https://reviews.llvm.org/D97725
2021-06-07 11:41:52 +01:00
Stuart Brady 86c24493ea [OpenCL][NFC] Test commit: tidy up whitespace in comment 2021-06-04 14:44:12 +01:00
Qiu Chaofan c0b3071833 [PowerPC] Fix x86 vector intrinsics wrapper compilation under C++
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D103386
2021-06-01 01:19:12 +08:00
Artem Belevich 9a75c06cd9 [CUDA] Work around compatibility issue with libstdc++ 11.1.0
libstdc++ redeclares __failed_assertion multiple times and that results in the
function declared with conflicting set of attributes when we include <complex>
with __host__ __device__ attributes force-applied to all functions.

In order to work around the issue, we rename __failed_assertion within the
region with forced attributes.

See https://bugs.llvm.org/show_bug.cgi?id=50383 for the details.

Differential Revision: https://reviews.llvm.org/D102936
2021-05-24 11:07:09 -07:00
Yaxun (Sam) Liu 91dfd68e90 [NFC][HIP] fix comments in __clang_hip_cmath.h 2021-05-21 17:44:18 -04:00
Nemanja Ivanovic 7cd2833311 [PowerPC] Add vec_vupkhpx and vec_vupklpx for XL compatibility
These are old names for these functions that XL still supports.
2021-05-14 08:02:00 -05:00
Aaron En Ye Shi a249ffa421 [HIP] Clean up llvm intrinsics using __asm
Instead of using inline asm, use clang builtins
for llvm intrinsics.

Differential Revision: https://reviews.llvm.org/D102427
2021-05-13 18:55:51 +00:00
Nemanja Ivanovic 39e4676ca7 [PowerPC] Provide doubleword vector predicate form comparisons on Power7
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*

*Without this patch, the following causes a selection error:

int test(vector signed long a, vector signed long b) {
  return a < b;
}

This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
2021-05-13 04:56:56 -05:00
Anastasia Stulova 58d18dde5c [OpenCL] Remove pragma requirement from Arm dot extension.
This removed the pointless need for extension pragma since
it doesn't disable anything properly and it doesn't need to
enable anything that is not possible to disable.

The change doesn't break existing kernels since it allows to
compile more cases i.e. without pragma statements but the
pragma continues to be accepted.

Differential Revision: https://reviews.llvm.org/D100985
2021-05-12 16:25:33 +01:00
Thomas Lively 1e9c39a3f9 [WebAssembly] Use functions instead of macros for const SIMD intrinsics
To improve hygiene, consistency, and usability, it would be good to replace all
the macro intrinsics in wasm_simd128.h with functions. The reason for using
macros in the first place was to enforce the use of constants for some arguments
using `_Static_assert` with `__builtin_constant_p`. This commit switches to
using functions and uses the `__diagnose_if__` attribute rather than
`_Static_assert` to enforce constantness.

The remaining macro intrinsics cannot be made into functions until the builtin
functions they are implemented with can be replaced with normal code patterns
because the builtin functions themselves require that their arguments are
constants.

This commit also fixes a bug with the const_splat intrinsics in which the f32x4
and f64x2 variants were incorrectly producing integer vectors.

Differential Revision: https://reviews.llvm.org/D102018
2021-05-07 11:50:19 -07:00
Thomas Lively b198b9b897 [WebAssembly] Fix argument types in SIMD narrowing intrinsics
The builtins were updated to take signed parameters in 627a526955, but the
intrinsics that use those builtins were not updated as well. The intrinsic test
did not catch this sign mismatch because it is only reported as an error under
-fno-lax-vector-conversions.

This commit fixes the type mismatch and adds -fno-lax-vector-conversions to the
test to catch similar problems in the future.

Differential Revision: https://reviews.llvm.org/D101979
2021-05-06 10:07:45 -07:00
Nemanja Ivanovic 1faf3b195e [PowerPC] Re-commit ed87f512bb
This was reverted in 3761b9a234 just
as I was about to commit the fix. This patch inlcudes the
necessary fix.
2021-05-06 09:50:12 -05:00
Nico Weber 3761b9a234 Revert "[PowerPC] Provide some P8-specific altivec overloads for P7"
This reverts commit ed87f512bb.
Breaks check-clang, see e.g.
https://lab.llvm.org/buildbot/#/builders/139/builds/3818
2021-05-06 10:01:16 -04:00
Nemanja Ivanovic ed87f512bb [PowerPC] Provide some P8-specific altivec overloads for P7
This adds additional support for XL compatibility. There are a number
of functions in altivec.h that produce a single instruction (or a
very short sequence) for Power8 but can be done on Power7 without
scalarization. XL provides these implementations.
This patch adds the following overloads for doubleword vectors:
vec_add
vec_cmpeq
vec_cmpgt
vec_cmpge
vec_cmplt
vec_cmple
vec_sl
vec_sr
vec_sra
2021-05-06 08:37:36 -05:00
Johannes Doerfert 5d8d994dfb [OpenMP] Make sure classes work on the device as they do on the host
We do provide `operator delete(void*)` in `<new>` but it should be
available by default. This is mostly boilerplate to test it and the
unconditional include of `<new>` in the header we always in include
on the device.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100620
2021-05-06 02:10:30 -05:00
Thomas Lively 81fce29d6e [WebAssembly] Add SIMD const_splat intrinsics
These intrinsics do not correspond to their own underlying instruction, but are
a convenience for the common case of materializing a constant vector that has
the same value in each lane.

Differential Revision: https://reviews.llvm.org/D101885
2021-05-05 13:46:45 -07:00
Thomas Lively 602f318cfd [WebAssembly] Fix constness of pointer params to load intrinsics
Update the SIMD builtin load functions to take pointers to const data and update
the intrinsics themselves to not cast away constness.

Differential Revision: https://reviews.llvm.org/D101884
2021-05-05 13:16:56 -07:00
Nemanja Ivanovic bfd60b36f8 [PowerPC] Add floating point overloads for vec_sldw
These are added for compatibility with XLC.
2021-04-30 20:29:03 -05:00
Nemanja Ivanovic c3da07d216 [PowerPC] Provide fastmath sqrt and div functions in altivec.h
This adds the long overdue implementations of these functions
that have been part of the ABI document and are now part of
the "Power Vector Intrinsic Programming Reference" (PVIPR).

The approach is to add new builtins and to emit code with
the fast flag regardless of whether fastmath was specified
on the command line.

Differential revision: https://reviews.llvm.org/D101209
2021-04-30 19:17:48 -05:00
Anastasia Stulova 3ec82e5195 [OpenCL] Prevent adding vendor extensions for all targets
Removed extension begin/end pragma as it has no effect and
it is added unconditionally for all targets.

Differential Revision: https://reviews.llvm.org/D92244
2021-04-30 14:42:51 +01:00
Wang, Pengfei e0c7db7d8c [MS] Preserve base register %rbx around cpuid
This patch copies implementation from cpuid.h, which preserve base register %rbx around cpuid. It fixes PR50133.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101338
2021-04-30 10:16:25 +08:00
Luo, Yuanke d6c6db2fea [X86][AMX] Add description for AMX new interface.
Differential Revision: https://reviews.llvm.org/D101059
2021-04-27 16:05:11 +08:00
Thomas Lively 502f54049d [WebAssembly] Finalize wasm_simd128.h intrinsics
Adds new intrinsics for instructions that are in the final SIMD spec but did not
previously have intrinsics. Also updates the names of existing intrinsics to
reflect the final names of the underlying instructions in the spec. Keeps the
old names as deprecated functions to ease the transition to the new names.

Differential Revision: https://reviews.llvm.org/D101112
2021-04-23 13:37:27 -07:00
Nemanja Ivanovic 19b29b1ed1 [PowerPC] Provide XL-compatible builtins in altivec.h
There are some interfaces in altivec.h that are not compatible
between Clang and XL (although Clang is compatible with GCC).
Currently, we have found 3 but there may be others.

Clang/GCC signatures:

vector double vec_ctf(vector signed long long)
vector double vec_ctf(vector unsigned long long)
vector signed long long vec_cts(vector double)
vector unsigned long long vec_ctu(vector double)

XL signatures:

vector float vec_ctf(vector signed long long)
vector float vec_ctf(vector unsigned long long)
vector signed int vec_cts(vector double)
vector unsigned int vec_ctu(vector double)

This patch provides the XL behaviour under the __XL_COMPAT_ALTIVEC__
macro for users that rely on XL behaviour.

Differential revision: https://reviews.llvm.org/D101130
2021-04-23 15:13:46 -05:00
Nemanja Ivanovic 6725b90a02 [PowerPC] Add vec_ctsl and vec_ctul to altivec.h
These are added for compatibility with XLC. They are similar to
vec_cts and vec_ctu except that the result is a doubleword vector
regardless of the parameter type.
2021-04-23 11:03:38 -05:00
Wang, Pengfei e8bce83996 [X86] Enable compilation of user interrupt handlers.
Add __uintr_frame structure and use UIRET instruction for functions with
x86 interrupt calling convention when UINTR is present.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D99708
2021-04-23 11:43:57 +08:00
Yaxun (Sam) Liu 8baba6890d [HIP] Support overloaded math functions for hipRTC
Remove the dependence on standard C++ header
for overloaded math functions in HIP header
since standard C++ header is not available for hipRTC.

Reviewed by: Artem Belevich, Justin Lebar

Differential Revision: https://reviews.llvm.org/D100794
2021-04-22 19:06:51 -04:00
Nemanja Ivanovic 7a5641d651 [PowerPC] Add missing casts for vec_xlds and vec_load_splats
The previous commits just missed some pointer casts and ended up
producing warnings.
2021-04-22 10:31:00 -05:00
Nemanja Ivanovic 1cc1d9db28 [PowerPC] Add vec_vclz as an alias for vec_cntlz in altivec.h
Another addition for compatibility with XLC. The functions have the
same overloads so just add it as a preprocessor define.
2021-04-22 10:31:00 -05:00
Nemanja Ivanovic e43963db24 [PowerPC] Add vec_load_splats to altivec.h
Add these overloads for compatibility with XLC. This is a word
load-and-splat.
2021-04-22 10:31:00 -05:00
Nemanja Ivanovic a0e6189712 [PowerPC] Add vec_xlds to altivec.h
Add these overloads for compatibility with XLC. This is a doubleword
load-and-splat.
2021-04-22 10:31:00 -05:00
Nemanja Ivanovic a1d325af67 [PowerPC] Add vec_roundz as alias for vec_trunc in altivec.h
Add the overloads for compatibility with XLC.
2021-04-22 10:31:00 -05:00
Nemanja Ivanovic 1550c47c18 [PowerPC] Add vec_roundp as alias for vec_ceil
Add the overloads for compatibility with XLC.
2021-04-22 10:30:59 -05:00
Nemanja Ivanovic 51692c6c63 [PowerPC] Add missing VSX guard for vec_roundm with vector double
The guard was missed in the previous commit.
2021-04-22 10:30:59 -05:00
Nemanja Ivanovic 3a46667059 [PowerPC] Add vec_roundm as alias for vec_floor in altivec.h
Add the overloads for compatibility with XLC.
2021-04-22 10:30:59 -05:00
Nemanja Ivanovic 3bcd0ece43 [PowerPC] Add vec_roundc as alias for vec_rint in altivec.h
For compatibility with XLC, add these overloads.
2021-04-22 05:31:38 -05:00
Liu, Chen3 72e4bf12ee [X86] Support some missing intrinsics
Support for _mm512_i32logather_pd, _mm512_mask_i32logather_pd,
_mm512_i32logather_epi64, _mm512_mask_i32logather_epi64, _mm512_i32loscatter_pd,
_mm512_mask_i32loscatter_pd, _mm512_i32loscatter_epi64,
_mm512_mask_i32loscatter_epi64.

Differential Revision: https://reviews.llvm.org/D100368
2021-04-21 10:50:37 +08:00
Yaxun (Sam) Liu d8805574c1 [CUDA][HIP] Allow non-ODR use of host var in device
Reviewed by: Artem Belevich, Richard Smith

Differential Revision: https://reviews.llvm.org/D98193
2021-04-19 14:45:24 -04:00
Yaxun (Sam) Liu 6823af0ca8 [HIP] Support hipRTC in header
hipRTC compiles HIP device code at run time. Since the system may not
have development tools installed, when a HIP program is compiled through
hipRTC, there is no standard C or C++ header available. As such, the HIP
headers should not depend on standard C or C++ headers when used
with hipRTC. Basically when hipRTC is used, HIP headers only provides
definitions of HIP device API functions. This is in line with what nvRTC does.

This patch adds support of hipRTC to HIP headers in clang. Basically hipRTC
defines a macro __HIPCC_RTC__ when compile HIP code at run time. When
this macro is defined, HIP headers do not include standard C/C++ headers.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D100652
2021-04-17 11:34:52 -04:00
Sven van Haastregt 35bc7569f8 [OpenCL] Add as_size/ptrdiff/intptr/uintptr_t operators
size_t and friends are built-in scalar data types and s6.4.4.2 of the
OpenCL C Specification says the as_type() operator must be available
for these data types.

Differential Revision: https://reviews.llvm.org/D98959
2021-04-07 10:16:41 +01:00
Yaxun (Sam) Liu 85ff35a952 [HIP] remove overloaded abs in header
This function seems to be introduced by accident by
aa2b593f14

Such overloaded abs function did not exist before
the refactoring, and does not exist in
https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_cmath.h

Conceptually it also does not make sense, since it adds something like

double abs(int x) {
  return ::abs((double)x);
}

It caused regressions in CuPy.

Reviewed by: Aaron Enye Shi, Artem Belevich

Differential Revision: https://reviews.llvm.org/D99738
2021-04-01 12:23:29 -04:00
Sven van Haastregt b5995fced4 [OpenCL] Limit popcount to OpenCL 1.2 and above
s6.15.3 of the OpenCL C Specification v3.0.6 states that OpenCL 1.2 or
newer is required.
2021-03-31 09:54:18 +01:00
Craig Topper 3fb40ce167 [X86] Don't define vpclmulqdq or vaes intrinsics in the headers unless avx512fintrin.h has been included.
The intrinsics won't compile unless avx512fintrin.h has declared
the 512 bit types.
2021-03-28 11:26:30 -07:00
Zakk Chen 821547cabb [RISCV][Clang] Update new overloading rules for RVV intrinsics.
RVV intrinsics has new overloading rule, please see
82aac7dad4

Changed:
1. Rename `generic` to `overloaded` because the new rule is not using C11 generic.
2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations
   support overloading api.
3. Add more overloaded tests due to overloading rule changed.

Differential Revision: https://reviews.llvm.org/D99189
2021-03-28 09:04:35 -07:00
Nemanja Ivanovic 06411edb9f [PowerPC][NFC] Provide legacy names for VSX loads and stores
Before we unified the names of the builtins across all the
compilers, there were a number of synonyms between them. There
is code out there that uses XL naming for some of these loads and
stores. This just adds those names.
2021-03-25 06:32:40 -05:00
Nemanja Ivanovic 4020932706 [PowerPC] Make altivec.h work with AIX which has no __int128
There are a number of functions in altivec.h that use
vector __int128 which isn't supported on AIX. Those functions
need to be guarded for targets that don't support the type.
Furthermore, the functions that produce quadword instructions
without using the type need a builtin. This patch adds the
macro guards to altivec.h using the __SIZEOF_INT128__ which
is only defined on targets that support the __int128 type.
2021-03-24 00:35:51 -05:00
Nemanja Ivanovic 4146864735 [PowerPC][NFC] Use valid type for offset in altivec.h
We currently use signed long long instead of ptrdiff_t for offsets
in altivec.h. This has never really presented a problem because
all platforms where we use these are 64-bit. However, now that
we have 32-bit targets, we need to use a meaningful type.
2021-03-23 08:45:37 -05:00
Nemanja Ivanovic 2f782a796a [PowerPC] Add more missing overloads to altivec.h
Add overloads that perform subtraction on v1i128 that take and
produce vector unsigned char to avoid needing to use __int128.
The overloads are suffixed with _u128 and are needed for targets
where __int128 isn't supported (AIX).
2021-03-23 05:52:36 -05:00
Sven van Haastregt 1c6521a0dd [OpenCL] Remove mixed signedness atomic_fetch_ from opencl-c.h
The OpenCL C specification v3.0.6 s6.15.12.7.5 mentions:

    For atomic_fetch and modify functions with key = or, xor, and, min
    and max on atomic type atomic_intptr_t, M is intptr_t, and on
    atomic type atomic_uintptr_t, M is uintptr_t.

Remove the atomic_fetch_* overloads from opencl-c.h that mix intptr_t
and uintptr_t in the same declaration.

Differential Revision: https://reviews.llvm.org/D98418
2021-03-23 10:20:13 +00:00
Nemanja Ivanovic 54e4654f04 [PowerPC] Add more missing overloads to altivec.h
Add overloads that perform addition on v1i128 that take and produce
vector unsigned char to avoid needing to use __int128. The overloads
are suffixed with _u128 and are needed for targets where __int128
isn't supported (AIX).
2021-03-23 05:09:19 -05:00
Nemanja Ivanovic 10cc5bcd86 [PowerPC] Add more missing overloads to altivec.h
Add vec_permi as a synonym for vec_xxpermdi (but only for
doubleword vectors).
2021-03-22 23:09:41 -05:00
Nemanja Ivanovic b5e96e0ad6 [PowerPC] Add more missing overloads to altivec.h
Add vec_gbb as a synonym for vec_vgbbd but for doubleword vectors.
2021-03-22 22:25:28 -05:00
Nemanja Ivanovic d8e574c8e6 [PowerPC] Add more missing overloads to altivec.h
Add vec_cvf as a synonym for vec_doublee/vec_floate.
2021-03-22 22:08:43 -05:00
Nemanja Ivanovic bef2cb9062 [PowerPC] Add more missing overloads to altivec.h
Add vec_ctd which is similar to vec_ctf except the return type is
vector double rather than vector float.
2021-03-22 20:23:07 -05:00
Thomas Lively cbab2cd6bf [WebAssembly] Remove experimental instructions from wasm_simd128.h
These experimental builtin functions and the feature macro they were gated
behind have been removed.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D98907
2021-03-18 17:13:50 -07:00
Thomas Lively f5764a8654 [WebAssembly] Finalize SIMD names and opcodes
Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all
SIMD instructions to match the finalized SIMD spec. Deliberately does not change
the public interface in wasm_simd128.h yet; that will require more care.

Depends on D98466.

Differential Revision: https://reviews.llvm.org/D98676
2021-03-18 11:21:25 -07:00
Bing1 Yu 320b72e9cd [X86][AMX] Rename amx-bf16 intrinsic according to correct naming convention
__tile_tdpbf16ps should be renamed with __tile_dpbf16ps

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D98685
2021-03-17 11:22:52 +08:00
Stelios Ioannou ab86edbc88 [AArch64] Implement __rndr, __rndrrs intrinsics
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
number instructions introduced in Armv8.5-A. They are only defined for the AArch64
execution state and are available when __ARM_FEATURE_RNG is defined.

These intrinsics store the random number in their pointer argument and return a status
code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter
intrinsic reseeds the random number generator.

The instructions write the NZCV flags indicating the success of the operation that we can
then read with a CSET.

[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838

Differential Revision: https://reviews.llvm.org/D98264

Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
2021-03-15 17:51:48 +00:00
Nemanja Ivanovic b5fae4b9b2 [PowerPC] Add more missing overloads to altivec.h
We are missing more predicate forms for 'vector double' and some
tests. This adds the missing overloads and completes the set of
test cases for them.
2021-03-12 10:51:57 -06:00
Zakk Chen d6a0560bf2 [Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
Demonstrate how to generate vadd/vfadd intrinsic functions

1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D95016
2021-03-10 18:43:43 -08:00
Jingu Kang 25951c5ab8 [AArch64] Add missing intrinsics for scalar FP rounding
Differential Revision: https://reviews.llvm.org/D98269
2021-03-10 13:22:29 +00:00
Nemanja Ivanovic f4ad7a1a15 [PowerPC] Add missing double precision vec_all overloads to altivec.h
We somehow missed vec_all_nlt, vec_all_nle and vec_all_numeric
overloads for double precision vectors when VSX is enabled.
2021-03-05 18:42:12 -06:00
Nemanja Ivanovic 1ff93618e5 [PowerPC] Add missing overloads of vec_promote to altivec.h
The VSX-only overloads (for 8-byte element vectors) are missing.
Add the missing overloads and convert element numbering to
modulo arithmetic to match GCC and XLC.
2021-03-01 21:40:30 -06:00
Nemanja Ivanovic 38a34e207f [PowerPC] Use modulo arithmetic for vec_extract in altivec.h
These interfaces are not covered in the ELFv2 ABI but are rather
implemented to emulate those available in GCC/XLC. However, the
ones in the other compilers are documented to perform modulo
arithmetic on the element number. This patch just brings clang
inline with the other compilers at -O0 (with optimization, clang
already does the right thing).
2021-03-01 19:49:26 -06:00
Artem Belevich 32e0645276 [CUDA] Remove `noreturn` attribute from __assertfail().
`noreturn` complicates control flow and tends to trigger a known bug in ptxas if
the assert is used within loops in sufficiently complicated code.
https://bugs.llvm.org/show_bug.cgi?id=27738

Differential Revision: https://reviews.llvm.org/D97708
2021-03-01 13:59:22 -08:00
Liu, Chen3 4bc7c8631a [X86] Support amx-bf16 intrinsic.
Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.

Differential Revision: https://reviews.llvm.org/D97358
2021-02-25 09:06:48 +08:00
Sven van Haastregt 612d0ef173 [OpenCL] Move remaining defines to opencl-c-base.h
Move any remaining preprocessor defines from `opencl-c.h` to
`opencl-c-base.h`, such that they are shared with
`-fdeclare-opencl-builtins` too.

In particular, move:
 - the `as_type` and `as_typen` definitions, and
 - the `kernel_exec` and `__kernel_exec` definitions.

Also clang-format the changes.

Differential Revision: https://reviews.llvm.org/D96948
2021-02-23 10:18:14 +00:00
Liu, Chen3 f8b9035aae [X86] Support amx-int8 intrinsic.
Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.

Differential Revision: https://reviews.llvm.org/D97259
2021-02-23 17:08:05 +08:00
Sven van Haastregt 5a4a01460f [OpenCL] Move printf declaration to opencl-c-base.h
Supporting `printf` with `-fdeclare-opencl-builtins` would require
special handling (for e.g. varargs and format attributes) for just
this one function.  Instead, move the `printf` declaration to the
shared base header.

Differential Revision: https://reviews.llvm.org/D96789
2021-02-18 11:27:19 +00:00
Wang, Pengfei 61da20575d [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
This is a follow up of D92940.

We have successfully converted fadd/fmul _mm_reduce_* intrinsics to
llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax
too, i.e. llvm.reduction + nnan flag.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93179
2021-02-15 08:52:06 +08:00
Jonas Paulsson b3ac5b84cd [SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst.
vec_xl() and vec_xst() should not emit alignment hints since they take a
scalar pointer and also add a byte offset if passed.

This patch uses memcpy to achieve the desired result.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D96471
2021-02-12 18:26:36 -06:00
Wang, Pengfei dd2460ed5d [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.
Intrinsics *reduce_add/mul_ps/pd have assumption that the elements in
the vector are reassociable. So we need to always assign the reassoc
flag when we call _mm_reduce_* intrinsics.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D96231
2021-02-09 21:14:06 +08:00
Anton Zabaznov d88c55ab95 [OpenCL] Add macro definitions of OpenCL C 3.0 features
This patch adds possibility to define OpenCL C 3.0 feature macros
via command line option or target setting.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D95776
2021-02-05 18:42:25 +03:00
Yaxun (Sam) Liu 0211877a07 [HIP] Add __managed__ macro to header 2021-02-04 16:22:42 -05:00
Wolfgang Pieb 231a82a150 [X86] Correct some cross references in avxintrin.h. 2021-01-25 18:49:28 -08:00
Wolfgang Pieb 350395d82f [x86] Fix trivial typo in emmintrin.h 2021-01-25 17:28:05 -08:00
Michael Liao 7b5d7c7b0a [hip] Fix `<complex>` compilation on Windows with VS2019.
Differential Revision: https://reviews.llvm.org/D95075
2021-01-20 16:43:44 -05:00
Luo, Yuanke 7e1d2224b4 [X86][AMX] Fix the typo.
The dpbsud should be dpbssd.

Differential Revision: https://reviews.llvm.org/D94943
2021-01-19 16:57:34 +08:00
Aaron En Ye Shi be40c12040 [HIP] Add signbit(long double) decl
An _MSC_VER version of signbit(long double) is required for MSVC headers.

Fixes: SWDEV-256409

Differential Revision: https://reviews.llvm.org/D93062
2021-01-14 18:23:37 +00:00
Lucas Prates 2b1e25befe [AArch64] Adding ACLE intrinsics for the LS64 extension
This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes
atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`,
and `__arm_st64bv0`. These are selected into the LS64 instructions
LD64B, ST64B, ST64BV and ST64BV0, respectively.

Based on patches written by Simon Tatham.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D93232
2021-01-14 09:43:58 +00:00
Esme-Yi ffa67873a3 [PowerPC] Add variants of 64-bit vector types for vec_sel.
Summary: This patch added variants of vec_sel and fixed bugzilla 46770.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D94162
2021-01-11 03:52:16 +00:00
Michael Liao f78d6af731 [hip] Enable HIP compilation with `<complex`> on MSVC.
- MSVC has different `<complex>` implementation which calls into functions
  declared in `<ymath.h>`. Provide their device-side implementation to enable
  `<complex>` compilation on HIP Windows.

Differential Revision: https://reviews.llvm.org/D93638
2021-01-07 17:41:28 -05:00
Luo, Yuanke 08665b1805 Support tilezero intrinsic and c interface for AMX.
Differential Revision: https://reviews.llvm.org/D92837
2020-12-31 13:24:57 +08:00
Michael Liao bb8d20d9f3 [cuda][hip] Fix typoes in header wrappers. 2020-12-21 13:02:47 -05:00
Simon Pilgrim 4855a1004d [X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well.

Differential Revision: https://reviews.llvm.org/D92940
2020-12-13 15:37:35 +00:00
Anastasia Stulova a84599f177 [OpenCL] Implement extended subgroups fully in headers.
Extended subgroups are library style extensions and therefore
they require no changes in the frontend. This commit:
1. Moves extension macro definitions to the internal headers.
2. Removes extension pragmas because they are not needed.

Tags: #clang

Differential Revision: https://reviews.llvm.org/D92231
2020-12-10 16:40:15 +00:00
Luo, Yuanke f80b29878b [X86] AMX programming model.
This patch implements amx programming model that discussed in llvm-dev
 (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html).
 Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet.
 This patch implemeted 7 components.

1. The c interface to end user.
2. The AMX intrinsics in LLVM IR.
3. Transform load/store <256 x i32> to AMX intrinsics or split the
   type into two <128 x i32>.
4. The Lowering from AMX intrinsics to AMX pseudo instruction.
5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx
   intruction.
6. The register allocation for tile register.
7. Morph AMX pseudo instruction to AMX real instruction.

Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0

Differential Revision: https://reviews.llvm.org/D87981
2020-12-10 17:01:54 +08:00
Masoud Ataei fc750f609d [PPC] Fixing a typo in altivec.h. Commenting out an unnecessary macro 2020-12-08 19:21:02 +00:00
Artem Belevich 4326792942 [CUDA] Another attempt to fix early inclusion of <new> from libstdc++
Previous patch (9a465057a6) did not fix the problem.
https://bugs.llvm.org/show_bug.cgi?id=48228

If the <new> is included too early, before CUDA-specific defines are available,
just include-next the standard <new> and undo the include guard.  CUDA-specific
variants of operator new/delete will be declared if/when <new> is used from the
CUDA source itself, when all CUDA-related macros are available.

Differential Revision: https://reviews.llvm.org/D91807
2020-12-04 12:03:35 -08:00
Martin Storsjö c17fdca188 [clang] [Headers] Use the corresponding _aligned_free or __mingw_aligned_free in _mm_free
Differential Revision: https://reviews.llvm.org/D92570
2020-12-04 11:34:12 +02:00
Aaron En Ye Shi ba2612ce01 [HIP] cmath demote long double args to double
Since there is no ROCm Device Library support for
long double, demote them to double, and use the fp64
math functions.

Differential Revision: https://reviews.llvm.org/D92130
2020-12-03 23:00:14 +00:00
Reid Kleckner 1e843a987d [MS] Add more 128bit cmpxchg intrinsics for AArch64
The MSVC STL for requires this on ARM64.
Requested in https://llvm.org/pr47099

Depends on D92061

Differential Revision: https://reviews.llvm.org/D92062
2020-11-25 12:07:28 -08:00
Artem Belevich 9a465057a6 [CUDA] Unbreak CUDA compilation with -std=c++20
Standard libc++ headers in stdc++ mode include <new> which picks up
cuda_wrappers/new before any of the CUDA macros have been defined.

We can not include CUDA headers that early, so the work-around is to define
__device__ in the wrapper header itself.

Differential Revision: https://reviews.llvm.org/D91807
2020-11-19 10:35:47 -08:00
Sven van Haastregt f0c690018a [OpenCL] Stop opencl-c-base.h leaking extension enabling
opencl-c.h disables all extensions at its end, but opencl-c-base.h
does not, and that causes any inclusion of only opencl-c-base.h to
leave some extensions (such as cl_khr_fp16) enabled.  This affects the
-fdeclare-opencl-builtins option for example.

This violates the OpenCL Extension Specification which specifies that
"The initial state of the compiler is as if the directive #pragma
OPENCL EXTENSION all : disable was issued".

Fix by disabling all extensions at the end of opencl-c-base.h and
enable extensions inside opencl.h which relied on opencl-c-base.h
enabling the cl_khr_fp16/64 extensions.

Differential Revision: https://reviews.llvm.org/D91429
2020-11-17 12:07:40 +00:00
Roland McGrath cf36142d34 [clang] Add missing header guard in <cpuid.h>
This header has long lacked a standard multiple inclusion guard
like other headers have, for no apparent reason.  The GCC header
of the same name likewise lacks one up through release 10.1, but
trunk GCC (release 11, and perhaps future 10.x) has fixed it
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96238).

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D91226
2020-11-10 19:34:25 -08:00
Qiu Chaofan 979a4d268a [PowerPC] [Clang] Port SSE4.1-compatible insert intrinsics
This patch adds three intrinsics compatible to x86's SSE 4.1 on PowerPC
target, with tests:

- _mm_insert_epi8
- _mm_insert_epi32
- _mm_insert_epi64

The intrinsics implementation is contributed by Paul Clarke.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D89242
2020-11-10 10:52:13 +08:00
Freddy Ye 5e312e0041 [X86] use macros to split GFNI intrinsics into different kinds
Tremont microarchitecture only has GFNI(SSE) version, not AVX and
AVX512 version. This patch is to avoid compiling fail on Windows when
using -march=tremont to invoke one of GFNI(SSE) intrinsic.

Differential Revision: https://reviews.llvm.org/D90822
2020-11-06 16:03:38 +08:00
Albion Fung 1af037f643 [PowerPC] Correct cpsgn's behaviour on PowerPC to match that of the ABI
This patch fixes the reversed behaviour exhibited by cpsgn on PPC. It now matches the ABI.

Differential Revision: https://reviews.llvm.org/D84962
2020-11-05 15:35:14 -05:00
Aaron En Ye Shi ca5b31502c [HIP] Math Headers to use type promotion
Similar to libcxx implementation of cmath function
overloads, use type promotion templates to determine
return types of multi-argument math functions.

Fixes: SWDEV-256825

Reviewed By: tra, yaxunl

Differential Revision: https://reviews.llvm.org/D90409
2020-11-03 18:40:26 +00:00
Liu, Chen3 756f597841 [X86] Support Intel avxvnni
This patch mainly made the following changes:

1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.

Differential Revision: https://reviews.llvm.org/D89105
2020-10-31 12:39:51 +08:00
Joachim Meyer eaee608448 [OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in complex wrapper headers.
This is very similar to 7f1e6fcff9, just fixing a left-over.
With this, it should be possible to use both, -x cuda and -fopenmp in the same invocation,
enabling to use both OpenMP, targeting CPU, and CUDA, targeting the GPU.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D90415
2020-10-29 23:24:49 +01:00
Johannes Doerfert 17c8251bca [OpenMP][CUDA][FIX] Use the new `remquo` overload only for OpenMP
CUDA buildbots complained about a redefinition when I landed D89971.
This is odd and I fail to understand where in the CUDA headers the other
definition is supposed to be. For now, given that CUDA doesn't need the
overload (AFAIKT), we simply restrict it to the OpenMP mode.
2020-10-27 23:52:59 -05:00
Johannes Doerfert b1a90e1599 [OpenMP][CUDA] Add missing overload for `remquo(float,float,int*)`
Reported by Colleen Bertoni <bertoni@anl.gov> after running the OvO test
suite: https://github.com/TApplencourt/OvO/

The template overload is still hidden behind an ifdef for OpenMP. In the
future we probably want to remove the ifdef but that requires further
testing.

Reviewed By: JonChesterfield, tra

Differential Revision: https://reviews.llvm.org/D89971
2020-10-27 19:12:51 -05:00
Aaron En Ye Shi 3700556ecb [HIP][NFC] Use correct max in cuda_complex_builtins
Update the clang complex builtins for OpenMP to use the
correct max function from either __nv_* or __ocml_*.
2020-10-27 19:35:09 +00:00
Aaron En Ye Shi b2524eb944 [HIP] Fix HIP rounding math intrinsics
The __ocml_*_rte_f32 and __ocml_*_rte_f64 functions are not
available if OCML_BASIC_ROUNDED_OPERATIONS is not defined.

Reviewed By: b-sumner, yaxunl

Fixes: SWDEV-257235

Differential Revision: https://reviews.llvm.org/D89966
2020-10-22 15:57:09 +00:00
Tianqing Wang be39a6fe6f [X86] Add User Interrupts(UINTR) instructions
For more details about these instructions, please refer to the latest
ISE document:
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89301
2020-10-22 17:33:07 +08:00
Albion Fung d30155feaa [PowerPC] Implementation of 128-bit Binary Vector Rotate builtins
This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10.

Differential Revision: https://reviews.llvm.org/D86819
2020-10-16 18:03:22 -04:00
Simon Pilgrim 6c23cbc560 [X86] Convert integer _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences.

The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions.

Differential Revision: https://reviews.llvm.org/D87604
2020-10-13 09:28:39 +01:00
Wang, Pengfei 412cdcf2ed [X86] Add HRESET instruction.
For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89102
2020-10-13 08:47:26 +08:00
Aaron En Ye Shi 8d2a0c115e [HIP] NFC Add comments to cmath functions
Add missing comments to cmath functions.

Differential Revision: https://reviews.llvm.org/D88837
2020-10-06 15:26:56 +00:00
Aaron En Ye Shi aa2b593f14 [HIP] Restructure hip headers to add cmath
Separate __clang_hip_math.h header into __clang_hip_cmath.h
and __clang_hip_math.h. Improve the math function definition,
and add missing definitions or declarations. Add missing
overloads.

Reviewed By: tra, JonChesterfield

Differential Review: https://reviews.llvm.org/D88837
2020-10-06 14:48:53 +00:00
Craig Topper a02b449bb1 [X86] Sync AESENC/DEC Key Locker builtins with gcc.
For the wide builtins, pass a single input and output pointer to
the builtins. Emit the GEPs and input loads from CGBuiltin.
2020-10-04 12:09:41 -07:00
Craig Topper 230c57b0bd [X86] Synchronize the encodekey builtins with gcc. Don't assume void* is 16 byte aligned.
We were taking multiple pointer arguments in the builtin.
gcc accepts a single void*.

The cast from void* to _m128i* caused the IR generation to assume
the pointer was aligned.

Instead make the builtin take a single void*, emit i8* GEPs to
adjust then cast to <2 x i64>* and perform a store with align of 1.
2020-10-04 12:09:35 -07:00
Craig Topper 28595cbbeb [X86] Synchronize the loadiwkey builtin operand order with gcc version. 2020-10-04 12:09:29 -07:00
Craig Topper 6c6cd5f8a9 [X86] Consolidate wide Key Locker intrinsics into the same header as the other Key Locker intrinsics. 2020-10-04 12:09:21 -07:00
Esme-Yi e3475f5b91 [PowerPC] Add builtins for xvtdiv(dp|sp) and xvtsqrt(dp|sp).
Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp.
The instructions correspond to the following builtins:
int vec_test_swdiv(vector double v1, vector double v2);
int vec_test_swdivs(vector float v1, vector float v2);
int vec_test_swsqrt(vector double v1);
int vec_test_swsqrts(vector float v1);
This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC.

Reviewed By: steven.zhang, amyk

Differential Revision: https://reviews.llvm.org/D88278
2020-10-04 16:24:20 +00:00
Xiang1 Zhang 413577a879 [X86] Support Intel Key Locker
Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access
to the raw key value by converting AES keys into “handles”. These handles can be used to perform the
same encryption and decryption operations as the original AES keys, but they only work on the current
system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot),
then any previous handles can no longer be used.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D88398
2020-09-30 18:08:45 +08:00
Yaxun (Sam) Liu d04775e16b Add remquo, frexp and modf overload functions to HIP header 2020-09-29 20:57:56 -04:00
Artem Belevich 30514f0afa [CUDA] Added conversion functions to builtin vars.
This is needed to compile some headers in CUDA-11 that assume that threadIdx is
implicitly convertible to dim3. With NVCC, threadIdx is uint3 and there's
dim3(uint3) constructor. Clang uses a special type for the builtin variables, so
that path does not work. Instead, this patch adds conversion function to the
builtin variable classes. that will allow them to be converted to dim3 and uint3.

Differential Revision: https://reviews.llvm.org/D88250
2020-09-24 14:33:04 -07:00
Amy Kwan 6b136b19cb [Power10] Implement custom codegen for the vec_replace_elt and vec_replace_unaligned builtins.
This patch implements custom codegen for the vec_replace_elt and
vec_replace_unaligned builtins.

These builtins map to the @llvm.ppc.altivec.vinsw and @llvm.ppc.altivec.vinsd
intrinsics depending on the arguments. The main motivation for doing custom
codegen for these intrinsics is because there are float and double versions of
the builtin. Normally, the converting the float to an integer would be done via
fptoui in the IR. This is incorrect as fptoui truncates the value and we must
ensure the value is not truncated. Therefore, we provide custom codegen to utilize
bitcast instead as bitcasts do not truncate.

Differential Revision: https://reviews.llvm.org/D83500
2020-09-23 22:55:25 -05:00
Amy Kwan 2e7117f847 [PowerPC] Implement the 128-bit vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins in Clang/LLVM
This patch implements the vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins for vector signed/unsigned __int128.

Differential Revision: https://reviews.llvm.org/D87910
2020-09-23 16:49:40 -04:00
Albion Fung 88cdbeab41 [PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins
This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins.

Differential Revision: https://reviews.llvm.org/D87804
2020-09-23 16:49:40 -04:00
Albion Fung d7eb917a7c [PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins
This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10.

Differential: https://reviews.llvm.org/D87394#inline-815858
2020-09-23 01:18:14 -05:00
Amy Kwan 079757b551 [PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM
This patch implements the vector string isolate (predicate and non-predicate
versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG.

Differential Revision: https://reviews.llvm.org/D87671
2020-09-22 11:31:44 -05:00
Amy Kwan b3147058de [PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM
This patch implements the 128-bit vector divide extended builtins in Clang/LLVM.
These builtins map to the vdivesq and vdiveuq instructions respectively.

Differential Revision: https://reviews.llvm.org/D87729
2020-09-22 11:31:44 -05:00
Amy Kwan 37e7673c21 [PowerPC] Implement Move to VSR Mask builtins in LLVM/Clang
This patch implements the vec_gen[b|h|w|d|q]m function prototypes in altivec.h
in order to utilize the move to VSR with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82725
2020-09-18 18:16:14 -05:00
Amy Kwan 2c3bc918db [PowerPC] Implement Vector Count Mask Bits builtins in LLVM/Clang
This patch implements the vec_cntm function prototypes in altivec.h in order to
utilize the vector count mask bits instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82726
2020-09-17 18:20:53 -05:00
Raul Tambre e09107ab80 [Sema] Introduce BuiltinAttr, per-declaration builtin-ness
Instead of relying on whether a certain identifier is a builtin, introduce BuiltinAttr to specify a declaration as having builtin semantics.

This fixes incompatible redeclarations of builtins, as reverting the identifier as being builtin due to one incompatible redeclaration would have broken rest of the builtin calls.
Mostly-compatible redeclarations of builtins also no longer have builtin semantics. They don't call the builtin nor inherit their attributes.
A long-standing FIXME regarding builtins inside a namespace enclosed in extern "C" not being recognized is also addressed.

Due to the more correct handling attributes for builtin functions are added in more places, resulting in more useful warnings.
Tests are updated to reflect that.

Intrinsics without an inline definition in intrin.h had `inline` and `static` removed as they had no effect and caused them to no longer be recognized as builtins otherwise.

A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name.
The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc.
It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable.

Fixes PR45410.

Reviewed By: rsmith, yutsumi

Differential Revision: https://reviews.llvm.org/D77491
2020-09-17 19:28:57 +03:00
Johannes Doerfert 56069b5c71 [OpenMP] Support `std::complex` math functions in target regions
The last (big) missing piece to get "math" working in OpenMP target
regions (that I know of) was complex math functions, e.g.,
`std::sin(std::complex<double>)`. With this patch we overload the system
template functions for these operations with versions that have been
distilled from `libcxx/include/complex`. We use the same
  `omp begin/end declare variant`
mechanism we use for other math functions before, except that we this
time overload templates (via D85735).

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85777
2020-09-16 13:37:10 -05:00
Johannes Doerfert 5c1084e884 [OpenMP] Context selector extensions for template functions
With this extension the effects of `omp begin declare variant` will be
applied to template function declarations. The behavior is opt-in and
controlled by the `extension(allow_templates)` trait. While generally
useful, this will enable us to implement complex math function calls by
overloading the templates of the standard library with the ones in
libc++.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85735
2020-09-16 13:37:10 -05:00
Johannes Doerfert 97652202d1 [OpenMP] Overload `std::isnan` and friends multiple times for the GPU
`std::isnan` and friends can be found in two variants in the wild, one
returns `bool`, as the standard defines it, one returns `int`, as the C
macros do. So far we kinda hoped the system versions of these functions
will work for people, e.g. they are definitions that can be compiled for
the target. We know that is not the case always so we leverage the
`disable_implicit_base` OpenMP context extension to specialize both
versions of these functions without causing an invalid redeclaration.

Reviewed By: JonChesterfield, tra

Differential Revision: https://reviews.llvm.org/D85879
2020-09-16 13:37:09 -05:00
Albion Fung 05aa997d51 [PowerPC] Implement __int128 vector divide operations
This patch implements __int128 vector divide operations for ISA3.1.

Differential Revision: https://reviews.llvm.org/D85453
2020-09-15 15:19:35 -04:00
Amy Kwan efa57f9a7a [PowerPC] Implement Vector Expand Mask builtins in LLVM/Clang
This patch implements the vec_expandm function prototypes in altivec.h in order
to utilize the vector expand with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82727
2020-09-06 17:13:21 -05:00
Nemanja Ivanovic 2d652949be [PowerPC] Provide vec_cmpne on pre-Power9 architectures in altivec.h
These overloads are listed in appendix A of the ELFv2 ABI specification
without a requirement for ISA 3.0. So these need to be available on
all Altivec-capable architectures. The implementation in altivec.h
erroneously had them guarded for Power9 due to the availability of
the VCMPNE[BHW] instructions. However these need to be implemented
in terms of the VCMPEQ[BHW] instructions on older architectures.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47423
2020-09-04 21:48:38 -04:00
Nemanja Ivanovic 54205f0bd2 [PowerPC] Allow const pointers for load builtins in altivec.h
The load builtins in altivec.h do not have const in the signature
for the pointer parameter. This prevents using them for loading
from constant pointers. A notable case for such a use is Eigen.

This patch simply adds the missing const.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47408
2020-09-04 13:56:39 -04:00
Albion Fung 5d1fe3f903 [PowerPC] Implemented Vector Multiply Builtins
This patch implements the builtins for Vector Multiply Builtins (vmulxxd family of instructions), and adds the appropriate test cases for these builtins. The builtins utilize the vector multiply instructions itnroduced with ISA 3.1.

Differential Revision: 	https://reviews.llvm.org/D83955
2020-09-02 14:16:21 -05:00
Richard Smith 0e00a95b4f Add new warning for compound punctuation tokens that are split across macro expansions or split by whitespace.
For example:

    #define FOO(x) (x)
    FOO({});

... forms a statement-expression after macro expansion. This warning
applies to '({' and '})' delimiting statement-expressions, '[[' and ']]'
delimiting attributes, and '::*' introducing a pointer-to-member.

The warning for forming these compound tokens across macro expansions
(or across files!) is enabled by default; the warning for whitespace
within the tokens is not, but is included in -Wall.

Differential Revision: https://reviews.llvm.org/D86751
2020-08-28 13:35:50 -07:00
Albion Fung 331dcc43ea [PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins
This patch implements the builtins for Vector Load with Zero and Signed Extend Builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions itnroduced with ISA 3.1.

Differential Revision: 	https://reviews.llvm.org/D82502#inline-797941
2020-08-28 11:28:58 -05:00
Amy Kwan 76b0f99ea8 [PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang
This patch implements the function prototypes vec_mulh and vec_dive in order to
utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended
(vdive[s|u][w|d]) instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82609
2020-08-26 23:14:34 -05:00
Simon Pilgrim a1dc3d241b [X86] Enable constexpr on ROTL/ROTR intrinsics (PR31446)
This enables constexpr rotate intrinsics defined in ia32intrin.h, including the MS specific builtins.
2020-08-23 16:11:58 +01:00
Simon Pilgrim f8e0e5db48 [X86] Enable constexpr on _cast fp<-> uint intrinsics (PR31446)
As suggested by @rsmith on PR47267, by replacing the builtin_memcpy bitcast pattern with builtin_bit_cast we can use _castf32_u32, _castu32_f32, _castf64_u64 and _castu64_f64 inside constant expresssions (constexpr). Although __builtin_bit_cast was added for c++20 it works on all clang c/c++ modes.

Differential Revision: https://reviews.llvm.org/D86398
2020-08-23 10:27:46 +01:00
Simon Pilgrim 42b993d97d [X86] ia32intrin.h - pull out common attributes used in cast helpers into define. NFCI. 2020-08-22 15:25:15 +01:00