RVV intrinsics has new overloading rule, please see
82aac7dad4
Changed:
1. Rename `generic` to `overloaded` because the new rule is not using C11 generic.
2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations
support overloading api.
3. Add more overloaded tests due to overloading rule changed.
Differential Revision: https://reviews.llvm.org/D99189
Demonstrate how to generate vadd/vfadd intrinsic functions
1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.
riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D95016
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
Separate __clang_hip_math.h header into __clang_hip_cmath.h
and __clang_hip_math.h. Improve the math function definition,
and add missing definitions or declarations. Add missing
overloads.
Reviewed By: tra, JonChesterfield
Differential Review: https://reviews.llvm.org/D88837
Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access
to the raw key value by converting AES keys into “handles”. These handles can be used to perform the
same encryption and decryption operations as the original AES keys, but they only work on the current
system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot),
then any previous handles can no longer be used.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D88398
The last (big) missing piece to get "math" working in OpenMP target
regions (that I know of) was complex math functions, e.g.,
`std::sin(std::complex<double>)`. With this patch we overload the system
template functions for these operations with versions that have been
distilled from `libcxx/include/complex`. We use the same
`omp begin/end declare variant`
mechanism we use for other math functions before, except that we this
time overload templates (via D85735).
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D85777
This simply follows the scheme we have for other wrappers. It resolves
the current link problem, e.g., `__muldc3 not found`, when std::complex
operations are used on a device.
This will not allow complex make math function calls to work properly,
e.g., sin, but that is more complex (pan intended) anyway.
Reviewed By: tra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D80897
To support std::complex and some other standard C/C++ functions in HIP device code,
they need to be forced to be __host__ __device__ functions by pragmas. This is done
by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang.
For these standard C++ wapper headers to work properly, specific include path order
has to be enforced:
clang C++ wrapper include path
standard C++ include path
clang include path
Also, these C++ wrapper headers require device version of some standard C/C++ functions
must be declared before including them. This needs to be done by including a default
header which declares or defines these device functions. The default header is always
included before any other headers are included by users.
This patch adds the the default header and include path for HIP.
Differential Revision: https://reviews.llvm.org/D81176
Summary:
Add x86 feature with IBT and/or SHSTK bits to ELF program property if they are enabled. Otherwise, contents in this header file are unused.
This file is mainly design for assembly source code which want to enable CET
Reviewers: hjl.tools, annita.zhang, LuoYuanke, craig.topper, tstellar, pengfei, rsmith
Reviewed By: LuoYuanke
Subscribers: cfe-commits, mgorny
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79617
Summary:
Add x86 feature with IBT and/or SHSTK bits to ELF program property if they are enabled. Otherwise, contents in this header file are unused.
This file is mainly design for assembly source code which want to enable CET
Reviewers: hjl.tools, annita.zhang, LuoYuanke, craig.topper, tstellar, pengfei, rsmith
Reviewed By: LuoYuanke
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D79617
For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions,
we include the appropriate definitions inside of an `omp begin/end declare variant match(device={arch(nvptx)})` scope.
This way, the vendor specific math functions will become specialized versions of the system math functions.
When a system math function is called and specialized version is available the selection logic introduced in D75779
instead call the specialized version. In contrast to the code path we used so far, the system header is actually included.
This means functions without specialized versions are available and so are macro definitions.
This should address PR42061, PR42798, and PR42799.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D75788
This is not supported to change anything but allow us to reuse the math
functions separately from the device functions, e.g., source them at
different times. This will be used by the OpenMP overlay.
This also adds two `return` keywords that were missing.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D77238
Summary:
As the WebAssembly SIMD proposal nears stabilization, there is desire
to use it with toolchains other than Emscripten. Moving the intrinsics
header to clang will make it available to WASI toolchains as well.
Reviewers: aheejin, sunfish
Subscribers: dschuff, mgorny, sbc100, jgravelle-google, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76959
This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.
This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:
svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {
return svld1_s8(pg, base);
}
Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75298
Summary:
This patch generalizes the existing code to support CDE intrinsics
which will share some properties with existing MVE intrinsics
(some of the intrinsics will be polymorphic and accept/return values
of MVE vector types).
Specifically the patch:
* Adds new tablegen backends -gen-arm-cde-builtin-def,
-gen-arm-cde-builtin-codegen, -gen-arm-cde-builtin-sema,
-gen-arm-cde-builtin-aliases, -gen-arm-cde-builtin-header based on
existing MVE backends.
* Renames the '__clang_arm_mve_alias' attribute into
'__clang_arm_builtin_alias' (it will be used with CDE intrinsics as
well as MVE intrinsics)
* Implements semantic checks for the coprocessor argument of the CDE
intrinsics as well as the existing coprocessor intrinsics.
* Adds one CDE intrinsic __arm_cx1 to test the above changes
Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen
Reviewed By: simon_tatham
Subscribers: sdesmalen, mgorny, kristof.beyls, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75850
Summary:
To use new/delete in NVPTX code we need to define them. Implementation
copied from CUDA wrappers.
Reviewers: hfinkel, jdoerfert
Subscribers: mgorny, guansong, kkwli0, caomhin, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73128
This commit sets up the infrastructure for auto-generating <arm_mve.h>
and doing clang-side code generation for the builtins it relies on,
and demonstrates that it works by implementing a representative sample
of the ACLE intrinsics, more or less matching the ones introduced in
LLVM IR by D67158,D68699,D68700.
Like NEON, that header file will provide a set of vector types like
uint16x8_t and C functions with names like vaddq_u32(). Unlike NEON,
the ACLE spec for <arm_mve.h> includes a polymorphism system, so that
you can write plain vaddq() and disambiguate by the vector types you
pass to it.
Unlike the corresponding NEON code, I've arranged to make every user-
facing ACLE intrinsic into a clang builtin, and implement all the code
generation inside clang. So <arm_mve.h> itself contains nothing but
typedefs and function declarations, with the latter all using the new
`__attribute__((__clang_builtin))` system to arrange that the user-
facing function names correspond to the right internal BuiltinIDs.
So the new MveEmitter tablegen system specifies the full sequence of
IRBuilder operations that each user-facing ACLE intrinsic should
translate into. Where possible, the ACLE intrinsics map to standard IR
operations such as vector-typed `add` and `fadd`; where no standard
representation exists, I call down to the sample IR intrinsics
introduced in an earlier commit.
Doing it like this means that you get the polymorphism for free just
by using __attribute__((overloadable)): the clang overload resolution
decides which function declaration is the relevant one, and _then_ its
BuiltinID is looked up, so by the time we're doing code generation,
that's all been resolved by the standard system. It also means that
you get really nice error messages if the user passes the wrong
combination of types: clang will show the declarations from the header
file and explain why each one doesn't match.
(The obvious alternative approach would be to have wrapper functions
in <arm_mve.h> which pass their arguments to the underlying builtins.
But that doesn't work in the case where one of the arguments has to be
a constant integer: the wrapper function can't pass the constantness
through. So you'd have to do that case using a macro instead, and then
use C11 `_Generic` to handle the polymorphism. Then you have to add
horrible workarounds because `_Generic` requires even the untaken
branches to type-check successfully, and //then// if the user gets the
types wrong, the error message is totally unreadable!)
Reviewers: dmgreen, miyuki, ostannard
Subscribers: mgorny, javed.absar, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67161
Port existing headers which include x86 intrinsics implementation to
PowerPC platform (using Altivec), along with tests. Also, tests about
including these intrinsic headers are combined.
The headers are mainly developed by Steven Munroe, with contributions
from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.
Reviewed By: Jinsong Ji
Differential Revision: https://reviews.llvm.org/D65630
llvm-svn: 368392
Using the -fdeclare-opencl-builtins option will require a way to
predefine types and macros such as `int4`, `CLK_GLOBAL_MEM_FENCE`,
etc. Move these out of opencl-c.h into opencl-c-base.h such that the
latter can be shared by -fdeclare-opencl-builtins and
-finclude-default-header.
This changes the behaviour of -finclude-default-header when
-fdeclare-opencl-builtins is specified: instead of including the full
header, it will include the header with only the base definitions.
Differential revision: https://reviews.llvm.org/D63256
llvm-svn: 363794
Port emmintrin.h which include Intel SSE2 intrinsics implementation to PowerPC platform (using Altivec).
The new headers containing those implemenations are located into a directory named ppc_wrappers
which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe,
with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.
It's a follow-up patch of D62121.
Patched by: Qiu Chaofan <qiucf@cn.ibm.com>
Differential Revision: https://reviews.llvm.org/D62569
llvm-svn: 363122
Port xmmintrin.h which include Intel SSE intrinsics implementation to PowerPC platform (using Altivec).
The new headers containing those implemenations are located into a directory named ppc_wrappers
which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe,
with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.
Patched by: Qiu Chaofan <qiucf@cn.ibm.com>
Reviewed By: Jinsong Ji
Differential Revision: https://reviews.llvm.org/D62121
llvm-svn: 362190
Port xmmintrin.h which include Intel SSE intrinsics implementation to PowerPC platform (using Altivec).
The new headers containing those implemenations are located into a directory named ppc_wrappers
which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe,
with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.
Patched by: Qiu Chaofan <qiucf@cn.ibm.com>
Reviewed By: Jinsong Ji
Differential Revision: https://reviews.llvm.org/D62121
llvm-svn: 361928
Summary: This patches fixes an issue in which the __clang_cuda_cmath.h header is being included even when cmath or math.h headers are not included.
Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra
Reviewed By: tra
Subscribers: tra, mgorny, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61765
llvm-svn: 360626
Summary:
In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.
We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360265
This commit appears to be breaking stage-2 builds on GreenDragon. The
OpenMP wrappers for cmath and math.h are copied into the root of the
resource directory and cause a cyclic dependency in module 'Darwin':
Darwin -> std -> Darwin. This blows up when CMake is testing for modules
support and breaks all stage 2 module builds, including the ThinLTO bot
and all LLDB bots.
CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message):
LLVM_ENABLE_MODULES is not supported by this compiler
llvm-svn: 360192
Summary:
In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.
We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360063
Summary:
1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.
For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
Patch by LiuTianle
Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60552
llvm-svn: 360018
Summary:
This is a follow up to r355253 and a better fix than the first attempt
which was r359257.
We can't install anything from ${CMAKE_CFG_INTDIR}, because this value
is only defined at build time, but we still must make sure to copy the
headers into ${CMAKE_CFG_INTDIR}/lib/clang/$VERSION/include, because the lit
tests look for headers there. So for this fix we revert to the
old behavior of copying the headers to ${CMAKE_CFG_INTDIR}/lib/clang/$VERSION/include
during the build and then installing them from the source tree.
Reviewers: smeenai, vzakhari, phosek
Reviewed By: smeenai, vzakhari
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61220
llvm-svn: 359654
Summary:
This is a follow up to r355253, which inadvertently broke Visual
Studio builds by trying to copy files from CMAKE_CFG_INTDIR.
See https://reviews.llvm.org/D58537#inline-532492
Reviewers: smeenai, vzakhari, phosek
Reviewed By: smeenai
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61054
llvm-svn: 359257
Port mmintrin.h which include x86 MMX intrinsics implementation to PowerPC platform (using Altivec).
To make the include process correct, PowerPC's toolchain class is overrided to insert new headers directory (named ppc_wrappers) into the path. Basic test cases for several intrinsic functions are added.
The header is mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu.
Reviewed By: Jinsong Ji
Differential Revision: https://reviews.llvm.org/D59924
llvm-svn: 358949
Summary:
The current install-clang-headers target installs clang's resource
directory headers. This is different from the install-llvm-headers
target, which installs LLVM's API headers. We want to introduce the
corresponding target to clang, and the natural name for that new target
would be install-clang-headers. Rename the existing target to
install-clang-resource-headers to free up the install-clang-headers name
for the new target, following the discussion on cfe-dev [1].
I didn't find any bots on zorg referencing install-clang-headers. I'll
send out another PSA to cfe-dev to accompany this rename.
[1] http://lists.llvm.org/pipermail/cfe-dev/2019-February/061365.html
Reviewers: beanz, phosek, tstellar, rnk, dim, serge-sans-paille
Subscribers: mgorny, javed.absar, jdoerfert, #sanitizers, openmp-commits, lldb-commits, cfe-commits, llvm-commits
Tags: #clang, #sanitizers, #lldb, #openmp, #llvm
Differential Revision: https://reviews.llvm.org/D58791
llvm-svn: 355340
Summary:
Replace cut and pasted code with cmake macros and reduce the number of
install commands. This fixes an issue where the headers were being
installed twice.
This clean up should also make future modifications easier, like
adding a cmake option to install header files into a custom resource
directory.
Reviewers: chandlerc, smeenai, mgorny, beanz, phosek
Reviewed By: smeenai
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58537
llvm-svn: 355253
r344555 switched LLVM to guarding install targets with LLVM_ENABLE_IDE
instead of CMAKE_CONFIGURATION_TYPES, which expresses the intent more
directly and can be overridden by a user. Make the corresponding change
in clang. LLVM_ENABLE_IDE is computed by HandleLLVMOptions, so it should
be available for both standalone and integrated builds.
Differential Revision: https://reviews.llvm.org/D58284
llvm-svn: 354525