This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.
Several small changes are included due to the knock-on effect this has:
- The AArch32 driver has been modified to ensure sha2/aes is correctly
set based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not
enabled by default.
- ACLE feature macros have been updated with the fine grained crypto
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D99079
Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli
Reviewed By: SjoerdMeijer
Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76077
This is the result of an audit of all of the ABIs in clang to implement
and enable the type for those targets.
Additionally, this finds an issue with integer-promotion passing for a
few platforms when using _ExtInt of < int, so this also corrects that
resulting in signext/zeroext being on a params of those types in some
platforms.
Differential Revisions: https://reviews.llvm.org/D79118
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch32
- Intrinsics Support for AArch32 Neon Intrinsics for Matrix
Multiplication
Note: these extensions are optional in the 8.6a architecture and so have
to be enabled by default
No additional IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, miyuki
Reviewed By: miyuki
Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77872
Summary:
This patch implements feature test macros for the CDE extension
according to the upcoming ACLE specification.
The following 2 macros are being added:
- __ARM_FEATURE_CDE - defined as '1' when any coprocessor is
configured as a CDE coprocessor
- __ARM_FEATURE_CDE_COPROC - defined as an 8-bit mask, each bit of the
mask corresponds to a coprocessor and is set when the corresponding
coprocessor is configured as CDE (and cleared otherwise).
The patch also exposes the value of __ARM_FEATURE_CDE_COPROC in the
target-independent method TargetInfo::getARMCDECorpocMask, the method
will be used in follow-up patches implementing semantic checks of CDE
intrinsics (we want to diagnose the cases when CDE intrinsics are used
with coprocessors that are not configured as CDE).
Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM
Reviewed By: simon_tatham
Subscribers: kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75843
Summary:
Add support for vcadd_* family of intrinsics. This set of intrinsics is
available in Armv8.3-A.
The fp16 versions require the FP16 extension, which has been available
(opt-in) since Armv8.2-A.
Reviewers: t.p.northover
Reviewed By: t.p.northover
Subscribers: t.p.northover, kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70862
Provides support for using r6-r11 as globally scoped
register variables. This requires a -ffixed-rN flag
in order to reserve rN against general allocation.
If for a given GRV declaration the corresponding flag
is not found, or the the register in question is the
target's FP, we fail with a diagnostic.
Differential Revision: https://reviews.llvm.org/D68862
ARM has a special target feature called soft-float-abi. This feature is
special, since we get it passed to us explicitly in the frontend, but
filter it out before it can land in any target feature strings in LLVM
IR.
__attribute__((target(""))) doesn't quite filter these features out
properly, so today, we get warnings about soft-float-abi being an
unknown feature from the backend.
This CL has us filter soft-float-abi out at a slightly different point,
so we don't end up passing these invalid features to the backend.
Differential Revision: https://reviews.llvm.org/D61750
llvm-svn: 363346
If MVE is present at all, then the macro __ARM_FEATURE_MVE is defined
to a value which has bit 0 set for integer MVE, and bit 1 set for
floating-point MVE.
(Floating-point MVE implies integer MVE, so if this macro is defined
at all then it will be set to 1 or 3, never 2.)
Patch mostly by Simon Tatham
Differential Revision: https://reviews.llvm.org/D60710
llvm-svn: 362806
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
The getConstraintRegister method is used by semantic checking of
inline assembly statements in order to diagnose conflicts between
clobber list and input/output lists. Currently ARM and AArch64 don't
override getConstraintRegister, so conflicts between registers
assigned to variables in asm labels and clobber lists are not
diagnosed. Such conflicts can cause assertion failures in the back end
and even miscompilations.
This patch implements getConstraintRegister for ARM and AArch64
targets. Since these targets don't have single-register constraints,
the implementation is trivial and just returns the register specified
in an asm label (if any).
Reviewers: eli.friedman, javed.absar, thopre
Reviewed By: thopre
Subscribers: rengolin, eraman, rogfer01, myatsina, kristof.beyls, cfe-commits, chrib
Differential Revision: https://reviews.llvm.org/D45965
llvm-svn: 331164
This adds a pre-defined macro to test if the compiler has support for the
v8.2-A dot rpoduct intrinsics in AArch32 mode.
The AAcrh64 equivalent has already been added by rL330229.
The ACLE spec which describes this macro hasn't been published yet, but this is
based on the final internal draft, and GCC has already implemented this.
Differential revision: https://reviews.llvm.org/D46108
llvm-svn: 331038
For generating NEON intrinsics, this determines the NEON data type, and whether
it should be a half type or an i16 type. I.e., we always pass a half type for
AArch64, this hasn't changed, but now also for ARM but only when FullFP16 is
enabled, and i16 otherwise.
This is intended to be non-functional change, but together with the backend
work in D44538 which adds support for f16 vectors, this enables adding the
AArch32 FP16 (vector) intrinsics.
Differential Revision: https://reviews.llvm.org/D44561
llvm-svn: 327836
This is a partial recommit of r327189 that was reverted
due to test issues. I.e., this recommits minimal functional
change, the FP16 feature test macros, and adds tests that
were missing in the original commit.
llvm-svn: 327455
When rejecting a march= or target-cpu command line parameter,
the message is quite lacking. This patch adds a note that prints
all possible values for the current target, if the target supports it.
This adds support for the ARM/AArch64 targets (more to come!).
Differential Revision: https://reviews.llvm.org/D42978
llvm-svn: 324673
This commit fixes a bug in IRGen where it generates completely broken
code for __fp16 vectors on X86. For example when the following code is
compiled:
half4 hv0, hv1, hv2; // these are vectors of __fp16.
void foo221() {
hv0 = hv1 + hv2;
}
clang generates the following IR, in which two i16 vectors are added:
@hv1 = common global <4 x i16> zeroinitializer, align 8
@hv2 = common global <4 x i16> zeroinitializer, align 8
@hv0 = common global <4 x i16> zeroinitializer, align 8
define void @foo221() {
%0 = load <4 x i16>, <4 x i16>* @hv1, align 8
%1 = load <4 x i16>, <4 x i16>* @hv2, align 8
%add = add <4 x i16> %0, %1
store <4 x i16> %add, <4 x i16>* @hv0, align 8
ret void
}
To fix the bug, this commit uses the code committed in r314056, which
modified clang to promote and truncate __fp16 vectors to and from float
vectors in the AST. It also fixes another IRGen bug where a short value
is assigned to an __fp16 variable without any integer-to-floating-point
conversion, as shown in the following example:
__fp16 a;
short b;
void foo1() {
a = b;
}
@b = common global i16 0, align 2
@a = common global i16 0, align 2
define void @foo1() #0 {
%0 = load i16, i16* @b, align 2
store i16 %0, i16* @a, align 2
ret void
}
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D40112
llvm-svn: 320215
Targets.cpp is getting unwieldy, and even minor changes cause the entire thing
to cause recompilation for everyone. This patch bites the bullet and breaks
it up into a number of files.
I tended to keep function definitions in the class declaration unless it
caused additional includes to be necessary. In those cases, I pulled it
over into the .cpp file. Content is copy/paste for the most part,
besides includes/format/etc.
Differential Revision: https://reviews.llvm.org/D35701
llvm-svn: 308791