Neither gcc or icc support this. Split out from D79472. I want
to remove more, but it looks like icc does support some things
gcc doesn't and I need to double check our internal test suites.
This is the result of an audit of all of the ABIs in clang to implement
and enable the type for those targets.
Additionally, this finds an issue with integer-promotion passing for a
few platforms when using _ExtInt of < int, so this also corrects that
resulting in signext/zeroext being on a params of those types in some
platforms.
Differential Revisions: https://reviews.llvm.org/D79118
gcc supports selecting ymm0/zmm0 for the Yz constraint when used with 256 or 512 bit vector types.
Fixes PR45806
Differential Revision: https://reviews.llvm.org/D79448
The reinterpret builtins are generated separately because they
need the cross product of all types, 121 functions in total,
which is inconvenient to specify in the arm_sve.td file.
Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78756
* svdupq builtins that duplicate scalars to every quadword of a vector
are defined using builtins for svld1rq (load and replicate quadword).
* svdupq builtins that duplicate boolean values to fill a predicate vector
are defined using `svcmpne`.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78750
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79224
test cases
Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff
Differential Revision: https://reviews.llvm.org/D72841
This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.
Summary:
Per Windows SEH Spec, except _leave, all other early exits of a _try (goto/return/continue/break) are considered abnormal exits. In those cases, the first parameter passes to its _finally funclet should be TRUE to indicate an abnormal-termination.
One way to implement abnormal exits in _try is to invoke Windows runtime _local_unwind() (MSVC approach) that will invoke _dtor funclet where abnormal-termination flag is always TRUE when calling _finally. Obviously this approach is less optimal and is complicated to implement in Clang.
Clang today has a NormalCleanupDestSlot mechanism to dispatch multiple exits at the end of _try. Since _leave (or try-end fall-through) is always Indexed with 0 in that NormalCleanupDestSlot, this fix takes the advantage of that mechanism and just passes NormalCleanupDest ID as 1st Arg to _finally.
Reviewers: rnk, eli.friedman, JosephTremoulet, asmith, efriedma
Reviewed By: efriedma
Subscribers: efriedma, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77936
I'm currently auditing all of the calling convention implications of
_ExtInt for all platforms, so splitting them up into their own test will
make this a much easier task to organize.
After speaking with Craig Topper about some recent defects, he pointed
out that _ExtInts should be passed indirectly if larger than the largest
int register, and like ints when smaller than that. This patch
implements that.
Note that this changed the way vaargs worked quite a bit, but they still
work.
Differential Revision: https://reviews.llvm.org/D78785
When passing a value of a struct/union type from secure to non-secure
state (that is returning from a CMSE entry function or passing an
argument to CMSE-non-secure call), there is a potential sensitive
information leak via the padding bits in the structure. It is not
possible in the general case to ensure those bits are cleared by using
Standard C/C++.
This patch makes the compiler emit code to clear such padding
bits. Since type information is lost in LLVM IR, the code generation
is done by Clang.
For each interesting record type, we build a bitmask, in which all the
bits, corresponding to user declared members, are set. Values of
record types are returned by coercing them to an integer. After the
coercion, the coerced value is masked (with bitwise AND) and then
returned by the function. In a similar manner, values of record types
are passed as arguments by coercing them to an array of integers, and
the coerced values themselves are masked.
For union types, we effectively clear only bits, which aren't part of
any member, since we don't know which is the currently active one.
The compiler will issue a warning, whenever a union is passed to
non-secure state.
Values of half-precision floating-point types are passed in the least
significant bits of a 32-bit register (GPR or FPR) with the most
significant bits unspecified. Since this is also a potential leak of
sensitive information, this patch also clears those unspecified bits.
Differential Revision: https://reviews.llvm.org/D76369
This patch adds builtins for:
- svmad, svmla, svmls, svmsb
svnmad, svnmla, svnmls, svnmsb
svmla_lane, svmls_lane
These builtins come in several flavours:
- Merge into first source vector (`_m`)
- False lanes are undef (`_x`)
- False lanes are zeroed (`_z`)
And can also have `_n` to indicate the last operand is a scalar.
For example:
svint32_t svmla[_n_s32]_z(svbool_t pg, svint32_t op1, svint32_t op2, int32_t op3)
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78960
The svlen builtins return the number of elements in a vector
and are implemented using `llvm.vscale`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78755
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.
This patch adds the flag IsInsertOp1SVALL to insert SV_ALL as the
second operand.
Reviewers: efriedma, SjoerdMeijer
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78401
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.
This patch adds the flag IsAppendSVALL to append SV_ALL as the final
operand.
Reviewers: SjoerdMeijer, efriedma, rovka, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77597
These tests were placed in the CodeGen directory while they really
should have been placed in the Preprocessor directory.
Differential Revision: https://reviews.llvm.org/D78163
This intrinsic isn't overloaded so we should query with types.
Doing so causes the backend to miss the intrinsic and not codegen it.
This eventually leads to a linker error.
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch32
- Intrinsics Support for AArch32 Neon Intrinsics for Matrix
Multiplication
Note: these extensions are optional in the 8.6a architecture and so have
to be enabled by default
No additional IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, miyuki
Reviewed By: miyuki
Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77872
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)
No IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin
Reviewed By: kmclaughlin
Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77871
The IsReverseCompare flag tells CGBuiltin to swap the operands,
so that a LT/LE intrinsics can be expressed in terms of GE/GT
intrinsics.
This patch also adds builtins for the wide-variants of the compares.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78747
This patch also adds the enum `sv_prfop` for the prefetch operation specifier
and checks to ensure the passed enum values are valid.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78674
This patch changes the codegen of the builtins for contiguous loads
to map onto the SVE specific IR intrinsics llvm.aarch64.sve.ld1/st1.
Reviewers: SjoerdMeijer, efriedma, kmclaughlin, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78673
This patch changes the FP conversion intrinsics to take a predicate
that matches the number of lanes for the vector with the widest element
type as opposed to using <vscale x 16 x i1>.
For example:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f16(<vscale x 4 x float>, <vscale x 4 x i1>, <vscale x 8 x half>)```
now uses <vscale x 4 x i1> instead of <vscale x 16 x i1>
And similar for:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f64(<vscale x 4 x float>, <vscale x 2 x i1>, <vscale x 2 x double>)```
where the predicate now matches the wider type, so <vscale x 2 x i1>.
Reviewers: efriedma, SjoerdMeijer, paulwalker-arm, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78402
This adds the flag IsOverloadCvt which tells CGBulitin to use
the result type and the type of the last operand as the
overloaded types for the LLVM IR intrinsic.
This also adds the flag IsFPConvert, which is needed to avoid
converting the predicate of the operation from svbool_t to
a predicate with fewer lanes, as the LLVM IR intrinsics use
the <vscale x 16 x i1> as the predicate.
Reviewers: SjoerdMeijer, efriedma
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78239
This also adds the IsOverloadWhileRW flag which tells CGBuiltin to use
the result predicate type and the first pointer type as the
overloaded types for the LLVM IR intrinsic.
Reviewers: SjoerdMeijer, efriedma
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78238
This also adds the IsOverloadWhile flag which tells CGBuiltin to use
both the default type (predicate) and the type of the second operand
(scalar) as the overloaded types for the LLMV IR intrinsic.
Reviewers: SjoerdMeijer, efriedma, rovka
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77595