Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D112055
I recently renamed some tests from vfredsum to vfredusum without
noticing they tested both ordered and unordered reductions. This
patch renames them back.
I've also merged test files for vfwredosum/vfwredusum into a single
file for consistency with the other 3 floating point reduction test
files. The inconsistency is what caused my original confusion.
This patch implements __builtin_reduce_max and __builtin_reduce_min as
specified in D111529.
The order of operations does not matter for min or max reductions and
they can be directly lowered to the corresponding
llvm.vector.reduce.{fmin,fmax,umin,umax,smin,smax} intrinsic calls.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D112001
Basically, inline builtin definition are shadowed by externally visible
redefinition. This matches GCC behavior.
The implementation has to workaround the fact that:
1. inline builtin are renamed at callsite during codegen, but
2. they may be shadowed by a later external definition
As a consequence, during codegen, we need to walk redecls and eventually rewrite
some call sites, which is totally inelegant.
Differential Revision: https://reviews.llvm.org/D112059
When D105690 changed the mnemonic from vf(w)redsum to vf(w)redusum,
several tests were deleted instead of being renamed.
This commit also consistently renames the other tests that weren't
deleted.
Verify that the resolver exists, that it is a defined
Function, and that its return type matches the ifunc's
type. Add corresponding check to BitcodeReader, change
clang to emit the correct type, and fix tests to comply.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D112349
This change fixes a crash where a NULL fd was used to emit a diagnostic.
Instead of crashing, just avoid printing the declaration name when there's no
associated function declaration.
Differential Revision: https://reviews.llvm.org/D109402
Add i32x4.relaxed_trunc_f32x4_s, i32x4.relaxed_trunc_f32x4_u,
i32x4.relaxed_trunc_f64x2_s_zero, i32x4.relaxed_trunc_f64x2_u_zero.
These are only exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112186
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.
Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.
This reverts commit f3190dedee.
This reverts commit 749581d21f.
This reverts commit f3df87d57e.
This reverts commit ab1dbcecd6.
This patch implements __builtin_elementwise_abs as specified in
D111529.
Reviewed By: aaron.ballman, scanon
Differential Revision: https://reviews.llvm.org/D111986
There's precedent for that in `CreateOr()`/`CreateAnd()`.
The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.
Refs. https://reviews.llvm.org/D109368#3089809
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.
There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.
Refs. https://reviews.llvm.org/D109368#3089809
This patch implements __builtin_elementwise_max and
__builtin_elementwise_min, as specified in D111529.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D111985
All but 2 of the vector builtins are only used by clang_builtin_alias.
When using clang_builtin_alias, the type string of the builtin is never
checked. Only the types in the function definition used for the alias
are checked.
This patch takes advantage of this to share a single builtin for
many different types. We already used type overloads on the IR intrinsic
so the codegen for the builtins that are being merge were already
the same. This extends the type overloading to the builtins.
I had to make a few tweaks to make this work.
-Floating point vector-vector vmerge now uses the vmerge intrinsic
instead of the vfmerge intrinsic. New isel patterns and tests are
added to support this.
-The SemaChecking for the immediate of vset_v/vget_v has been removed.
Determining the valid range is harder now. I've added masking to
ManualCodegen to ensure valid IR for invalid input.
This reduces the number of builtins from ~25000 to ~1100.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D112102
This patch splits the existing SveVectorBits LangOpt into VScaleMin and
VScaleMax LangOpts such that we can represent such an option. The cc1
option has also been split into -mvscale-{min,max}=<n> options so that the
cc1 arguments better reflect the vscale_range IR attribute.
Differential Revision: https://reviews.llvm.org/D111790
Zvamo is not part of the 1.0 V spec. Remove the intrinsics
for now. This helps reduce clang binary size and lit test time.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D111692
Based on post-commit review discussion on
2bd8493847 with Richard Smith.
Other uses of forcing HasEmptyPlaceHolder to false seem OK to me -
they're all around pointer/reference types where the pointer/reference
token will appear at the rightmost side of the left side of the type
name, so they make nested types (eg: the "int" in "int *") behave as
though there is a non-empty placeholder (because the "*" is essentially
the placeholder as far as the "int" is concerned).
This was originally committed in 277623f4d5
Reverted in f9ad1d1c77 due to breakages
outside of clang - lldb seems to have some strange/strong dependence on
"char [N]" versus "char[N]" when printing strings (not due to that name
appearing in DWARF, but probably due to using clang to stringify type
names) that'll need to be addressed, plus a few other odds and ends in
other subprojects (clang-tools-extra, compiler-rt, etc).
This reverts commit 121b2252de.
The following code causes a crash in some circumstances:
struct k {
~k() __attribute__((annotate(""))) {}
};
void m() { k(); }
Add relaxed. f32x4.min, f32x4.max, f64x2.min, f64x2.max. These are only
exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112146
This change implements new DAG nodes TABLE_GET/TABLE_SET, and lowering
methods for load and stores of reference types from IR arrays. These
global LLVM IR arrays represent tables at the Wasm level.
Differential Revision: https://reviews.llvm.org/D111154
Add i8x16 relaxed_swizzle instructions. These are only
exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112022
Now that the legacy PM is deprecated for the optimization pipeline, we
can start deleting legacy PM tests.
For tests that test both PMs, merge the RUN lines.
Delete tests specific to the legacy PM.
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
This patch remove the override in AIX target,
so the int128 is enabled in 64 bit mode or with ForceEnableInt128.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D111078
This was committed as ec6c847179, but then reverted after a failure
in: https://lab.llvm.org/buildbot/#/builders/84/builds/13983
I was not able to reproduce the problem, but I added an extra check
for a NULL QualType just in case.
Original comit message:
The patch adds missing diagnostics for cases like:
float F3 = ((__float128)F1 * (__float128)F2) / 2.0f;
Sema::checkDeviceDecl (renamed to checkTypeSupport) is changed to work
with a type without the corresponding ValueDecl. It is also refactored
so that host diagnostics for unsupported types can be added here as
well.
Differential Revision: https://reviews.llvm.org/D109315
Looks like lldb has some issues with this - somehow it causes lldb to
treat a "char[N]" type as an array of chars (prints them out
individually) but a "char [N]" is printed as a string. (even though the
DWARF doesn't have this string in it - it's something to do with the
string lldb generates for itself using clang)
This reverts commit 277623f4d5.
Based on post-commit review discussion on
2bd8493847 with Richard Smith.
Other uses of forcing HasEmptyPlaceHolder to false seem OK to me -
they're all around pointer/reference types where the pointer/reference
token will appear at the rightmost side of the left side of the type
name, so they make nested types (eg: the "int" in "int *") behave as
though there is a non-empty placeholder (because the "*" is essentially
the placeholder as far as the "int" is concerned).
The builtin __rlwnm is currently constrained to accept only constants
for the shift parameter but the instructions emitted for it have no such
constraint, this patch allows the builtins to accept variable shift.
Reviewed By: NeHuang, amyk
Differential Revision: https://reviews.llvm.org/D111229
This reverts commit 97f0c63783.
As discussed in https://reviews.llvm.org/D110684, it increased the
compile time and the binary size of clang more than 1%. I reverted
this patch first to think about a better way to do it.
GCC 9.1 removed Intel MPX support. Linux kernel removed MPX in 2019.
glibc 2.35 will remove MPX.
Our support is limited: we support assembling of bndmov but not bnd.
Just remove it.
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D111517
Rename vfredsum and vfwredsum to vfredusum and vfwredusum. Add aliases for vfredsum and vfwredsum.
Reviewed By: luismarques, HsiangKai, khchen, frasercrmck, kito-cheng, craig.topper
Differential Revision: https://reviews.llvm.org/D105690
Current btf_tag is applied to declaration only.
Per discussion in https://reviews.llvm.org/D111199,
we plan to introduce btf_type_tag attribute for types.
So rename btf_tag to btf_decl_tag to make it easily
differentiable from btf_type_tag.
Differential Revision: https://reviews.llvm.org/D111588
In the original design, we levarage _mt intrinsics to define macros for
_m intrinsics. Such as,
```
__builtin_rvv_vadd_vv_i8m1_mt((vbool8_t)(op0), (vint8m1_t)(op1), (vint8m1_t)(op2), (vint8m1_t)(op3), (size_t)(op4), (size_t)VE_TAIL_AGNOSTIC)
```
However, we could not define generic interface for mask intrinsics any
more due to clang_builtin_alias only accepts clang builtins as its
argument.
In the example,
```
__rvv_overloaded
__attribute__((clang_builtin_alias(__builtin_rvv_vadd_vv_i8m1_mt)))
vint8m1_t vadd(vbool8_t op0, vint8m1_t op1, vint8m1_t op2, vint8m1_t
op3, size_t op4, size_t op5);
```
op5 is the tail policy argument. When users want to use vadd generic
interface for masked vector add, they need to specify tail policy in the
previous design. In this patch, we define _m intrinsics as clang
builtins to solve the problem.
Differential Revision: https://reviews.llvm.org/D110684
When AnnotateAttr is on a function, AddGlobalAnnotations is only called
in CodeGenModule::EmitGlobalFunctionDefinition which means AnnotateAttr
on function declaration without function body will be ignored.
The patch will move AddGlobalAnnotations to
CodeGenModule::SetFunctionAttributes, so with or without function body,
the AnnotateAttr will get code gen for a function.
It'll help case when AnnotateAttr is on external function, and the
AnnotateAttr will be consumed in IR level.
For example, a pass to collect num of uses for functions with
__attribute((annotate("count_use"))) after optimizations,
As long as there's __attribute((annotate("count_use"))), function with
or without function body should be counted.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D111109
Patch by: python3kgae (Xiang Li)
The SSE4 header (smmintrin.h) should include SSSE3 (tmmintrin.h) instead
of SSE2 (emmintrin.h).
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D111482
fae0dfa implemented the new __ibm128 type, this patch enables its
complex form.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109948