Commit Graph

5856 Commits

Author SHA1 Message Date
Ian Levesque bb3111cbaf [clang][xray] Add xray attributes to functions without decls too
Summary: This allows instrumenting things like global initializers

Reviewers: dberris, MaskRay, smeenai

Subscribers: cfe-commits, johnislarry

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77191
2020-04-01 00:02:39 -04:00
Anna Thomas 58a05675da Revert "[InlineFunction] Handle return attributes on call within inlined body"
This reverts commit 28518d9ae3.
There is a failure in MsgPackReader.cpp when built with clang. It
complains about "signext and zeroext" are incompatible. Investigating
offline if it is infact a UB in the MsgPackReader code.
2020-03-31 16:16:34 -04:00
Anna Thomas 28518d9ae3 [InlineFunction] Handle return attributes on call within inlined body
Consider a callee function that has a call (C) within it which feeds
into the return.  When we inline that callee into a callsite that has
return attributes, we can backward propagate those attributes to the
call (C) within that inlined callee body.

This is safe to do so only if we can guarantee transfer of execution to
successor in the window of instructions between return value (i.e. the
call C) and the return instruction.

See added test cases.

Reviewed-By: reames, jdoerfert

Differential Revision: https://reviews.llvm.org/D76140
2020-03-31 14:35:40 -04:00
Yonghong Song ced0d1f42b [BPF] support 128bit int explicitly in layout spec
Currently, bpf does not specify 128bit alignment in its
layout spec. So for a structure like
  struct ipv6_key_t {
    unsigned pid;
    unsigned __int128 saddr;
    unsigned short lport;
  };
clang will generate IR type
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
Additional padding is to ensure later IR->MIR can generate correct
stack layout with target layout spec.

But it is common practice for a tracing program to be
first compiled with target flag (e.g., x86_64 or aarch64) through
clang to generate IR and then go through llc to generate bpf
byte code. Tracing program often refers to kernel internal
data structures which needs to be compiled with non-bpf target.

But such a compilation model may cause a problem on aarch64.
The bcc issue https://github.com/iovisor/bcc/issues/2827
reported such a problem.

For the above structure, since aarch64 has "i128:128" in its
layout string, the generated IR will have
  %struct.ipv6_key_t = type { i32, i128, i16 }

Since bpf does not have "i128:128" in its spec string,
the selectionDAG assumes alignment 8 for i128 and
computes the stack storage size for the above is 32 bytes,
which leads incorrect code later.

The x86_64 does not have this issue as it does not have
"i128:128" in its layout spec as it does permits i128 to
be alignmented at 8 bytes at stack. Its IR type looks like
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }

The fix here is add i128 support in layout spec, the same as
aarch64. The only downside is we may have less optimal stack
allocation in certain cases since we require 16byte alignment
for i128 instead of 8. But this is probably fine as i128 is
not used widely and in most cases users should already
have proper alignment.

Differential Revision: https://reviews.llvm.org/D76587
2020-03-28 11:46:29 -07:00
Mikhail Maltsev bd722ef63f [ARM,CDE] Improve CDE intrinsics testing
Summary:
This patch:
* adds tests for vreinterpret intinsics in big-endian mode
* adds C++ runs to the CDE+MVE header compatibility test

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76927
2020-03-27 16:05:18 +00:00
Mikael Holmen 7d482e9213 Fix TBAA for unsigned fixed-point types
Summary:
Unsigned types can alias the corresponding signed types. I don't see
that this is explicitly mentioned in the Embedded-C specification, but
I think it should work the same as for the integer types.

Patch by: materi

Reviewers: ebevhan, leonardchan

Reviewed By: leonardchan

Subscribers: kosarev, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76856
2020-03-27 10:35:24 +01:00
Sid Manning b0da094983 [Hexagon] Add support for Linux/Musl ABI (part 2)
A continuation of https://reviews.llvm.org/D72701.  This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638
2020-03-26 17:19:46 -05:00
Sam McCall 13dc21e841 [AST] Make thinlto testcase robust to 159a9f7e76
Ultimately it relies on the output of __PRETTY_FUNCTION__ which isn't reliable.
2020-03-26 12:47:39 +01:00
Sam McCall 38798d0306 Revert "[AST] Fix thinlto testcase missed in 159a9f7e76307734bcdcae3357640e42e0733194"
This reverts commit 4bd1d55884.
Cure is worse than the disease: "> >" is still expected in most configs.
Working on fixing the fuchsia builder.
2020-03-26 12:38:33 +01:00
Sam McCall 4bd1d55884 [AST] Fix thinlto testcase missed in 159a9f7e76 2020-03-26 10:28:54 +01:00
Mikhail Maltsev bb4da94e5b [ARM,CDE] Implement predicated Q-register CDE intrinsics
Summary:
This patch implements the following CDE intrinsics:

  T __arm_vcx1q_m(int coproc, T inactive, uint32_t imm, mve_pred_t p);
  T __arm_vcx2q_m(int coproc, T inactive, U n, uint32_t imm, mve_pred_t p);
  T __arm_vcx3q_m(int coproc, T inactive, U n, V m, uint32_t imm, mve_pred_t p);

  T __arm_vcx1qa_m(int coproc, T acc, uint32_t imm, mve_pred_t p);
  T __arm_vcx2qa_m(int coproc, T acc, U n, uint32_t imm, mve_pred_t p);
  T __arm_vcx3qa_m(int coproc, T acc, U n, V m, uint32_t imm, mve_pred_t p);

The intrinsics are not part of the released ACLE spec, but internally at
Arm we have reached consensus to add them to the next ACLE release.

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76610
2020-03-25 17:08:19 +00:00
Simon Tatham 8f1651ccea [ARM,MVE] Add missing tests for vqdmlash intrinsics.
Summary:
These were accidentally left out of D76123. I added tests for the
other three instructions in this small cross-product family (vqdmlah,
vqrdmlah, vqrdmlash) but missed this one.

Reviewers: miyuki

Reviewed By: miyuki

Subscribers: kristof.beyls, dmgreen, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76714
2020-03-25 09:46:16 +00:00
Erik Pilkington de98cf92e3 [CodeGen] Add an alignment attribute to all sret parameters
This fixes a miscompile when the parameter is actually underaligned.
rdar://58316406

Differential revision: https://reviews.llvm.org/D74183
2020-03-24 15:31:57 -04:00
Sam McCall 0b59982134
[CodeGen] Fix test attr-noreturn.c when run from my home directory 2020-03-24 13:59:16 +01:00
Momchil Velikov 080d046c91 [ARM][CMSE] Implement CMSE attributes
This patch adds CMSE attributes `cmse_nonsecure_call` and
`cmse_nonsecure_entry`.  As usual, specification is available here:
https://developer.arm.com/docs/ecm0359818/latest

Patch by Javed Absar, Bradley Smith, David Green, Momchil Velikov,
possibly others.

Differential Revision: https://reviews.llvm.org/D71129
2020-03-24 10:21:26 +00:00
Momchil Velikov 6081ccf4a3 Apply function attributes through array declarators
There's inconsistency in handling array types between the
`distributeFunctionTypeAttrXXX` functions and the
`FunctionTypeUnwrapper` in `SemaType.cpp`.

This patch lets `FunctionTypeUnwrapper` apply function type attributes
through array types.

Differential Revision: https://reviews.llvm.org/D75109
2020-03-23 11:03:13 +00:00
Thomas Lively de6cd3e836 [WebAssembly] Add SIMD integer abs builtins
Summary:
Since the conditional operator cannot be used with vector conditions
in C, we need a builtin to be able to express this operation in C
source.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76538
2020-03-21 00:21:24 -07:00
Adrian Prantl ceae47143b Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 16:27:50 -07:00
Adrian Prantl bde15de3ca Revert "Allow remapping the sysroot with -fdebug-prefix-map."
This reverts commit 6725c4836a.
2020-03-20 16:27:23 -07:00
Adrian Prantl 6725c4836a Allow remapping the sysroot with -fdebug-prefix-map.
<rdar://problem/55685132>

Differential Revision: https://reviews.llvm.org/D76393
2020-03-20 15:52:39 -07:00
Simon Tatham 1adfa4c991 [ARM,MVE] Add ACLE intrinsics for the vaddv/vaddlv family.
Summary:
I've implemented them as target-specific IR intrinsics rather than
using `@llvm.experimental.vector.reduce.add`, on the grounds that the
'experimental' intrinsic doesn't currently have much code generation
benefit, and my replacements encapsulate the sign- or zero-extension
so that you don't expose the illegal MVE vector type (`<4 x i64>`) in
IR.

The machine instructions come in two versions: with and without an
input accumulator. My new IR intrinsics, like the 'experimental' one,
don't take an accumulator parameter: we represent that by just adding
on the input value using an ordinary i32 or i64 add. So if you write
the `vaddvaq` C-language intrinsic with an input accumulator of zero,
it can be optimised to VADDV, and conversely, if you write something
like `x += vaddvq(y)` then that can be combined into VADDVA.

Most of this is achieved in isel lowering, by converting these IR
intrinsics into the existing `ARMISD::VADDV` family of custom SDNode
types. For the difficult case (64-bit accumulators), isel lowering
already implements the optimization of folding an addition into a
VADDLV to make a VADDLVA; so once we've made a VADDLV, our job is
already done, except that I had to introduce a parallel set of ARMISD
nodes for the //predicated// forms of VADDLV.

For the simpler VADDV, we handle the predicated form by just leaving
the IR intrinsic alone and matching it in an ordinary dag pattern.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76491
2020-03-20 15:42:33 +00:00
Simon Tatham 45a9945b9e [ARM,MVE] Add ACLE intrinsics for the vminv/vmaxv family.
Summary:
I've implemented these as target-specific IR intrinsics, because
they're not //quite// enough like @llvm.experimental.vector.reduce.min
(which doesn't take the extra scalar parameter). Also this keeps the
predicated and unpredicated versions looking similar, and the
floating-point minnm/maxnm versions fold into the same schema.

We had a couple of min/max reductions already implemented, from the
initial pathfinding exercise in D67158. Those were done by having
separate IR intrinsic names for the signed and unsigned integer
versions; as part of this commit, I've changed them to use a flag
parameter indicating signedness, which is how we ended up deciding
that the rest of the MVE intrinsics family ought to work. So now
hopefully the ewhole lot is consistent.

In the new llc test, the output code from the `v8f16` test functions
looks quite unpleasant, but most of it is PCS lowering (you can't pass
a `half` directly in or out of a function). In other circumstances,
where you do something else with your `half` in the same function, it
doesn't look nearly as nasty.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76490
2020-03-20 15:42:33 +00:00
Mikhail Maltsev 6ae3eff8ba [ARM,CDE] Implement CDE vreinterpret intrinsics
Summary:
This patch implements the following CDE intrinsics:

  int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);
  uint16x8_t __arm_vreinterpretq_u16_u8 (uint8x16_t in);
  int16x8_t __arm_vreinterpretq_s16_u8 (uint8x16_t in);
  uint32x4_t __arm_vreinterpretq_u32_u8 (uint8x16_t in);
  int32x4_t __arm_vreinterpretq_s32_u8 (uint8x16_t in);
  uint64x2_t __arm_vreinterpretq_u64_u8 (uint8x16_t in);
  int64x2_t __arm_vreinterpretq_s64_u8 (uint8x16_t in);
  float16x8_t __arm_vreinterpretq_f16_u8 (uint8x16_t in);
  float32x4_t __arm_vreinterpretq_f32_u8 (uint8x16_t in);

These intrinsics are header-only because they reuse the existing
MVE vreinterpret clang built-ins.

This set is slightly different from the published specification
(see https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf):
it includes

  int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);

which was unintentionally ommitted from the spec, and
does not include

  float64x2_t __arm_vreinterpretq_f64_u8 (uint8x16_t in);

The float64x2_t type requires additional implementation
effort, and we are not including it yet.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76300
2020-03-20 14:01:57 +00:00
Mikhail Maltsev 969034b860 [ARM,CDE] Implement CDE unpredicated Q-register intrinsics
Summary:
This patch implements the following intrinsics:

  uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm);
  T __arm_vcx1qa(int coproc, T acc, uint32_t imm);
  T __arm_vcx2q(int coproc, T n, uint32_t imm);
  uint8x16_t __arm_vcx2q_u8(int coproc, T n, uint32_t imm);
  T __arm_vcx2qa(int coproc, T acc, U n, uint32_t imm);
  T __arm_vcx3q(int coproc, T n, U m, uint32_t imm);
  uint8x16_t __arm_vcx3q_u8(int coproc, T n, U m, uint32_t imm);
  T __arm_vcx3qa(int coproc, T acc, U n, V m, uint32_t imm);

Most of them are polymorphic. Furthermore, some intrinsics are
polymorphic by 2 or 3 parameter types, such polymorphism is not
supported by the existing MVE/CDE tablegen backends, also we don't
really want to have a combinatorial explosion caused by 1000 different
combinations of 3 vector types. Because of this some intrinsics are
implemented as macros involving a cast of the polymorphic arguments to
uint8x16_t.

The IR intrinsics are even more restricted in terms of types: all MVE
vectors are cast to v16i8.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76299
2020-03-20 14:01:56 +00:00
Mikhail Maltsev d22e661712 [ARM,CDE] Implement CDE S and D-register intrinsics
Summary:
This patch implements the following ACLE intrinsics:

  uint32_t __arm_vcx1_u32(int coproc, uint32_t imm);
  uint32_t __arm_vcx1a_u32(int coproc, uint32_t acc, uint32_t imm);
  uint32_t __arm_vcx2_u32(int coproc, uint32_t n, uint32_t imm);
  uint32_t __arm_vcx2a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t imm);
  uint32_t __arm_vcx3_u32(int coproc, uint32_t n, uint32_t m, uint32_t imm);
  uint32_t __arm_vcx3a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t m, uint32_t imm);

  uint64_t __arm_vcx1d_u64(int coproc, uint32_t imm);
  uint64_t __arm_vcx1da_u64(int coproc, uint64_t acc, uint32_t imm);
  uint64_t __arm_vcx2d_u64(int coproc, uint64_t m, uint32_t imm);
  uint64_t __arm_vcx2da_u64(int coproc, uint64_t acc, uint64_t m, uint32_t imm);
  uint64_t __arm_vcx3d_u64(int coproc, uint64_t n, uint64_t m, uint32_t imm);
  uint64_t __arm_vcx3da_u64(int coproc, uint64_t acc, uint64_t n, uint64_t m, uint32_t imm);

Since the semantics of CDE instructions is opaque to the compiler, the
ACLE intrinsics require dedicated LLVM IR intrinsics. The 64-bit and
32-bit variants share the same IR intrinsic.

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76298
2020-03-20 14:01:53 +00:00
Mikhail Maltsev 7a85e3585e [ARM,CDE] Implement GPR CDE intrinsics
Summary:
This change implements ACLE CDE intrinsics that translate to
instructions working with general-purpose registers.

The specification is available at
https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf

Each ACLE intrinsic gets a corresponding LLVM IR intrinsic (because
they have distinct function prototypes). Dual-register operands are
represented as pairs of i32 values. Because of this the instruction
selection for these intrinsics cannot be represented as TableGen
patterns and requires custom C++ code.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76296
2020-03-20 14:01:51 +00:00
Shiva Chen fc3752665f [RISCV] Passing small data limitation value to RISCV backend
Passing small data limit to RISCVELFTargetObjectFile by module flag,
So the backend can set small data section threshold by the value.
The data will be put into the small data section if the data smaller than
the threshold.

Differential Revision: https://reviews.llvm.org/D57497
2020-03-20 11:03:51 +08:00
Thomas Lively a3f974f3c3 [WebAssembly] SIMD bitmask intrinsics and builtin functions
Summary:
These experimental new instructions are proposed in
https://github.com/WebAssembly/simd/pull/201.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76397
2020-03-19 17:15:37 -07:00
Djordje Todorovic d9b9621009 Reland D73534: [DebugInfo] Enable the debug entry values feature by default
The issue that was causing the build failures was fixed with the D76164.
2020-03-19 13:57:30 +01:00
Lucas Prates d4ad386ee1 [ARM] Fixing range checks for Neon's vqdmulhq_lane and vqrdmulhq_lane intrinsics
Summary:
The range checks performed for the vqrdmulh_lane and vqrdmulh_lane Neon
intrinsics were incorrectly using their return type as the base type for
the range check performed on their 'lane' argument.

This patch updates those intrisics to use the type of the proper reference
argument to perform the range checks.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74766
2020-03-19 12:08:12 +00:00
Lucas Prates f56550cf7f [ARM] Enabling range checks on Neon intrinsics' lane arguments
Summary:
Range checks were not properly performed in the lane arguments of Neon
intrinsics implemented based on splat operations. Calls to those
intrinsics where translated to `__builtin__shufflevector` calls directly
by the pre-processor through the arm_neon.h macros, missing the chance
for the proper range checks.

This patch enables the range check by introducing an auxiliary splat
instruction in arm_neon.td, delaying the translation to shufflevector
calls to CGBuiltin.cpp in clang after the checks were performed.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, ostannard

Reviewed By: ostannard

Subscribers: ostannard, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74619
2020-03-19 12:07:23 +00:00
Lucas Prates 7bf23563f4 Revert "[ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions"
This reverts commit 62ab15ffa3.

Multiple commits were unintentionally squashed into this one. Reverting
so each of them can be pushed properly.
2020-03-19 12:01:13 +00:00
Lucas Prates 62ab15ffa3 [ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions
Summary:
Some of the `*_laneq` intrinsics defined in arm_neon.td were missing the
setting of the `isLaneQ` attribute. This patch sets the attribute on the
related definitions, as they will be required to properly perform range
checks on their lane arguments.

Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio

Reviewed By: dnsampaio

Subscribers: dnsampaio, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74616
2020-03-19 11:52:41 +00:00
Sander de Smalen 981f0802b3 [SVE] Generate overloaded functions for ACLE intrinsics.
The SVE ACLE allows using a short-form for the intrinsics, e.g.
the following two declarations generate the same code:

  svuint32_t svld1(svbool_t, uint32_t const *);
  svuint32_t svld1_u32(svbool_t, uint32_t const *);

using the attribute:
  __clang_arm_builtin_alias

so that any call to svld1(svbool_t, uint32_t const *) will
map to __builtin_sve_svld1_u32.

Reviewers: SjoerdMeijer, miyuki, efriedma, simon_tatham, rengolin

Reviewed By: SjoerdMeijer

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75861
2020-03-19 09:36:23 +00:00
Richard Smith f18233dad4 Fix -fsanitize=array-bound to treat T[0] union members as flexible array
members regardless of whether they're the last member of the union.
2020-03-18 15:47:24 -07:00
Simon Tatham e13d153c1b [ARM,MVE] Add intrinsics for the VQDMLAD family.
Summary:
This is another set of instructions too complicated to be sensibly
expressed in IR by anything short of a target-specific intrinsic.
Given input vectors a,b, the instruction generates intermediate values
2*(a[0]*b[0]+a[1]+b[1]), 2*(a[2]*b[2]+a[3]+b[3]), etc; takes the high
half of each double-width values, and overwrites half the lanes in the
output vector c, which you therefore have to provide the input value
of. Optionally you can swap the elements of b so that the are things
like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when
taking the high half; and optionally you can take the difference
rather than sum of the two products. Finally, saturation is applied
when converting back to a single-width vector lane.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76359
2020-03-18 17:11:22 +00:00
Simon Tatham 928776de92 [ARM,MVE] Add intrinsics for the VQDMLAH family.
Summary:
These are complicated integer multiply+add instructions with extra
saturation, taking the high half of a double-width product, and
optional rounding. There's no sensible way to represent that in
standard IR, so I've converted the clang builtins directly to
target-specific intrinsics.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76123
2020-03-18 10:55:04 +00:00
Simon Tatham 28c5d97bee [ARM,MVE] Add intrinsics and isel for MVE integer VMLA.
Summary:
These instructions compute multiply+add in integers, with one of the
operands being a splat of a scalar. (VMLA and VMLAS differ in whether
the splat operand is a multiplier or the addend.)

I've represented these in IR using existing standard IR operations for
the unpredicated forms. The predicated forms are done with target-
specific intrinsics, as usual.

When operating on n-bit vector lanes, only the bottom n bits of the
i32 scalar operand are used. So we have to tell that to isel lowering,
to allow it to remove a pointless sign- or zero-extension instruction
on that input register. That's done in `PerformIntrinsicCombine`, but
first I had to enable `PerformIntrinsicCombine` for MVE targets
(previously all the intrinsics it handled were for NEON), and make it
a method of `ARMTargetLowering` so that it can get at
`SimplifyDemandedBits`.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76122
2020-03-18 10:55:04 +00:00
Jon Chesterfield cc691f3384 Disable loader-uninitialized tests on Windows 2020-03-17 23:33:12 +00:00
Jon Chesterfield 1d19b15395 Fix arm build broken by D74361 by dropping align from filecheck pattern 2020-03-17 22:15:19 +00:00
Jon Chesterfield c45eaeabb7 [Clang] Undef attribute for global variables
Summary:
[Clang] Attribute to allow defining undef global variables

Initializing global variables is very cheap on hosted implementations. The
C semantics of zero initializing globals work very well there. It is not
necessarily cheap on freestanding implementations. Where there is no loader
available, code must be emitted near the start point to write the appropriate
values into memory.

At present, external variables can be declared in C++ and definitions provided
in assembly (or IR) to achive this effect. This patch provides an attribute in
order to remove this reason for writing assembly for performance sensitive
freestanding implementations.

A close analogue in tree is LDS memory for amdgcn, where the kernel is
responsible for initializing the memory after it starts executing on the gpu.
Uninitalized variables in LDS are observably cheaper than zero initialized.

Patch is loosely based on the cuda __shared__ and opencl __local variable
implementation which also produces undef global variables.

Reviewers: kcc, rjmccall, rsmith, glider, vitalybuka, pcc, eugenis, vlad.tsyrklevich, jdoerfert, gregrodgers, jfb, aaron.ballman

Reviewed By: rjmccall, aaron.ballman

Subscribers: Anastasia, aaron.ballman, davidb, Quuxplusone, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74361
2020-03-17 21:22:23 +00:00
Nick Desaulniers 5d90f886bc [clang][AArch64] readd support for 'p' inline asm constraint
Summary:
Was accidentally removed by commit af64948e2a when it overrode
TargetInfo::convertConstraint.

Fixes: pr/45225

Reviewers: eli.friedman, sdesmalen

Reviewed By: sdesmalen

Subscribers: echristo, sdesmalen, kristof.beyls, cfe-commits, kmclaughlin, srhines

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76297
2020-03-17 10:51:25 -07:00
Ayke van Laethem 4add249205
[AVR] Add support for the -mdouble=x flag
This flag is used by avr-gcc (starting with v10) to set the width of the
double type. The double type is by default interpreted as a 32-bit
floating point number in avr-gcc instead of a 64-bit floating point
number as is common on other architectures. Starting with GCC 10, a new
option has been added to control this behavior:
https://gcc.gnu.org/wiki/avr-gcc#Deviations_from_the_Standard

This commit keeps the default double at 32 bits but adds support for the
-mdouble flag (-mdouble=32 and -mdouble=64) to control this behavior.

Differential Revision: https://reviews.llvm.org/D76181
2020-03-17 13:21:03 +01:00
Kerry McLaughlin af64948e2a [SVE][Inline-Asm] Add constraints for SVE ACLE types
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
 - y: SVE registers Z0-Z7
 - Upl: One of the low eight SVE predicate registers (P0-P7)
 - Upa: Full range of SVE predicate registers (P0-P15)

Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin

Reviewed By: efriedma

Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75690
2020-03-17 11:04:19 +00:00
Sander de Smalen 5087ace651 [Clang][SVE] Parse builtin type string for scalable vectors
This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.

This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:

 svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {
   return svld1_s8(pg, base);
 }

Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75298
2020-03-15 14:34:52 +00:00
Akira Hatanaka 86bba6c641 [Sema] Use the canonical type in function isVector
This reapplies the following patch, which was reverted because it caused
neon CodeGen tests to fail:

https://reviews.llvm.org/rGa6150b48cea00ab31e9335cc73770327acc4cb3a

I've added checks to detect half precision neon vectors and avoid
promiting them to vectors of floats.

See the discussion here: https://reviews.llvm.org/rG825235c140e7

Original commit message:

This fixes an assertion in Sema::CreateBuiltinBinOp that fails when one
of the vector operand's element type is a typedef of __fp16.

rdar://problem/55983556
2020-03-13 13:08:48 -07:00
Nico Weber f82b32a51e Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit 5aa5c943f7.
Causes clang to assert, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c4
for a repro.
2020-03-13 15:37:44 -04:00
Adrian Prantl 842ea709e4 Debug Info: Store the SDK in the DICompileUnit.
This is another intermediate step for PR44213
(https://bugs.llvm.org/show_bug.cgi?id=44213).

This stores the SDK *name* in the debug info, to make it possible to
`-fdebug-prefix-map`-replace the sysroot with a recognizable string
and allowing the debugger to find a fitting SDK relative to itself,
not the machine the executable was compiled on.

rdar://problem/51645582
2020-03-13 11:21:30 -07:00
Nick Desaulniers 246398ece7 [clang][Parse] properly parse asm-qualifiers, asm inline
Summary:
The parsing of GNU C extended asm statements was a little brittle and
had a few issues:
- It was using Parse::ParseTypeQualifierListOpt to parse the `volatile`
  qualifier.  That parser is really meant for TypeQualifiers; an asm
  statement doesn't really have a type qualifier. This is still maybe
  nice to have, but not necessary. We now can check for the `volatile`
  token by properly expanding the grammer, rather than abusing
  Parse::ParseTypeQualifierListOpt.
- The parsing of `goto` was position dependent, so `asm goto volatile`
  wouldn't parse. The qualifiers should be position independent to one
  another. Now they are.
- We would warn on duplicate `volatile`, but the parse error for
  duplicate `goto` was a generic parse error and wasn't clear.
- We need to add support for the recent GNU C extension `asm inline`.
  Adding support to the parser with the above issues highlighted the
  need for this refactoring.

Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Reviewers: aaron.ballman

Reviewed By: aaron.ballman

Subscribers: aheejin, jfb, nathanchance, cfe-commits, echristo, efriedma, rsmith, chandlerc, craig.topper, erichkeane, jyu2, void, srhines

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75563
2020-03-12 15:13:59 -07:00
Simon Tatham 3f8e714e2f [ARM,MVE] Add intrinsics and isel for MVE fused multiply-add.
Summary:
This adds the ACLE intrinsic family for the VFMA and VFMS
instructions, which perform fused multiply-add on vectors of floats.

I've represented the unpredicated versions in IR using the cross-
platform `@llvm.fma` IR intrinsic. We already had isel rules to
convert one of those into a vector VFMA in the simplest possible way;
but we didn't have rules to detect a negated argument and turn it into
VFMS, or rules to detect a splat argument and turn it into one of the
two vector/scalar forms of the instruction. Now we have all of those.

The predicated form uses a target-specific intrinsic as usual, but
I've stuck to just one, for a predicated FMA. The subtraction and
splat versions are code-generated by passing an fneg or a splat as one
of its operands, the same way as the unpredicated version.

In arm_mve_defs.h, I've had to introduce a tiny extra piece of
infrastructure: a record `id` for use in codegen dags which implements
the identity function. (Just because you can't declare a Tablegen
value of type dag which is //only// a `$varname`: you have to wrap it
in something. Now I can write `(id $varname)` to get the same effect.)

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75998
2020-03-12 11:13:50 +00:00
Shengchen Kan 214d24e1f8 [X86] Support intrinsic _mm_broadcastsi128_si256
Reviewers: LuoYuanke, craig.topper, RKSimon, pengfei

Reviewed By: craig.topper

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75897
2020-03-12 10:56:39 +08:00
Shengchen Kan ab69cd0779 [X86] Support intrinsic _mm_cldemote
Reviewers: LuoYuanke, craig.topper, RKSimon, pengfei

Reviewed By: craig.topper

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75896
2020-03-12 10:03:41 +08:00
Shengchen Kan 560aa53f8f [X86] Support intrinsics _bextr2*
Reviewers: LuoYuanke, craig.topper, RKSimon, pengfei

Reviewed By: craig.topper

Subscribers: cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75894
2020-03-12 09:26:51 +08:00
Simon Moll d871ef4e6a [instcombine] remove fsub to fneg hacks; only emit fneg
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D75467
2020-03-10 16:57:02 +01:00
Mikhail Maltsev 47edf5bafb [ARM,CDE] Generalize MVE intrinsics infrastructure to support CDE
Summary:
This patch generalizes the existing code to support CDE intrinsics
which will share some properties with existing MVE intrinsics
(some of the intrinsics will be polymorphic and accept/return values
of MVE vector types).
Specifically the patch:
* Adds new tablegen backends -gen-arm-cde-builtin-def,
  -gen-arm-cde-builtin-codegen, -gen-arm-cde-builtin-sema,
  -gen-arm-cde-builtin-aliases, -gen-arm-cde-builtin-header based on
  existing MVE backends.
* Renames the '__clang_arm_mve_alias' attribute into
  '__clang_arm_builtin_alias' (it will be used with CDE intrinsics as
  well as MVE intrinsics)
* Implements semantic checks for the coprocessor argument of the CDE
  intrinsics as well as the existing coprocessor intrinsics.
* Adds one CDE intrinsic __arm_cx1 to test the above changes

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: sdesmalen, mgorny, kristof.beyls, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75850
2020-03-10 14:03:16 +00:00
Djordje Todorovic 5aa5c943f7 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-03-10 09:15:06 +01:00
Sjoerd Meijer e32f8ef927 Follow up of 3d9a0445cc, clang driver defaulting to -fno-common
Attempt to pacify windows bot where this failed:

clang/test/CodeGen/vlt_to_pointer.c
2020-03-09 20:43:05 +00:00
Sjoerd Meijer 3d9a0445cc Recommit #2 "[Driver] Default to -fno-common for all targets"
After a first attempt to fix the test-suite failures, my first recommit
caused the same failures again. I had updated CMakeList.txt files of
tests that needed -fcommon, but it turns out that there are also
Makefiles which are used by some bots, so I've updated these Makefiles
now too.

See the original commit message for more details on this change:
0a9fc9233e
2020-03-09 19:57:03 +00:00
Erich Keane cc8390bfe3 Permit attribute 'used' with 'target' multiversioning.
This adds infrastructure for a multiversioning whitelist, plus adds
'used' to the allowed list with 'target'.  The behavior here mirrors the
implementation in GCC, where 'used' only applies to the single
declaration and doesn't apply to the ifunc or resolver.

This is not being applied to cpu_dispatch and cpu_specific, since the
rules are more complicated for cpu_specific, which emits multiple
symbols. Additionally, the author isn't currently aware of uses in the
wild of this combination, but is aware of a number of target+used
combinations.
2020-03-09 12:38:03 -07:00
Erich Keane 7b66160828 Fix Target Multiversioning renaming.
The initial implementation only did 'first declaration renaming' when
a default version came after. This is insufficient in cases where a
default does not exist, so this patch makes sure that we do the renaming
in all cases.

This renaming is necessary because we emit the first declaration before
knowing that it IS a target multiversion function, which would change
its name. The second declaration (the one that caused the
multiversioning) then needs to make sure that the first one has its name
changed to be consistent with the resolver usage.
2020-03-09 08:29:18 -07:00
Sjoerd Meijer f35d112efd Revert "Recommit "[Driver] Default to -fno-common for all targets""
This reverts commit 2c36c23f34.

Still problems in the test-suite, which I really thought I had fixed...
2020-03-09 10:37:28 +00:00
Sjoerd Meijer 2c36c23f34 Recommit "[Driver] Default to -fno-common for all targets"
This includes fixes for:
- test-suite: some benchmarks need to be compiled with -fcommon, see D75557.
- compiler-rt: one test needed -fcommon, and another a change, see D75520.
2020-03-09 10:07:37 +00:00
Matt Arsenault a4e71f01c0 Assume ieee behavior without denormal-fp-math attribute 2020-03-07 12:10:56 -05:00
Thomas Lively d43fcd0c04 [WebAssembly] Add SIMD integer min/max builtins
Summary:
Although SIMD integer min/max operations can be expressed using the ?:
operator in C++, that operator is disallowed for vectors in C. As a
workaround, this change introduces new WebAssembly-specific builtin
functions that lower to the desired vector icmp/select sequences.

Reviewers: aheejin, dschuff, kripken

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75770
2020-03-06 14:28:52 -08:00
Simon Tatham 068b2f313c [ARM,MVE] Add the `vshlcq` intrinsics.
Summary:
The VSHLC instruction performs a left shift of a whole vector register
by an immediate shift count up to 32, shifting in new bits at the low
end from a GPR and delivering the shifted-out bits from the high end
back into the same GPR.

Since the instruction produces two outputs (the shifted vector
register and the output GPR of shifted-out bits), it has to be
instruction-selected in C++ rather than Tablegen.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75445
2020-03-04 08:49:27 +00:00
Simon Tatham 810127f6ab [ARM,MVE] Add the `vsbciq` intrinsics.
Summary:
These are exactly parallel to the existing `vadciq` intrinsics, which
we implemented last year as part of the original MVE intrinsics
framework setup.

Just like VADC/VADCI, the MVE VSBC/VSBCI instructions deliver two
outputs, both of which the intrinsic exposes: a modified vector
register and a carry flag. So they have to be instruction-selected in
C++ rather than Tablegen. However, in this case, that's trivial: the
same C++ isel routine we already have for VADC works unchanged, and
all we have to do is to pass it a different instruction id.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75444
2020-03-04 08:49:27 +00:00
Sjoerd Meijer 4e363563fa Revert "[Driver] Default to -fno-common for all targets"
This reverts commit 0a9fc9233e.

Going to look at the asan failures.

I find the failures in the test suite weird, because they look
like compile time test and I don't understand how that can be
failing, but will have a brief look at that too.
2020-03-03 10:00:36 +00:00
Sjoerd Meijer 0a9fc9233e [Driver] Default to -fno-common for all targets
This makes -fno-common the default for all targets because this has performance
and code-size benefits and is more language conforming for C code.
Additionally, GCC10 also defaults to -fno-common and so we get consistent
behaviour with GCC.

With this change, C code that uses tentative definitions as definitions of a
variable in multiple translation units will trigger multiple-definition linker
errors. Generally, this occurs when the use of the extern keyword is neglected
in the declaration of a variable in a header file. In some cases, no specific
translation unit provides a definition of the variable. The previous behavior
can be restored by specifying -fcommon.

As GCC has switched already, we benefit from applications already being ported
and existing documentation how to do this. For example:
- https://gcc.gnu.org/gcc-10/porting_to.html
- https://wiki.gentoo.org/wiki/Gcc_10_porting_notes/fno_common

Differential revision: https://reviews.llvm.org/D75056
2020-03-03 09:15:07 +00:00
Sanjay Patel 1e308452bf [CodeGen] avoid running the entire optimizer pipeline in clang test file; NFC
I'm making the CHECK lines vague enough that they pass at -O0.
If that is too vague (we really want to check the data flow
to verify that the variables are not mismatched, etc), then
we can adjust those lines again to more closely match the output
at -O0 rather than -O1.

This change is based on the post-commit comments for:
83f4372f3a
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20200224/307888.html
2020-03-02 09:47:32 -05:00
Sanjay Patel 8cdcbcaa02 [CodeGen] avoid running the entire optimizer pipeline in clang test file; NFC
There are no failures from the first set of RUN lines here,
so the CHECKs were already vague enough to not be affected
by optimizations. The final RUN line does induce some kind
of failure, so I'll try to fix that separately in a
follow-up.
2020-03-02 09:12:53 -05:00
Luke Geeson 7d594cf003 [ARM] Add Cortex-M55 Support for clang and llvm
This patch upstreams support for the ARM Armv8.1m cpu Cortex-M55.

In detail adding support for:

 - mcpu option in clang
 - Arm Target Features in clang
 - llvm Arm TargetParser definitions

details of the CPU can be found here:
https://developer.arm.com/ip-products/processors/cortex-m/cortex-m55

Reviewers: chill

Reviewed By: chill

Subscribers: dmgreen, kristof.beyls, hiraditya, cfe-commits,
llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74966
2020-03-02 11:42:26 +00:00
Simon Tatham 1a8cbfa514 [ARM,MVE] Add ACLE intrinsics for VCVT[ANPM] family.
Summary:
These instructions convert a vector of floats to a vector of integers
of the same size, with assorted non-default rounding modes.
Implemented in IR as target-specific intrinsics, because as far as I
can see there are no matches for that functionality in the standard IR
intrinsics list.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75255
2020-03-02 10:33:30 +00:00
Simon Tatham b08d2ddd69 [ARM,MVE] Add ACLE intrinsics for VCVT.F32.F16 family.
Summary:
These instructions make a vector of `<4 x float>` by widening every
other lane of a vector of `<8 x half>`.

I wondered about representing these using standard IR, along the lines
of a shufflevector to extract elements of the input into a `<4 x half>`
followed by an `fpext` to turn that into `<4 x float>`. But it looks as
if that would take a lot of work in isel lowering to make it match any
pattern I could sensibly write in Tablegen, and also I haven't been
able to think of any other case where that pattern might be generated
in IR, so there wouldn't be any extra code generation win from doing
it that way.

Therefore, I've just used another target-specific intrinsic. We can
always change it to the other way later if anyone thinks of a good
reason.

(In order to put the intrinsic definition near similar things in
`IntrinsicsARM.td`, I've also lifted the definition of the
`MVEMXPredicated` multiclass higher up the file, without changing it.)

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75254
2020-03-02 10:33:30 +00:00
Simon Tatham a41ecf0eb0 [ARM,MVE] Add ACLE intrinsics for VQMOV[U]N family.
Summary:
These instructions work like VMOVN (narrowing a vector of wide values
to half size, and overwriting every other lane of an output register
with the result), except that the narrowing conversion is saturating.
They come in three signedness flavours: signed to signed, unsigned to
unsigned, and signed to unsigned. All are represented in IR by a
target-specific intrinsic that takes two separate 'unsigned' flags.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75252
2020-03-02 10:33:30 +00:00
Simon Pilgrim 7e9747b50b [X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554)
This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default.

Differential Revision: https://reviews.llvm.org/D75162
2020-02-29 18:57:35 +00:00
Simon Pilgrim bfa0aaf37f [AVX512] Add strict-fp cvtph2ps constrained tests
As suggested on D75162
2020-02-28 16:55:00 +00:00
Simon Pilgrim a06402cc69 [F16C] Add strict-fp constrained tests
As suggested on D75162
2020-02-28 16:55:00 +00:00
Simon Moll ddd11273d9 Remove BinaryOperator::CreateFNeg
Use UnaryOperator::CreateFNeg instead.

Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D75130
2020-02-27 09:06:03 -08:00
Dan Gohman 00072c08c7 [WebAssembly] Mangle the argc/argv `main` as `__wasm_argc_argv`.
WebAssembly enforces a rule that caller and callee signatures must
match. This means that the traditional technique of passing `main`
`argc` and `argv` even when it doesn't need them doesn't work.

Currently the backend renames `main` to `__original_main`, however this
doesn't interact well with LTO'ing libc, and the name isn't intuitive.
This patch allows us to transition to `__main_argc_argv` instead.

This implements the proposal in
https://github.com/WebAssembly/tool-conventions/pull/134
with a flag to disable it when targeting Emscripten, though this is
expected to be temporary, as discussed in the proposal comments.

Differential Revision: https://reviews.llvm.org/D70700
2020-02-27 07:55:36 -08:00
Simon Tatham 8c26f42fe9 [clang,ARM,MVE] Remove redundant #includes in test file.
I made that file by pasting together several pieces, and forgot to
take out the #include <arm_mve.h> from the tops of the later ones, so
the test was pointlessly including the same header five times. NFC.
2020-02-27 09:39:35 +00:00
Simon Tatham 9eb3cc10b2 [ARM,MVE] Add predicated intrinsics for many unary functions.
Summary:
This commit adds the predicated MVE intrinsics for the same set of
unary operations that I added in their unpredicated forms in

* D74333 (vrint)
* D74334 (vrev)
* D74335 (vclz, vcls)
* D74336 (vmovl)
* D74337 (vmovn)

but since the predicated versions are a lot more similar to each
other, I've kept them all together in a single big patch. Everything
here is done in the standard way we've been doing other predicated
operations: an IR intrinsic called `@llvm.arm.mve.foo.predicated` and
some isel rules that match that alongside whatever they accept for the
unpredicated version of the same instruction.

In order to write the isel rules conveniently, I've refactored the
existing isel rules for the affected instructions into multiclasses
parametrised by a vector-type class, in the usual way. All those
refactorings are intended to leave the existing isel rules unchanged:
the only difference should be that new ones for the predicated
intrinsics are introduced.

The only tiny infrastructure change I needed in this commit was to
change the implementation of `IntrinsicMX` in `arm_mve_defs.td` so
that the records it defines are anonymous rather than named (and use
`NameOverride` to set the output intrinsic name), which allows me to
call it twice in two multiclasses with the same `NAME` without a
tablegen-time error.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75165
2020-02-26 15:12:07 +00:00
Rong Xu 11857d4994 [remark][diagnostics] [codegen] Fix PR44896
This patch fixes PR44896. For IR input files, option fdiscard-value-names
should be ignored as we need named values in loadModule().
Commit 60d3947922 sets this option after loadModule() where valued names
already created. This creates an inconsistent state in setNameImpl()
that leads to a seg fault.
This patch forces fdiscard-value-names to be false for IR input files.

This patch also emits a warning of "ignoring -fdiscard-value-names" if
option fdiscard-value-names is explictly enabled in the commandline for
IR input files.

Differential Revision: https://reviews.llvm.org/D74878
2020-02-25 08:15:17 -08:00
Benjamin Kramer fc466f8780 Make test not write to the source directory 2020-02-25 16:03:06 +01:00
Sanjay Patel 83f4372f3a [CodeGen] fix clang test that runs the optimizer pipeline; NFC
There's already a FIXME note on this file; it can break when the
underlying LLVM behavior changes independently of anything in clang.
2020-02-25 09:13:49 -05:00
Bill Wendling 50cac24877 Support output constraints on "asm goto"
Summary:
Clang's "asm goto" feature didn't initially support outputs constraints. That
was the same behavior as gcc's implementation. The decision by gcc not to
support outputs was based on a restriction in their IR regarding terminators.
LLVM doesn't restrict terminators from returning values (e.g. 'invoke'), so
it made sense to support this feature.

Output values are valid only on the 'fallthrough' path. If an output value's used
on an indirect branch, then it's 'poisoned'.

In theory, outputs *could* be valid on the 'indirect' paths, but it's very
difficult to guarantee that the original semantics would be retained. E.g.
because indirect labels could be used as data, we wouldn't be able to split
critical edges in situations where two 'callbr' instructions have the same
indirect label, because the indirect branch's destination would no longer be
the same.

Reviewers: jyknight, nickdesaulniers, hfinkel

Reviewed By: jyknight, nickdesaulniers

Subscribers: MaskRay, rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69876
2020-02-24 18:51:29 -08:00
Craig Topper 727328433a [X86] Add back fmaddsub intrinsics to work towards fixing the strict fp implementation
Previously we emitted an fmadd and a fmadd+fneg and combined them with a shufflevector. But this doesn't follow the correct exception behavior for unselected elements so the backend can't merge them into the fmaddsub/fmsubadd instructions.

This patch restores the the fmaddsub intrinsics so we don't have two arithmetic operations. We lose out on optimization opportunity in the non-strict FP case, but I don't think this is a big loss. If someone gives us a test case we can look into adding instcombine/dagcombine improvements. I'd rather not have the frontend do completely different things for strict and non-strict.

This still has problems because target specific intrinsics don't support strict semantics yet. We also still have all of the problems with masking. But we at least generate the right instruction in constrained mode now.

Differential Revision: https://reviews.llvm.org/D74268
2020-02-24 12:07:21 -08:00
Xiangling Liao 8bee52bdb5 [AIX][Frontend] C++ ABI customizations for AIX boilerplate
This PR enables "XL" C++ ABI in frontend AST to IR codegen. And it is driven by
static init work. The current kind in Clang by default is Generic Itanium, which
has different behavior on static init with IBM xlclang compiler on AIX.

Differential Revision: https://reviews.llvm.org/D74015
2020-02-24 10:26:51 -05:00
Fangrui Song fc6057e34f [Frontend] Replace CC1 option -mcode-model with -mcmodel=
Before:

% clang -mcmodel=x -xc /dev/null
error: invalid argument 'x' in '-mcode-model x'

Now:

% clang -mcmodel=x -xc /dev/null
clang-11: error: invalid argument 'x' to -mcmodel=
2020-02-21 23:10:50 -08:00
Luís Marques 0781e93a6e [CodeGen][RISCV] Fix clang/test/CodeGen/atomic_ops.c for RISC-V
By default the RISC-V target doesn't have the atomics standard extension
enabled. The first RUN line in `clang/test/CodeGen/atomic_ops.c` didn't
specify a target triple, which meant that on RISC-V Linux hosts it would
target RISC-V, but because it used clang cc1 we didn't get the toolchain
driver functionality to automatically turn on the extensions implied by
the target triple (riscv64-linux includes atomics). This would cause the
test to fail on RISC-V hosts.

This patch changes the test to have RUN lines for two explicit targets,
one with native atomics and one without. To work around FileCheck
limitations and more accurately match the output, some tests now have
separate prefixes for the two cases.

Reviewers: jyknight, eli.friedman, lenary, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D74847
2020-02-21 19:29:57 +00:00
Djordje Todorovic 2f215cf36a Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGfaff707db82d.
A failure found on an ARM 2-stage buildbot.
The investigation is needed.
2020-02-20 14:41:39 +01:00
Roman Lebedev 9ea5d17cc9
[Sema] Demote call-site-based 'alignment is a power of two' check for AllocAlignAttr into a warning
Summary:
As @rsmith notes in https://reviews.llvm.org/D73020#inline-672219
while that is certainly UB land, it may not be actually reachable at runtime, e.g.:
```
template<int N> void *make() {
  if ((N & (N-1)) == 0)
    return operator new(N, std::align_val_t(N));
  else
    return operator new(N);
}
void *p = make<7>();
```
and we shouldn't really error-out there.

That being said, i'm not really following the logic here.
Which ones of these cases should remain being an error?

Reviewers: rsmith, erichkeane

Reviewed By: erichkeane

Subscribers: cfe-commits, rsmith

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73996
2020-02-20 16:39:26 +03:00
Mikhail Maltsev f4fd7dbf85 [ARM,MVE] Add vqdmull[b,t]q intrinsic families
Summary:
This patch adds two families of ACLE intrinsics: vqdmullbq and
vqdmulltq (including vector-vector and vector-scalar variants) and the
corresponding LLVM IR intrinsics llvm.arm.mve.vqdmull and
llvm.arm.mve.vqdmull.predicated.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74845
2020-02-20 10:51:19 +00:00
Reid Kleckner 0edb212925 [MS] Mark vectorcall FP and vector args inreg
This has no effect on how LLVM passes the arguments, but it prevents
rewriteWithInAlloca from thinking that these parameters should be part
of the inalloca pack.

Follow-up to D72114

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D74452
2020-02-19 16:37:50 -08:00
Krzysztof Parzyszek b1d47467e2 [Hexagon] Change HVX vector predicate types from v512/1024i1 to v64/128i1
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.

It may cause existing bitcode files to become invalid.

* Converting between vector predicates and vector registers must be
  done explicitly via vandvrt/vandqrt instructions (their intrinsics),
  i.e. (for 64-byte mode):
    %Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
    %V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)

  The conversion intrinsics are:
    declare  <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
    declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
    declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
    declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
  They are all pure.

* Vector predicate values cannot be loaded/stored directly. This directly
  reflects the architecture restriction. Loading and storing or vector
  predicates must be done indirectly via vector registers and explicit
  conversions via vandvrt/vandqrt instructions.
2020-02-19 14:14:56 -06:00
Mikhail Maltsev 461fd94f00 [ARM,MVE] Fix predicate types of some intrinsics
Summary:
Some predicated MVE intrinsics return a vector with element size
different from the input vector element size. In this case the
predicate must type correspond to the output vector type.

The following intrinsics use the incorrect predicate type:
* llvm.arm.mve.mull.int.predicated
* llvm.arm.mve.mull.poly.predicated
* llvm.arm.mve.vshll.imm.predicated

This patch fixes the issue.

Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74838
2020-02-19 16:24:54 +00:00
Sander de Smalen 49b307e96d [AArch64][SVE] CodeGen of ACLE Builtin Types
Summary:
This patch adds codegen support for the ACLE builtin types added in:

  https://reviews.llvm.org/D62960

so that the ACLE builtin types are emitted as corresponding scalable
vector types in LLVM.

Reviewers: rsandifo-arm, rovka, rjmccall, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74724
2020-02-19 12:10:47 +00:00
Oliver Stannard 78654e8511 Revert "Reland D74436 "Change clang option -ffp-model=precise to select ffp-contract=on"""
Reverting because this patch is causing ~20 llvm-test-suite failures on
a number of different bots:
* http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3366
* http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/8222
* http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13275
* http://lab.llvm.org:8011/builders/clang-s390x-linux-lnt/builds/17213

This reverts commit cd2c5af6df.
2020-02-19 12:03:27 +00:00
Djordje Todorovic faff707db8 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-02-19 11:12:26 +01:00
David Tenty 58817a0783 [clang][XCOFF] Indicate that XCOFF does not support COMDATs
Summary: XCOFF doesn't support COMDATs, so clang shouldn't emit them.

Reviewers: stevewan, sfertile, Xiangling_L

Reviewed By: sfertile

Subscribers: dschuff, aheejin, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74631
2020-02-18 16:10:11 -05:00
Mikhail Maltsev 63809d365e [ARM,MVE] Add vbrsrq intrinsics family
Summary:
This patch adds a new MVE intrinsics family, `vbrsrq`: vector bit
reverse and shift right. The intrinsics are compiled into the VBRSR
instruction. Two new LLVM IR intrinsics were also added: arm.mve.vbrsr
and arm.mve.vbrsr.predicated.

Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM

Reviewed By: simon_tatham

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74721
2020-02-18 17:31:21 +00:00