Commit Graph

35 Commits

Author SHA1 Message Date
Mikhail Maltsev e04ab4fe97 [CodeGen][ARM] Coerce FP16 vectors to integer vectors when needed
Summary:
On targets that do not support FP16 natively LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:

  void foo(int, ...);

  typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
  void bar(float16x4_t x) {
    foo(42, x);
  }

According to the AAPCS (appendix A.2) float16x4_t is a containerized
vector fundamental type, so 'foo' expects that the 4 16-bit FP values
are packed into 2 32-bit registers, but instead bar promotes them to
4 single precision values.

Since we already handle scalar FP16 values in the frontend by
bitcasting them to/from integers, this patch adds similar handling for
vector types and homogeneous FP16 vector aggregates.

One existing test required some adjustments because we now generate
more bitcasts (so the patch changes the test to target a machine with
native FP16 support).

Reviewers: eli.friedman, olista01, SjoerdMeijer, javed.absar, efriedma

Reviewed By: javed.absar, efriedma

Subscribers: efriedma, kristof.beyls, cfe-commits, chrib

Differential Revision: https://reviews.llvm.org/D50507

llvm-svn: 342034
2018-09-12 09:19:19 +00:00
Ivan A. Kosarev a9f484ac4a [NEON] Support vldNq intrinsics in AArch32 (Clang part)
This patch reworks the support for dup NEON intrinsics as
described in D48439.

Differential Revision: https://reviews.llvm.org/D48440

llvm-svn: 335734
2018-06-27 13:58:43 +00:00
Ivan A. Kosarev 1243ebdcdb Revert r330195 "[NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only".
Differential Revision: https://reviews.llvm.org/D45668

llvm-svn: 330248
2018-04-18 12:02:49 +00:00
Ivan A. Kosarev b3b87c3314 [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only
Differential Revision: https://reviews.llvm.org/D45668

llvm-svn: 330195
2018-04-17 16:43:07 +00:00
Akira Hatanaka 673af7a688 Generalize NRVO to cover C structs.
This commit generalizes NRVO to cover C structs (both trivial and
non-trivial structs).

rdar://problem/33599681

Differential Revision: https://reviews.llvm.org/D44968

llvm-svn: 328809
2018-03-29 17:56:24 +00:00
Sjoerd Meijer 95da875898 This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic"
This is causing problems in testing, and PR36683 was raised.
Reverting it until we have sorted out how to pass f16 vectors.

llvm-svn: 327437
2018-03-13 19:38:56 +00:00
Abderrazek Zaafrani 5bd68cf742 [ARM] Add ARMv8.2-A FP16 vector intrinsic
Add the fp16 neon vector intrinsic for ARM as described in the ARM ACLE document.

Reviews in https://reviews.llvm.org/D43650

llvm-svn: 327189
2018-03-09 23:39:34 +00:00
Daniel Neilson 6e938effaa Change memcpy/memove/memset to have dest and source alignment attributes (Step 1).
Summary:
  Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.

  The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.

 For example, code which used to read:
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.

llvm-svn: 322964
2018-01-19 17:12:54 +00:00
Abderrazek Zaafrani abb890b7be [AArch64] Enable fp16 data type for the Builtin for AArch64 only.
Differential Revision: https:://reviews.llvm.org/D41360

llvm-svn: 321301
2017-12-21 20:10:03 +00:00
Abderrazek Zaafrani f58a132eef [AARch64] Add ARMv8.2-A FP16 vector intrinsics
Putting back the code that was reverted few weeks ago.

Differential Revision: https://reviews.llvm.org/D34161

llvm-svn: 321294
2017-12-21 19:20:01 +00:00
Sjoerd Meijer 98ee78578b This reverts r305820 (ARMv.2-A FP16 vector intrinsics) because it shows
problems in testing, see comments in D34161 for some more details.
A fix is in progres in D35011, but a revert seems better now as the fix will
probably take some more time to land.

llvm-svn: 307277
2017-07-06 16:37:31 +00:00
Abderrazek Zaafrani f10ca93f34 [AArch64] ADD ARMv.2-A FP16 vector intrinsics
Differential Revision: https://reviews.llvm.org/D34161

llvm-svn: 305820
2017-06-20 18:54:57 +00:00
Alexander Kornienko 50e3e123e8 Enable the ARM Neon intrinsics test by default.
The test being marked 'REQUIRES: long-tests' doesn't make sense. It's not the
first time the test is broken without being noticed by the committer. If the
test is too long, it should be shortened, split in multiple ones or removed
altogether. Keeping it as is is actively harmful.
(BTW, on my machine `ninja check-clang` takes 90-92 seconds with and without
this test. The difference in times is below the spread caused by random
factors.)

llvm-svn: 304302
2017-05-31 14:35:50 +00:00
Alexander Kornienko 915e3ab8f6 Revert "[ARM] Update long-test after r304201."
This reverts commit 304208, since r304201 has been reverted as well.

The test needs to be turned on by default to detect breakages earlier. Will
commit this change separately.

llvm-svn: 304301
2017-05-31 14:33:29 +00:00
Benjamin Kramer 88d73626cd [ARM] Update long-test after r304201.
llvm-svn: 304208
2017-05-30 12:44:48 +00:00
Benjamin Kramer e524ddeef3 Unbreak long test after r304127.
llvm-svn: 304167
2017-05-29 18:11:11 +00:00
Daniel Jasper 6e254b5f38 Fix tests after speculatable intrinsics patch
These were relying on the attribute group numbering

llvm-svn: 302009
2017-05-03 10:04:25 +00:00
David Majnemer 3f5a4354db Update for LLVM changes
InstSimplify has gained the ability to remove needless bitcasts which
perturbed some clang codegen tests.

llvm-svn: 276756
2016-07-26 15:21:18 +00:00
Ahmed Bougacha 1d9de10130 [ARM NEON] Define vfms_f32 on ARM, and all vfms using vfma.
r259537 added vfma/vfms to armv7, but the builtin was only lowered
on the AArch64 side. Instead of supporting it on ARM, get rid of it.

The vfms builtin lowered to:
  %nb = fsub float -0.0, %b
  %r = @llvm.fma.f32(%a, %nb, %c)

Instead, define the operation in terms of vfma, and swap the
multiplicands. It now lowers to:
  %na = fsub float -0.0, %a
  %r = @llvm.fma.f32(%na, %b, %c)

This matches the instruction more closely, and lets current LLVM
generate the "natural" operand ordering:
  fmls.2s v0, v1, v2
instead of the crooked (but equivalent):
  fmls.2s v0, v2, v1
Except for theses changes, assembly is identical.

LLVM accepts both commutations, and the LLVM tests in:
  test/CodeGen/AArch64/arm64-fmadd.ll
  test/CodeGen/AArch64/fp-dp3.ll
  test/CodeGen/AArch64/neon-fma.ll
  test/CodeGen/ARM/fusedMAC.ll
already check either the new one only, or both.

Also verified against the test-suite unittests.

llvm-svn: 266807
2016-04-19 19:44:45 +00:00
Tim Northover e5dc94ee31 ARM: fix arm_neon_intrinsics.c and re-enable.
It turns out I'd never actually tested my recent change because it was
gated on long-tests. Failure ensued.

llvm-svn: 263093
2016-03-10 04:39:45 +00:00
Richard Trieu 9402d58e5b Disable failing test and fix RUN line.
See https://llvm.org/bugs/show_bug.cgi?id=26894 for details.  This change
fixes the incorrect flags to Clang and the piping issue.  It also disables
the FileCheck portion of the test, which is currently failing.

llvm-svn: 263091
2016-03-10 04:04:12 +00:00
Tim Northover 58672974a9 ARM & AArch64: convert asm tests to LLVM IR and restrict optimizations.
This is mostly a one-time autoconversion of tests that checked assembly after
"-Owhatever" compiles to only run "opt -mem2reg" and check the assembly. This
should make them much more stable to changes in LLVM so they won't break on
unrelated changes.

"opt -mem2reg" is a compromise designed to increase the readability of tests
that check dataflow, while minimizing dependency on LLVM. Hopefully mem2reg is
stable enough that no surpises will come along.

Should address http://llvm.org/PR26815.

llvm-svn: 263048
2016-03-09 18:54:42 +00:00
Luke Cheeseman 7f5571a129 This patch makes the NEON intrinsics vget_lane_f16, vgetq_lane_f16,
vset_lane_f16 and vsetq_lane_f16 available in AArch32.

Differential Revision: http://reviews.llvm.org/D10388

llvm-svn: 239610
2015-06-12 15:52:39 +00:00
Quentin Colombet bb9a858b25 [test/CodeGen/ARM] Update arm_neon_intrinsics test case to actually test the
lowering of the intrinsics.
Prior to this commit, most of the copy-related intrinsics could be optimized
away. The situation is still not ideal as there are several possibilities to
lower a given intrinsic. Currently, we match LLVM behavior.

llvm-svn: 216474
2014-08-26 18:43:31 +00:00
Quentin Colombet a1c34d3560 [test/CodeGen/ARM] Adpat test to match new codegen after r216274.
Moreover, rework some patterns to actually check the emitted instructions
instead of matching unrelated string!

E.g.,
some of the "// CHECK: vmov" were matching stuff like ".globl
funcname_with_vmov" instead of actual instructions.

llvm-svn: 216275
2014-08-22 18:08:37 +00:00
Quentin Colombet ffe5e5a42d [test/CodeGen/ARM] Adpat test to match new codegen after r216236.
llvm-svn: 216249
2014-08-22 00:27:52 +00:00
James Molloy b8fd41926c CHECK-LABEL'ify this test.
llvm-svn: 211687
2014-06-25 11:50:56 +00:00
James Molloy 7d64a0eec4 [AArch32] Fix a stupid error in an architectural guard
The < 8 instead of <= 8 meant that a bunch of vreinterprets were not available on v8 AArch32. Simplify the guard to just !defined(aarch64) while we're at it, and enable some v8 AArch32 testing.

llvm-svn: 211686
2014-06-25 11:46:24 +00:00
Tim Northover efe7a5e1c8 ARM NEON: fix tests after r202137
llvm-svn: 202143
2014-02-25 11:48:25 +00:00
Tim Northover 87da936164 ARM NEON: add _f16 support to a couple of vector-shuffling intrinsics.
llvm-svn: 202137
2014-02-25 11:13:42 +00:00
Amaury de la Vieuville 718ce62b3c Add support for poly16 vtst and vtstq
vtst and vtstq currently support poly8 types, but they should also work on
poly16.

llvm-svn: 190925
2013-09-18 08:33:53 +00:00
Jim Grosbach 362bf98ec6 ARM: Update testcases for improved codegen.
From llvm r189841.

llvm-svn: 189842
2013-09-03 20:08:30 +00:00
Michael Gottesman a7b73d4534 Revert "Revert r184787: "Added arm_neon intrinsic tests.""
This reverts commit r184817. The failure Chandler was seeing was most likely the
bug that Bob Wilson fixed in r184870 (which was a bug caught by these tests).

To be safe, I just checked again on x86-64 mac os x/linux that this test passed
(which it did).

llvm-svn: 185110
2013-06-27 21:52:01 +00:00
Chandler Carruth 3bab90a400 Revert r184787: "Added arm_neon intrinsic tests."
This test doesn't actually pass when run with llvm-lit for me or in
a bot that actually always tries to run it.

llvm-svn: 184817
2013-06-25 02:18:39 +00:00
Michael Gottesman a35103a8cf Added arm_neon intrinsic tests.
This is a large test and thus it will only run if you pass in --param
run_long_tests=trueto LIT. This is intended so that this test can run on
buildbots and not when one runs make check.

llvm-svn: 184787
2013-06-24 21:25:42 +00:00