Commit Graph

10 Commits

Author SHA1 Message Date
Sander de Smalen 51c2fa0e2a Improve reduction intrinsics by overloading result value.
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.

For example:

  ; The LLVM LangRef specifies that the scalar result must equal the
  ; vector element type, but this is not checked/enforced by LLVM.
  declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

This patch changes that into:

  declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)

Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.

Reviewers: RKSimon, arsenm, rnk, greened, aemerson

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D62996

llvm-svn: 363240
2019-06-13 09:37:38 +00:00
Amara Emerson c9916d7e97 Re-commit r302678, fixing PR33053.
The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions
which didn't have a lowering.

llvm-svn: 303211
2017-05-16 21:29:22 +00:00
Hans Wennborg bd6e9e77a7 Revert r302678 "[AArch64] Enable use of reduction intrinsics."
This caused PR33053.

Original commit message:

> The new experimental reduction intrinsics can now be used, so I'm enabling this
> for AArch64. We will need this for SVE anyway, so it makes sense to do this for
> NEON reductions as well.
>
> The existing code to match shufflevector patterns are replaced with a direct
> lowering of the reductions to AArch64-specific nodes. Tests updated with the
> new, simpler, representation.
>
> Differential Revision: https://reviews.llvm.org/D32247

llvm-svn: 303115
2017-05-15 20:59:32 +00:00
Amara Emerson 816542ceb3 [AArch64] Enable use of reduction intrinsics.
The new experimental reduction intrinsics can now be used, so I'm enabling this
for AArch64. We will need this for SVE anyway, so it makes sense to do this for
NEON reductions as well.

The existing code to match shufflevector patterns are replaced with a direct
lowering of the reductions to AArch64-specific nodes. Tests updated with the
new, simpler, representation.

Differential Revision: https://reviews.llvm.org/D32247

llvm-svn: 302678
2017-05-10 15:15:38 +00:00
Simon Pilgrim b87a21f1c3 [AARCH64] Fix linu triple typo
As promised in D22191

llvm-svn: 275976
2016-07-19 14:12:45 +00:00
Simon Pilgrim fc4d4b251d [AARCH64] Enable AARCH64 lit tests on windows dev machines
As discussed on PR27654, this patch fixes the triples of a lot of aarch64 tests and enables lit tests on windows

This will hopefully help stop cases where windows developers break the aarch64 target

Differential Revision: https://reviews.llvm.org/D22191

llvm-svn: 275973
2016-07-19 13:35:11 +00:00
Charlie Turner 434d4599d4 [AArch64] Implement vector splitting on UADDV.
Summary: Fixes PR25056.

Reviewers: mcrosier, junbuml, jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13466

llvm-svn: 250520
2015-10-16 15:38:25 +00:00
Jun Bum Lim 54f3ddfbe2 [AArch64]Fix bug in function names in test case
Functions in this test case need to be renamed as its names are the same
as the instructions we are comparing with.

llvm-svn: 250052
2015-10-12 15:34:52 +00:00
Jun Bum Lim 0aace13d18 Improve ISel across lane float min/max reduction
In vectorized float min/max reduction code, the final "reduce" step
is sub-optimal. In AArch64, this change wll combine :

  svn0 = vector_shuffle t0, undef<2,3,u,u>
  fmin = fminnum t0,svn0
  svn1 = vector_shuffle fmin, undef<1,u,u,u>
  cc = setcc fmin, svn1, ole
  n0 = extract_vector_elt cc, #0
  n1 = extract_vector_elt fmin, #0
  n2 = extract_vector_elt fmin, #1
  result = select n0, n1,n2
into :
  result = llvm.aarch64.neon.fminnmv t0

This change extends r247575.

llvm-svn: 249834
2015-10-09 14:11:25 +00:00
Jun Bum Lim 34b9bd0435 Improve ISel using across lane min/max reduction
In vectorized integer min/max reduction code, the final "reduce" step
is sub-optimal. In AArch64, this change wll combine :
  %svn0 = vector_shuffle %0, undef<2,3,u,u>
  %smax0 = smax %0, svn0
  %svn3 = vector_shuffle %smax0, undef<1,u,u,u>
  %sc = setcc %smax0, %svn3, gt
  %n0 = extract_vector_elt %sc, #0
  %n1 = extract_vector_elt %smax0, #0
  %n2 = extract_vector_elt $smax0, #1
  %result = select %n0, %n1, n2
becomes :
  %1 = smaxv %0
  %result = extract_vector_elt %1, 0

This change extends r246790.

llvm-svn: 247575
2015-09-14 16:19:52 +00:00