Commit Graph

127 Commits

Author SHA1 Message Date
Chad Rosier b7747e31ef [AArch64] Add SchedRW lists to NEON instructions.
Previously, only regular AArch64 instructions were annotated with SchedRW lists.
This patch does the same for NEON enabling these instructions to be scheduled by
the MIScheduler. Additionally, store operations are now modeled and a few
SchedRW lists were updated for bug fixes (e.g. multiple def operands).

Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!

llvm-svn: 204505
2014-03-21 19:34:41 +00:00
Tim Northover fad2761ca0 InstCombine: form shuffles from wider range of insert/extractelements
Sequences of insertelement/extractelements are sometimes used to build
vectorsr; this code tries to put them back together into shuffles, but
could only produce a completely uniform shuffle types (<N x T> from two
<N x T> sources).

This should allow shuffles with different numbers of elements on the
input and output sides as well.

llvm-svn: 203229
2014-03-07 10:24:44 +00:00
Hao Liu 7b6dfcf06a [AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 types.
As this problems are similar to shl/sra/srl, also add patterns for shift nodes.

llvm-svn: 201298
2014-02-13 05:42:33 +00:00
Jim Grosbach e9008de652 X86: Resolve a long standing FIXME and properly isel pextr[bw].
Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957
2014-02-07 00:16:33 +00:00
Tim Northover fdbdb4b6d5 ARM & AArch64: merge NEON absolute compare intrinsics
There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768
2014-02-04 14:55:42 +00:00
Tim Northover 24979d8e10 AArch64 & ARM: refactor crypto intrinsics to take scalars
Some of the SHA instructions take a scalar i32 as one argument (largely because
they work on 160-bit hash fragments). This wasn't reflected in the IR
previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x
i32> respectively) which was ugly.

This makes all the affected intrinsics take a uniform "i32", allowing them to
become non-polymorphic at the same time.

llvm-svn: 200706
2014-02-03 17:27:49 +00:00
Chad Rosier fe5ab2f5ba [AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types.
llvm-svn: 200491
2014-01-30 21:46:54 +00:00
Kevin Qin 92d64d2d56 [AArch64 NEON] Lower SELECT_CC with vector operand.
When the scalar compare is between floating point and operands are
vector, we custom lower SELECT_CC to use NEON SIMD compare for
generating less instructions.

llvm-svn: 200365
2014-01-29 01:57:30 +00:00
Jiangning Liu fb3c17b6c9 Improve pattern match from v1i8 to v1i32 for AArch64 Neon.
llvm-svn: 200119
2014-01-26 04:55:53 +00:00
Jiangning Liu 6398d839c6 Implement pattern match from v1xx to v1xx for AArch64 Neon.
llvm-svn: 200113
2014-01-26 03:27:40 +00:00
Kevin Qin 18662f4b7c [AArch64 NEON] Add patterns for concat_vector on v2i32.
llvm-svn: 200111
2014-01-26 02:46:15 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Ana Pazos 5d31f6945b [AArch64] Added vselect patterns with float and double types
llvm-svn: 199925
2014-01-23 19:18:57 +00:00
Kevin Qin ef66ff78ca [AArch64 NEON] Accept both #0.0 and #0 for comparing with floating point zero in asm parser.
For FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT, floating point zero will be
printed as #0.0 instead of #0. To support the history codes using #0,
we consider to let asm parser accept both #0.0 and #0.

llvm-svn: 199621
2014-01-20 02:14:05 +00:00
Hao Liu 17457a2ee2 [AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16.
Also add copy support for FPR16.
Also add a missing test case file belongs to commit r197361.

llvm-svn: 199463
2014-01-17 06:23:30 +00:00
Hao Liu 18d92262c5 [AArch64]Fix the problem can't select concat_vectors of two v1i32 types.
Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32.

llvm-svn: 199461
2014-01-17 05:44:46 +00:00
Jiangning Liu 0a791c348b For AArch64, lowering sext_inreg and generate optimized code by using SXTL.
llvm-svn: 199296
2014-01-15 05:08:01 +00:00
Rafael Espindola 08ff298d51 Revert "[AArch64] Added vselect patterns with float and double types"
This reverts commit r199242.

It is causing CodeGen/AArch64/neon-bsl.ll to fail.

llvm-svn: 199248
2014-01-14 19:24:08 +00:00
Ana Pazos 787f540daa [AArch64] Added vselect patterns with float and double types
llvm-svn: 199242
2014-01-14 18:45:48 +00:00
Kevin Qin cfef55d6d4 [AArch64 NEON] Add missing patterns for bitcast from or to v1f64
llvm-svn: 199070
2014-01-13 01:58:38 +00:00
Ana Pazos cfd2ca5826 [AArch64][NEON] Added UXTL and UXTL2 instruction aliases
llvm-svn: 198791
2014-01-08 21:02:13 +00:00
Kevin Qin cfa41a2569 [AArch64 NEON] Fixed incorrect immediate used in BIC instruction.
llvm-svn: 198675
2014-01-07 05:10:47 +00:00
Ana Pazos e891c5f264 [AArch64][NEON] Added SXTL and SXTL2 instruction aliases
llvm-svn: 198437
2014-01-03 19:20:31 +00:00
Jiangning Liu a0acf70af1 For AArch64 Neon, simplify scalar dup by lane0 for fp.
llvm-svn: 198194
2013-12-30 02:44:35 +00:00
Hao Liu b591f835d6 [AArch64]Can't select shift left 0 of type v1i64
llvm-svn: 198192
2013-12-30 02:12:46 +00:00
Hao Liu 83799741fb [AArch64]Fix a problem that the register order of fmls/fmla by element is incorrect.
E.g. the codegen result is 
     fmls v1.2s, v0.2s, v2.s[3]
which is expected to be
     fmls v0.2s, v1.2s, v2.s[3]

llvm-svn: 198001
2013-12-25 07:12:34 +00:00
Hao Liu ce7a12be8f [AArch64]Add patterns to match normal shift nodes: shl, sra and srl.
llvm-svn: 197969
2013-12-24 09:00:21 +00:00
Kevin Qin cd5f3153f5 [AArch64 NEON] Fix a pattern match failure with NEON_VDUP.
This failure caused by improper condition when lowering shuffle_vector
to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not
be generated.

llvm-svn: 197966
2013-12-24 08:11:47 +00:00
Ana Pazos bc2996b30f [AArch64] Check fmul node single use in fused multiply patterns
Check for single use of fmul node in fused multiply patterns
to allow generation of fused multiply add/sub instructions.
Otherwise fmul operation ends up being repeated more than
once which does not help peformance on targets with
only one MAC unit, as for example cortex-a53.

llvm-svn: 197929
2013-12-24 00:47:29 +00:00
Kevin Qin 53eaea0104 [AArch64 NEON]Implment loading vector constant form constant pool.
llvm-svn: 197551
2013-12-18 06:26:04 +00:00
Chad Rosier 5f87edb484 [AArch64] Fix v1fx patterns for Floating-point Multiply Extend and Floating-point Compare to Zero.
llvm-svn: 197402
2013-12-16 18:29:35 +00:00
Hao Liu 774cabb538 [AArch64]Fix the pattern match failure for v1i8/v1i16/v1i32 types.
Currently we have such types as legal vector types. The DAG combiner may generate some DAG nodes having such types but we don't have patterns to match them.
E.g. a load i32 and a bitcast i32 to v1i32 will be combined into a load v1i32:
     bitcast (load i32) to v1i32 -> load v1i32.
So this patch fixes such problems for load/dup instructions.
If v1i8/v1i16/v1i32 are not legal any more, the code in this patch can be deleted. So I also add some FIXME.

llvm-svn: 197361
2013-12-16 02:51:28 +00:00
Chad Rosier e139dd4fe6 [AArch64] Simplify the Neon Scalar3Same patterns for floating-point reciprocal
step, floating-point reciprocal square root step, floating-point absolute
difference, and integer/floating-point compare instructions.  Also, move the
scalar general arithmetic operation patterns closer to similar code.  No
functional change intended.

llvm-svn: 197250
2013-12-13 17:56:44 +00:00
Chad Rosier 4055f42d22 [AArch64] Removed unnecessary copy patterns with v1fx types.
- Copy patterns with float/double types are enough.
- Fix typos in test case names that were using v1fx.
- There is no ACLE intrinsic that uses v1f32 type.  And there is no conflict of
  neon and non-neon ovelapped operations with this type, so there is no need to
  support operations with this type.
- Remove v1f32 from FPR32 register and disallow v1f32 as a legal type for
  operations.

Patch by Ana Pazos!

llvm-svn: 197159
2013-12-12 15:46:29 +00:00
Hao Liu 46a10eec28 [AArch64]Fix the problem that AArch64 backend fails to select scalar_to_vector of vector types having more than one element.
llvm-svn: 197135
2013-12-12 07:36:26 +00:00
Chad Rosier 446d8ea0fb [AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64
intrinsics to use f32 types, rather than their vector equivalents.

llvm-svn: 197090
2013-12-11 23:21:25 +00:00
Chad Rosier 088f93d4b5 [AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197068
2013-12-11 21:03:46 +00:00
Chad Rosier 473a01e1c9 [AArch64] Refactor the NEON scalar floating-point reciprocal step and
floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197067
2013-12-11 21:03:43 +00:00
Chad Rosier 7098fcc062 [AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating-
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.

llvm-svn: 197066
2013-12-11 21:03:40 +00:00
Kevin Qin 310b6c08ba [AArch64 NEON] Get instruction BSL matched to VSELECT.
llvm-svn: 196998
2013-12-11 02:33:50 +00:00
Chad Rosier f70af21651 [AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64
intrinsic to use f32/f64 types, rather than their vector equivalents.

llvm-svn: 196965
2013-12-10 21:33:59 +00:00
Chad Rosier 07cc3f9100 [AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.

llvm-svn: 196964
2013-12-10 21:33:56 +00:00
Chad Rosier 98b8baa35c [AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point
and fixed-point convert to floating-point LLVM AArch64 intrinsics.

llvm-svn: 196963
2013-12-10 21:33:53 +00:00
Chad Rosier cc34d187b8 [AArch64] Overload NEON signed/unsigned integer convert to floating-point
LLVM AArch64 intrinsics.

llvm-svn: 196962
2013-12-10 21:33:50 +00:00
Chad Rosier 7a9bba442f [AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so
that they use float/double rather than the vector equivalents when appropriate.

llvm-svn: 196930
2013-12-10 16:11:39 +00:00
Chad Rosier fcc4c366d1 [AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
Specifically, reuse the ARM intrinsics when possible.

llvm-svn: 196926
2013-12-10 15:35:33 +00:00
Kevin Qin 43385c7065 [AArch64 NEON] Replace fpimm with fpz32 for floating compare with zero.
This is a small change to be strict. Just want get pattern safer.

llvm-svn: 196889
2013-12-10 06:51:07 +00:00
Kevin Qin 04396d1e69 [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
llvm-svn: 196887
2013-12-10 06:48:35 +00:00
Chad Rosier 5c8bf9c3db [AArch64] Refactor the NEON scalar reduce pairwise intrinsics, so that they use
float/double rather than the vector equivalents when appropriate.

llvm-svn: 196833
2013-12-09 22:47:38 +00:00
Chad Rosier 3b0b3ee71e [AArch64] Refactor NEON scalar reduce pairwise front-end codegen to remove
unnecessary patterns in tablegen.

llvm-svn: 196832
2013-12-09 22:47:34 +00:00