llvm-project

Commit Graph

Author	SHA1	Message	Date
Hao Liu	6e73761dc8	[AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg. llvm-svn: 201061	2014-02-10 03:16:22 +00:00
Tim Northover	fdbdb4b6d5	ARM & AArch64: merge NEON absolute compare intrinsics There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768	2014-02-04 14:55:42 +00:00
Tim Northover	24979d8e10	AArch64 & ARM: refactor crypto intrinsics to take scalars Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706	2014-02-03 17:27:49 +00:00
Chad Rosier	fe5ab2f5ba	[AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types. llvm-svn: 200491	2014-01-30 21:46:54 +00:00
Kevin Qin	92d64d2d56	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
Kevin Qin	4a183d7094	[AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. Replace r199791. llvm-svn: 200180	2014-01-27 02:53:54 +00:00
Kevin Qin	9eeedfbaa6	Revert r199791. It's an old version which has some bugs. I'll commit the latest patch soon. llvm-svn: 200179	2014-01-27 02:53:41 +00:00
Jiangning Liu	fb3c17b6c9	Improve pattern match from v1i8 to v1i32 for AArch64 Neon. llvm-svn: 200119	2014-01-26 04:55:53 +00:00
Jiangning Liu	6398d839c6	Implement pattern match from v1xx to v1xx for AArch64 Neon. llvm-svn: 200113	2014-01-26 03:27:40 +00:00
Kevin Qin	18662f4b7c	[AArch64 NEON] Add patterns for concat_vector on v2i32. llvm-svn: 200111	2014-01-26 02:46:15 +00:00
Kevin Qin	a4068c4243	[AArch64 NEON] Add test case for vector FP_ROUND. llvm-svn: 200110	2014-01-26 02:23:33 +00:00
Ana Pazos	cd3b9f763e	[AArch64] Removed unused i8 type from FPR8 register class. The i8 type is not registered with any register class. This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost. The code selects the first type associated with register class FPR8, which happens to be i8. It uses this type (i8) to get the representative class pointer, which is 0. It then uses this pointer to access a field, resulting in segmentation fault. Since i8 type is not being used for printing any neon instruction we can safely remove it. llvm-svn: 200046	2014-01-24 22:36:53 +00:00
Kevin Qin	21cd2152d3	[AArch64 NEON] Fix a bug in implementing register copy between FPR16. llvm-svn: 199978	2014-01-24 07:53:04 +00:00
Ana Pazos	5d31f6945b	[AArch64] Added vselect patterns with float and double types llvm-svn: 199925	2014-01-23 19:18:57 +00:00
Hao Liu	b920682e4a	[AArch64]Add CHECK for two test cases testing scalar_to_vector committed in r199461. llvm-svn: 199861	2014-01-23 02:09:30 +00:00
Kevin Qin	ce0190c6d5	[AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. llvm-svn: 199791	2014-01-22 06:11:03 +00:00
Kevin Qin	6d379abd8f	[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. It was committed as r199628 but reverted in r199631 because it caused regression test failures. The cause was an old version of the patch that I committed. Sorry for the mistake. llvm-svn: 199704	2014-01-21 01:48:52 +00:00
Chandler Carruth	f835fc6f4f	Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT." This test fails the newly added regression tests. llvm-svn: 199631	2014-01-20 08:18:01 +00:00
Kevin Qin	ff42e06ef4	[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. llvm-svn: 199628	2014-01-20 07:32:26 +00:00
Kevin Qin	e0faea11b1	[AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations. llvm-svn: 199485	2014-01-17 09:54:30 +00:00
Hao Liu	17457a2ee2	[AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16. Also add copy support for FPR16. Also add a missing test case file belongs to commit r197361. llvm-svn: 199463	2014-01-17 06:23:30 +00:00
Kevin Qin	212d9b4a56	[AArch64 NEON] Custom lower conversion between vector integer and vector floating point if element bit-width doesn't match. llvm-svn: 199462	2014-01-17 05:52:35 +00:00
Hao Liu	18d92262c5	[AArch64]Fix the problem can't select concat_vectors of two v1i32 types. Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32. llvm-svn: 199461	2014-01-17 05:44:46 +00:00
Jiangning Liu	0a791c348b	For AArch64, lowering sext_inreg and generate optimized code by using SXTL. llvm-svn: 199296	2014-01-15 05:08:01 +00:00
Tim Northover	6e219cd588	AArch64: don't try to handle [SU]MUL_LOHI nodes We should set them to expand for now since there are no patterns dealing with them. Actually, there are no instructions either so I doubt they'll ever be acceptable. llvm-svn: 199265	2014-01-14 22:53:22 +00:00
Rafael Espindola	08ff298d51	Revert "[AArch64] Added vselect patterns with float and double types" This reverts commit r199242. It is causing CodeGen/AArch64/neon-bsl.ll to fail. llvm-svn: 199248	2014-01-14 19:24:08 +00:00
Ana Pazos	787f540daa	[AArch64] Added vselect patterns with float and double types llvm-svn: 199242	2014-01-14 18:45:48 +00:00
Andrea Di Biagio	9bc0415c1f	[AArch64] Fix assertion failure caused by an invalid comparison between APInt values. APInt only knows how to compare values with the same BitWidth and asserts in all other cases. With this fix, function PerformORCombine does not use the APInt equality operator if the APInt values returned by 'isConstantSplat' differ in BitWidth. In that case they are different and no comparison is needed. llvm-svn: 199119	2014-01-13 16:51:00 +00:00
Kevin Qin	cfef55d6d4	[AArch64 NEON] Add missing patterns for bitcast from or to v1f64 llvm-svn: 199070	2014-01-13 01:58:38 +00:00
Kevin Qin	21e8f1c4eb	[AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector This patch covered 2 more scenarios: 1. Two operands of shuffle_vector are the same, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> 2. One of operands is undef, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> After this patch, perm instructions will have chance to be emitted instead of lots of INS. llvm-svn: 199069	2014-01-13 01:56:29 +00:00
Nico Rieck	b5262d6d8f	Fix non-deterministic SDNodeOrder-dependent codegen Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050	2014-01-12 14:09:17 +00:00
Kristof Beyls	58306ad903	Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux. llvm-svn: 198937	2014-01-10 13:41:49 +00:00
Andrea Di Biagio	23df4e4a2d	Teach the DAGCombiner how to fold 'vselect' dag nodes according to the following two rules: 1) fold (vselect (build_vector AllOnes), A, B) -> A 2) fold (vselect (build_vector AllZeros), A, B) -> B llvm-svn: 198777	2014-01-08 18:33:04 +00:00
Kevin Qin	44946439e1	[AArch64 NEON] Fix generating incorrect value type of NEON_VDUPLANE when lower build_vector if result value type mismatch with operand value type. llvm-svn: 198743	2014-01-08 08:06:14 +00:00
Hao Liu	7d11d99d20	[AArch64]Add support to spill/fill D tuples such as DPair/DTriple/DQuad. There is no test cases for D tuple as the original test cases are too large. As the spill/fill of the D tuple is similar to the Q tuple, the correctness can be guaranteed. llvm-svn: 198684	2014-01-07 10:50:43 +00:00
Hao Liu	27d88376bc	[AArch64]Add support to copy D tuples such as DPair/DTriple/DQuad and Q tuples such as QPair/QTriple/QQuad. There is no test case for D tuple as the original test cases are too large. As the copy of the D tuple is similar to the Q tuple, the correctness can be guaranteed. llvm-svn: 198682	2014-01-07 10:00:03 +00:00
Kevin Qin	cfa41a2569	[AArch64 NEON] Fixed incorrect immediate used in BIC instruction. llvm-svn: 198675	2014-01-07 05:10:47 +00:00
Kevin Qin	5cd73c9e0a	[AArch64 NEON] Fix invalid constant used in vselect condition. There is a wrong assumption that the vector element type and the type of each ConstantSDNode in the build_vector were the same. However, when promoting the integer operand of a legally typed build_vector, the operand type and the vector element type do not need to be the same (See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in LegalizeIntegerTypes.cpp). In the AArch64 backend, the following dag sequence: C0: i1 = Constant<0> C1: i1 = Constant<-1> V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0 is type-legalized into: NewC0: i32 = Constant<0> NewC1: i32 = Constant<1> V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0 Forcing a getZeroExtend to VTBits to ensure that the new constant is correct. llvm-svn: 198582	2014-01-06 02:26:10 +00:00
Jiangning Liu	a0acf70af1	For AArch64 Neon, simplify scalar dup by lane0 for fp. llvm-svn: 198194	2013-12-30 02:44:35 +00:00
Hao Liu	fe3bfc8c41	[AArch64]Add code to spill/fill Q register tuples such as QPair/QTriple/QQuad. llvm-svn: 198193	2013-12-30 02:38:12 +00:00
Hao Liu	b591f835d6	[AArch64]Can't select shift left 0 of type v1i64 llvm-svn: 198192	2013-12-30 02:12:46 +00:00
Kevin Qin	ede9ce1933	Fix a bug in DAGCombiner about zero-extend after setcc. For the AArch64 backend, if DAGCombiner sees "sext(setcc)", it combines them into a single setcc with an extended value type. Then, if it sees "zext(setcc)", it assumes the setcc is Vxi1 and tries to create "(and (vsetcc), (1, 1, ...)". When the setcc isn't Vxi1, DAGCombiner creates the wrong node and wrong code is emitted. llvm-svn: 198190	2013-12-30 02:05:13 +00:00
Hao Liu	74107fe526	[AArch64]Fix the problem that can't select mul of v1i64/v2i64 types. E.g. Can't select such IR: %tmp = mul <2 x i64> %a, %b llvm-svn: 198188	2013-12-30 01:38:41 +00:00
Hao Liu	83799741fb	[AArch64]Fix a problem that the register order of fmls/fmla by element is incorrect. E.g. the codegen result is fmls v1.2s, v0.2s, v2.s[3] which is expected to be fmls v0.2s, v1.2s, v2.s[3] llvm-svn: 198001	2013-12-25 07:12:34 +00:00
Jiangning Liu	dd1afd5338	Add missing pattern matches to support ACLE intrinsics of AArch64 NEON. llvm-svn: 197993	2013-12-25 01:22:51 +00:00
Hao Liu	ce7a12be8f	[AArch64]Add patterns to match normal shift nodes: shl, sra and srl. llvm-svn: 197969	2013-12-24 09:00:21 +00:00
Kevin Qin	82bd84aadf	[AArch64 NEON] Fix a bug when lowering BUILD_VECTOR. DAG.getVectorShuffle() doesn't always return a vector_shuffle node. If the mask is the exact sequence of its operand (for example, operand_0 is v8i8 and the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly return that operand. So a check is added here. llvm-svn: 197967	2013-12-24 08:16:06 +00:00
Kevin Qin	cd5f3153f5	[AArch64 NEON] Fix a pattern match failure with NEON_VDUP. This failure caused by improper condition when lowering shuffle_vector to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not be generated. llvm-svn: 197966	2013-12-24 08:11:47 +00:00
Ana Pazos	bc2996b30f	[AArch64] Check fmul node single use in fused multiply patterns Check for single use of fmul node in fused multiply patterns to allow generation of fused multiply add/sub instructions. Otherwise fmul operation ends up being repeated more than once which does not help performance on targets with only one MAC unit, such as cortex-a53. llvm-svn: 197929	2013-12-24 00:47:29 +00:00
Ana Pazos	3ca23915cd	[AArch64 NEON] Fixed fused multiply negate add/sub patterns The correct pattern matching should be: - fnmadd is (-Ra) + (-Rn)*Rm which should be matched as: fma (fneg node:$Rn), node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm)))) - fnmsub is (-Ra) + Rn*Rm which should be matched as fma node:$Rn, node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra)) llvm-svn: 197928	2013-12-24 00:40:10 +00:00
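
The fnmadd/fnmsub identities quoted in the last entry (r197928) can be sanity-checked numerically. The following is a minimal sketch in Python, not LLVM code: it assumes plain unfused double arithmetic stands in for the FPR32 operations, and every function name is hypothetical. It only checks that the "fma with negated operands" reading and the "fsub of fmul" reading of each instruction agree on a few exactly representable inputs.

```python
from itertools import product


def fnmadd_def(ra, rn, rm):
    # Definition from the commit message: (-Ra) + (-Rn)*Rm
    return (-ra) + (-rn) * rm


def fnmadd_as_fsub(ra, rn, rm):
    # Alternative pattern: fsub (fneg Ra), (fmul Rn, Rm)
    return (-ra) - (rn * rm)


def fnmsub_def(ra, rn, rm):
    # Definition from the commit message: (-Ra) + Rn*Rm
    return (-ra) + rn * rm


def fnmsub_as_fsub(ra, rn, rm):
    # Alternative pattern: fsub (fmul Rn, Rm), Ra
    return (rn * rm) - ra


def forms_agree(values):
    # Compare both readings of each instruction over all operand triples.
    return all(
        fnmadd_def(ra, rn, rm) == fnmadd_as_fsub(ra, rn, rm)
        and fnmsub_def(ra, rn, rm) == fnmsub_as_fsub(ra, rn, rm)
        for ra, rn, rm in product(values, repeat=3)
    )


# Hypothetical sample operands, chosen to be exactly representable.
SAMPLES = [-2.5, -1.0, 0.0, 0.5, 3.0]
print(forms_agree(SAMPLES))
```

A real fused multiply-add rounds once rather than twice, so this sketch says nothing about rounding behaviour; it only illustrates the algebra behind the two pattern forms.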


185 Commits