llvm-project

Commit Graph

Author	SHA1	Message	Date
Cullen Rhodes	1fe0e6a380	[AArch64][SME] Support ptrue(s) in streaming mode The ptrue and ptrues instructions are legal in streaming mode, missed in D106272. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D107807	2021-08-11 07:49:36 +00:00
Bradley Smith	81eafb8a37	[AArch64][SVE] Break false dependencies for inactive lanes of unary operations Differential Revision: https://reviews.llvm.org/D105889	2021-07-26 15:01:21 +00:00
Caroline Concatto	0bfc26e3a4	[SVE][AArch64] Improve code generation for vector_splice for Imm > 0 This patch implements vector_splice in tablegen for all cases when the Immediate is positive and lower than the known minimum value of a scalable vector. Vector_splice can be implemented using SVE instruction EXT. For instance : @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm) @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> EXT Vector_1, Vector_2, Imm // Vector_1 = B, C, D + Vector_2 = E Depends on D105633 Differential Revision: https://reviews.llvm.org/D106273	2021-07-26 11:45:46 +01:00
Eli Friedman	0ca46a1757	[SelectionDAG] Fix the representation of ISD::STEP_VECTOR. The existing rule about the operand type is strange. Instead, just say the operand is a TargetConstant with the right width. (Legalization ignores TargetConstants, so it doesn't matter if that width is legal.) Highlights: 1. I had to substantially rewrite the AArch64 isel patterns to expect a TargetConstant. Nothing too exotic, but maybe a little hairy. Maybe worth considering a target-specific node with some dagcombines instead of this complicated nest of isel patterns. 2. Our behavior on RV32 for vectors of i64 has changed slightly. In particular, we correctly preserve the width of the arithmetic through legalization. This changes the DAG a bit. Maybe room for improvement here. 3. I explicitly defined the behavior around overflow. This is necessary to make the DAGCombine transforms legal, and I don't think it causes any practical issues. Differential Revision: https://reviews.llvm.org/D105673	2021-07-21 10:58:40 -07:00
Eli Friedman	e41e865b15	[AArch64] Prepare for changes to STEP_VECTOR. Rewrite patterns to assume that the operand of STEP_VECTOR is a constant. The old patterns will stop working when the operand is changed from a Constant to a TargetConstant. (See D105673.) Add test coverage for certain patterns that weren't exercised by existing regression tests. Differential Revision: https://reviews.llvm.org/D105847	2021-07-17 14:13:41 -07:00
Bradley Smith	026bb84bcd	[AArch64][SVE] Add ISel patterns for floating point compare with zero instructions Additionally, lower the floating point compare SVE intrinsics to SETCC_MERGE_ZERO ISD nodes to avoid duplicating ISel patterns. Differential Revision: https://reviews.llvm.org/D105486	2021-07-08 10:46:12 +00:00
Paul Walker	287d39dd5a	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
Bradley Smith	e42ee2d509	[AArch64][SVE] Add support for using reverse forms of SVE2 shifts When using and ACLE intrinsic for an SVE2 shift, if the predicate passed has all relevant lanes active, then use a reversed version of the instruction if beneficial.	2021-06-04 12:56:53 +01:00
Bradley Smith	12a74137b3	[AArch64][SVE] Combine cntp intrinsics with add/sub to produce incp/decp Depends on D101062 Differential Revision: https://reviews.llvm.org/D102077	2021-05-14 17:16:06 +01:00
Bradley Smith	90ffcb1245	[AArch64][SVE] Add unpredicated vector BIC ISD node Addition of this node allows us to better utilize the different forms of the SVE BIC instructions, including using the alias to an AND (immediate). Differential Revision: https://reviews.llvm.org/D101831	2021-05-14 16:12:13 +01:00
Bradley Smith	65c89cd1a6	[AArch64][SVE] Better utilisation of unpredicated forms of remaining intrinsics When using predicated intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction, additionally this allows for better use of immediate forms. This only includes instructions where the unpredicated/predicated forms matched in such a way that instruction selection would not introduce extra ptrue instructions. This allows us to convert the intrinsics directly to architecture independent ISD nodes. Depends on D101062 Differential Revision: https://reviews.llvm.org/D101828	2021-05-10 13:06:02 +01:00
Bradley Smith	f8f953c2a6	[AArch64][SVE] Better utilisation of unpredicated forms of arithmetic intrinsics When using predicated arithmetic intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction, additionally this allows for better use of immediate forms. This also includes a new complex isel pattern which allows matching an all active predicate when the types are different but the predicate is a superset of the type being used. For example, to allow a b8 ptrue for a b32 predicate operand. This only includes instructions where the unpredicated/predicated forms are mismatched between variants, meaning that the removal of the predicate is done during instruction selection in order to prevent spurious re-introductions of ptrue instructions. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D101062	2021-05-10 13:05:37 +01:00
Jun Ma	b3aeb13892	[AArch64][SVE] Remove index_vector node. Since index_vector is lowered into step_vector in D100816, we can just remove index_vector, use step_vector for codegen directly. Differential Revision: https://reviews.llvm.org/D101593	2021-05-10 11:08:58 +08:00
Jun Ma	b310dd1501	[AArch64][SVE] Lower index_vector to step_vector As discussed in D100107, this patch first convert index_vector to step_vector, and convert step_vector back to index_vector after LegalizeDAG. Differential Revision: https://reviews.llvm.org/D100816	2021-04-30 19:04:39 +08:00
Bradley Smith	354604a2a7	[AArch64][SVE] Use SIMD variant of INSR when scalar is the result of a vector extract At the intrinsic layer the sve.insr operation takes a scalar. When this scalar is an integer we are forcing a data transition between GPRs and ZPRs that is potentially costly. Often the integer scalar is the result of a vector extract, when performing a reduction for example. In such cases we should keep all data within the ZPRs. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D101169	2021-04-29 12:17:42 +01:00
Caroline Concatto	ca9b7e2e2f	[AArch64][SVE] Fix crash with icmp+select This patch changes the lowering of SELECT_CC from Legal to Expand for scalable vector and adds support for scalable vectors in performSelectCombine. When selecting the nodes to lower in visitSELECT it checks if it is possible to use SELECT_CC in cases where SETCC is followed by SELECT. visistSELECT checks if SELECT_CC is legal or custom to replace SELECT by SELECT_CC. SELECT_CC used to be legal for scalable vector, so the node changes to SELECT_CC. This used to crash the compiler as there is no support for SELECT_CC with scalable vectors. So now the compiler lowers to VSELECT instead of SELECT_CC. Differential Revision: https://reviews.llvm.org/D100485	2021-04-21 14:16:27 +01:00
Jun Ma	5c6ac3b4a2	[AArch64][SVE] Combine add and index_vector This patch tries to combine pattern add(index_vector(zero, step), dup(X)) into index_vector(X, step) TestPlan: check-llvm Differential Revision: https://reviews.llvm.org/D100107	2021-04-20 11:38:37 +08:00
Bradley Smith	f2593a0bd1	[AArch64][SVE] Remove redundant PTEST of MATCH/NMATCH results Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D99584	2021-04-12 12:55:00 +01:00
Jun Ma	ab3c5fb282	[NFC][SVE] Use SVE_4_Op_Imm_Pat for sve_intx_dot_by_indexed_elem	2021-04-02 20:05:17 +08:00
Jun Ma	1af373c673	[AArch64][SVE] Codegen dup_lane for dup(vector_extract) Differential Revision: https://reviews.llvm.org/D99324	2021-03-30 10:35:08 +08:00
Bradley Smith	9745dce8c3	[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps This is currently performed in SelectionDAGLegalize, here we make it also happen in LegalizeVectorOps, allowing a target to lower the SETCC condition codes first in LegalizeVectorOps and then lower to a custom node afterwards, without having to duplicate all of the SETCC condition legalization in the target specific lowering. As a result of this, fixed length floating point SETCC nodes can now be properly lowered for SVE. Differential Revision: https://reviews.llvm.org/D98939	2021-03-29 15:32:25 +01:00
Bradley Smith	8bad8a43c3	[AArch64][SVE] Add patterns to generate FMLA/FMLS/FNMLA/FNMLS/FMAD Adjust generateFMAsInMachineCombiner to return false if SVE is present in order to combine fmul+fadd into fma. Also add new pseudo instructions so as to select the most appropriate of FMLA/FMAD depending on register allocation. Depends on D96599 Differential Revision: https://reviews.llvm.org/D96424	2021-02-18 16:55:16 +00:00
David Green	0f435a544a	[AArch64] Correct some tablegen operand types. NFC	2021-02-06 14:34:14 +00:00
David Sherwood	d1bf26fd94	[AArch64][SVE] Add lowering for llvm abs intrinsic Add functionality to permit lowering of the abs and neg intrinsics using the passthru variants. Differential Revision: https://reviews.llvm.org/D94160	2021-01-08 08:55:25 +00:00
Cameron McInally	f4013359b3	[SVE] Add unpacked scalable floating point ZIP/UZP/TRN patterns Differential Revision: https://reviews.llvm.org/D94193	2021-01-07 09:56:53 -06:00
Bradley Smith	c73ae747cb	[AArch64][SVE] Add optimization to remove redundant ptest instructions Co-Authored-by: Graham Hunter <graham.hunter@arm.com> Co-Authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D93292	2021-01-05 15:28:36 +00:00
Paul Walker	eba6deab22	[SVE] Lower vector CTLZ, CTPOP and CTTZ operations. CTLZ and CTPOP are lowered to CLZ and CNT instructions respectively. CTTZ is not a native SVE operation but is instead lowered to: CTTZ(V) => CTLZ(BITREVERSE(V)) In the case of fixed-length support using SVE we also lower CTTZ operating on NEON sized vectors because of its reliance on BITREVERSE which is also lowered to SVE intructions at these lengths. Differential Revision: https://reviews.llvm.org/D93607	2021-01-05 10:42:35 +00:00
Paul Walker	8eec7294fe	[SVE] Lower vector BITREVERSE and BSWAP operations. These operations are lowered to RBIT and REVB instructions respectively. In the case of fixed-length support using SVE we also lower BITREVERSE operating on NEON sized vectors as this results in fewer instructions. Differential Revision: https://reviews.llvm.org/D93606	2020-12-22 16:49:50 +00:00
Paul Walker	c0bc169cb1	[NFC][SVE] Clean up bfloat isel patterns that emit non-bfloat instructions. During isel there's no need to protect illegal types. Patch also adds a missing unit test for tbl2 intrinsic using bfloat types. Differential Revision: https://reviews.llvm.org/D93404	2020-12-18 13:20:41 +00:00
Paul Walker	632f4d2747	[NFC] Fix a few SVEInstrInfo related stylistic issues.	2020-12-15 16:10:38 +00:00
Kerry McLaughlin	c5ced82c8e	[SVE][CodeGen] Lower scalable floating-point vector reductions Changes in this patch: - Minor changes to the LowerVECREDUCE_SEQ_FADD function added by @cameron.mcinally to also work for scalable types - Added TableGen patterns for FP reductions with unpacked types (nxv2f16, nxv4f16 & nxv2f32) - Asserts added to expandFMINNUM_FMAXNUM & expandVecReduceSeq for scalable types Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D93050	2020-12-14 11:45:42 +00:00
Huihui Zhang	1e113c078a	[AArch64][SVE] Fix umin/umax lowering to handle out of range imm. Immediate must be in an integer range [0,255] for umin/umax instruction. Extend pattern matching helper SelectSVEArithImm() to take in value type bitwidth when checking immediate value is in range or not. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D89831	2020-10-23 09:42:56 -07:00
Paul C. Anagnostopoulos	2871c6c93f	[Aarch64] [TableGen] Clean up !if(!eq(boolean, 1) and related booleans. Differential Revision: https://reviews.llvm.org/D89551	2020-10-19 10:33:55 -04:00
Muhammad Asif Manzoor	aab6f7db47	[AArch64][SVE] Add lowering for llvm fabs Add the functionality to lower fabs for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D88679	2020-10-01 19:41:25 -04:00
Kerry McLaughlin	fcf70e1e3b	[SVE][CodeGen] Lower scalable fp_extend & fp_round operations This patch adds FP_EXTEND_MERGE_PASSTHRU & FP_ROUND_MERGE_PASSTHRU ISD nodes, used to lower scalable vector fp_extend/fp_round operations. fp_round has an additional argument, the 'trunc' flag, which is an integer of zero or one. This also fixes a warning introduced by the new tests added to sve-split-fcvt.ll, resulting from an implicit TypeSize -> uint64_t cast in SplitVecOp_FP_ROUND. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88321	2020-10-01 12:17:37 +01:00
Muhammad Asif Manzoor	3a76de4275	[AArch64][SVE] Add lowering for llvm frecpx Add the functionality to lower frecpx for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D88032	2020-09-23 15:23:54 -04:00
Kerry McLaughlin	d0149ba9b4	[SVE][CodeGen] Lower legal integer -> floating point conversions This patch adds new ISD nodes, SCVTZ_MERGE_PASSTHRU & UCVTZ_MERGE_PASSTHRU, which are used to lower both legal scalable vector [S\|U]INT_TO_FP operations and the following intrinsics: - llvm.aarch64.sve.scvtf - llvm.aarch64.sve.ucvtf Reviewed By: sdesmalen, efriedma Differential Revision: https://reviews.llvm.org/D87913	2020-09-23 11:53:53 +01:00
David Sherwood	96e52c1364	[SVE][CodeGen] Mark ptrue/pfalse instructions as rematerializable	2020-09-21 16:44:32 +01:00
Paul Walker	f3fa954b5b	[SVE] Change definition of reduction ISD nodes to have an SVE vector result type. The current nodes, AArch64::SMAXV_PRED for example, are defined to return a NEON vector result. This is incorrect because they modify the complete SVE register and are thus changed to represent such. This patch also adds nodes for UADDV_PRED and SADDV_PRED, which unifies the handling of all SVE reductions. NOTE: Floating-point reductions are already implemented correctly, so this patch is essentially making everything consistent with those. Differential Revision: https://reviews.llvm.org/D87843	2020-09-21 13:16:28 +01:00
Kerry McLaughlin	f7185b271f	[SVE][CodeGen] Lower floating point -> integer conversions This patch adds new ISD nodes, FCVTZS_MERGE_PASSTHRU & FCVTZU_MERGE_PASSTHRU, which are used to lower scalable vector FP_TO_SINT/FP_TO_UINT operations and the following intrinsics: - llvm.aarch64.sve.fcvtzu - llvm.aarch64.sve.fcvtzs Reviewed By: efriedma, paulwalker-arm Differential Revision: https://reviews.llvm.org/D87232	2020-09-17 14:04:22 +01:00
Muhammad Asif Manzoor	fd536eeed9	[AArch64][SVE] Add lowering for llvm fceil Add the functionality to lower fceil for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84548	2020-08-26 15:59:44 -04:00
Francesco Petrogalli	61dfa00957	[MC][SVE] Fix data operand for instruction alias of `st1d`. The version of `st1d` that operates with vector plus immediate addressing mode uses the alias `st1d { <Zn>.d }, <Pg>, [<Za>.d]` for rendering `st1d { <Zn>.d }, <Pg>, [<Za>.d, #0]`. The disassembler was generating `<Zn>.s` instead of `<Zn>.d>`. Differential Revision: https://reviews.llvm.org/D86633	2020-08-26 18:22:17 +00:00
Paul Walker	73ac3c0ede	[SVE] Lower scalable vector ISD::FNEG operations. Also updates isConstOrConstSplatFP to allow the mul(A,-1) -> neg(A) transformation when -1 is expressed as an ISD::SPLAT_VECTOR. Differential Revision: https://reviews.llvm.org/D86415	2020-08-25 11:22:28 +01:00
Paul Walker	0015b8db8e	[SVE] Add ISEL patterns for predicated shifts by an immediate. For scalable vector shifts the prediacte is typically all active, which gets selected to an unpredicated shift by immediate. When code generating for fixed length vectors the predicate is based on the vector length and so additional patterns are required to make use of SVE's predicated shift by immediate instructions. Differential Revision: https://reviews.llvm.org/D86204	2020-08-20 11:47:20 +01:00
Eli Friedman	be944c85f3	[AArch64][SVE] Add patterns for integer mla/mls. We probably want to introduce pseudo-instructions at some point, like we have for binary operations, but this seems okay for now. One thing I'm not sure about is whether we should be doing this as a DAGCombine instead of directly pattern-matching it. I don't see any big downside to doing it this way, though. Differential Revision: https://reviews.llvm.org/D85681	2020-08-18 12:51:16 -07:00
Paul Walker	9f63dc3265	[SVE] Fix shift-by-imm patterns used by asr, lsl & lsr intrinsics. Right shift patterns will no longer incorrectly accept a shift amount of zero. At the same time they will allow larger shift amounts that are now saturated to their upper bound. Patterns have been extended to enable immediate forms for shifts taking an arbitrary predicate. This patch also unifies the code path for immediate parsing so the i64 based shifts are no longer treated specially. Differential Revision: https://reviews.llvm.org/D86084	2020-08-18 11:41:26 +01:00
Paul Walker	b6c7b7fa31	[SVE] Add ISD nodes for predicated integer extend inreg operations. These are useful instructions when lowering fixed length vector extends, so I've broken this patch out as kind of NFC like work. Differential Revision: https://reviews.llvm.org/D85546	2020-08-11 11:39:26 +01:00
Paul Walker	0d33a8ef5b	[SVE] Lower scalable vector mul operations. This allows us to remove extra patterns from AArch64SVEInstrInfo.td because we can reuse those required for fixed length vectors. Differential Revision: https://reviews.llvm.org/D85328	2020-08-06 11:15:35 +01:00
Paul Walker	3ed59b775d	[SVE] Implement lowering for fixed length vector multiplication. NOTE: Also uses SVE code generation for NEON size vectors, instead of expanding i64 based vector multiplications. Differential Revision: https://reviews.llvm.org/D85327	2020-08-06 11:01:39 +01:00
Paul Walker	4be13b15d6	[SVE] Replace remaining _MERGE_OP1 nodes with _PRED variants. This is the final bit of work to relax the register allocation requirements when code generating normal LLVM IR, which rarely care about the result of inactive lanes. By using _PRED nodes we can make better use of SVE's reversed instructions. Also removes a redundant parameter from the min/max tests. Differential Revision: https://reviews.llvm.org/D85142	2020-08-04 11:19:17 +01:00

1 2 3 4 5 ...

302 Commits