llvm-project

Commit Graph

Author	SHA1	Message	Date
Sander de Smalen	faee91a52b	[AArch64][SVE] Asm: Support for insert element (INSR) instructions. Insert general purpose register into shifted vector, e.g. insr z0.s, w0 insr z0.d, x0 Insert SIMD&FP scalar register into shifted vector, e.g. insr z0.b, b0 insr z0.h, h0 insr z0.s, s0 insr z0.d, d0 llvm-svn: 336979	2018-07-13 08:51:57 +00:00
Sjoerd Meijer	83a2a62fb4	[AArch64] Armv8.4-A: LDAPR & STLR with immediate offset instructions These instructions are added to AArch64 only. llvm-svn: 336913	2018-07-12 14:57:59 +00:00
Sander de Smalen	ea45a89e5c	[AArch64][SVE] Asm: Support for COMPACT instruction. The compact instruction shuffles active elements of vector into lowest numbered elements and sets remaining elements to zero. e.g. compact z0.s, p0, z1.s llvm-svn: 336789	2018-07-11 11:22:26 +00:00
Sander de Smalen	a90530f7c1	[AArch64][SVE] Asm: Support for LAST(A\|B) and CLAST(A\|B) instructions. The LASTB and LASTA instructions extract the last active element, or element after the last active, from the source vector. The added variants are: Scalar: last(a\|b) w0, p0, z0.b last(a\|b) w0, p0, z0.h last(a\|b) w0, p0, z0.s last(a\|b) x0, p0, z0.d SIMD & FP Scalar: last(a\|b) b0, p0, z0.b last(a\|b) h0, p0, z0.h last(a\|b) s0, p0, z0.s last(a\|b) d0, p0, z0.d The CLASTB and CLASTA conditionally extract the last or element after the last active element from the source vector. The added variants are: Scalar: clast(a\|b) w0, p0, w0, z0.b clast(a\|b) w0, p0, w0, z0.h clast(a\|b) w0, p0, w0, z0.s clast(a\|b) x0, p0, x0, z0.d SIMD & FP Scalar: clast(a\|b) b0, p0, b0, z0.b clast(a\|b) h0, p0, h0, z0.h clast(a\|b) s0, p0, s0, z0.s clast(a\|b) d0, p0, d0, z0.d Vector: clast(a\|b) z0.b, p0, z0.b, z1.b clast(a\|b) z0.h, p0, z0.h, z1.h clast(a\|b) z0.s, p0, z0.s, z1.s clast(a\|b) z0.d, p0, z0.d, z1.d Please refer to the architecture specification for more details on the semantics of the added instructions. llvm-svn: 336783	2018-07-11 10:08:00 +00:00
Sander de Smalen	53108d48f7	[AArch64][SVE] Asm: Support for predicated unary operations. This patch adds support for the following instructions: CLS (Count Leading Sign bits) CLZ (Count Leading Zeros) CNT (Count non-zero bits) CNOT (Logically invert boolean condition in vector) NOT (Bitwise invert vector) FABS (Floating-point absolute value) FNEG (Floating-point negate) All operations are predicated and unary, e.g. clz z0.s, p0/m, z1.s - CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32 and 64 bit elements. - FABS and FNEG have variants for 16, 32 and 64 bit elements. llvm-svn: 336677	2018-07-10 14:05:55 +00:00
Sander de Smalen	d3efb59f29	[AArch64][SVE] Asm: Support for CNT(B\|H\|W\|D) and CNTP instructions. This patch adds support for the following instructions: CNTB, CNTH, CNTW, CNTD - Determine the number of active elements implied by the named predicate constant, multiplied by an immediate, e.g. cnth x0, vl8, #16 CNTP - Count active predicate elements, e.g. cntp x0, p0, p1.b counts the number of active elements in p1, predicated by p0, and stores the result in x0. llvm-svn: 336552	2018-07-09 15:22:08 +00:00
Sander de Smalen	813b21e33a	[AArch64][SVE] Asm: Support for remaining shift instructions. This patch completes support for shifts, which include: - LSL - Logical Shift Left - LSLR - Logical Shift Left, Reversed form - LSR - Logical Shift Right - LSRR - Logical Shift Right, Reversed form - ASR - Arithmetic Shift Right - ASRR - Arithmetic Shift Right, Reversed form - ASRD - Arithmetic Shift Right for Divide In the following variants: - Predicated shift by immediate - ASR, LSL, LSR, ASRD e.g. asr z0.h, p0/m, z0.h, #1 (active lanes of z0 shifted by #1) - Unpredicated shift by immediate - ASR, LSL, LSR e.g. asr z0.h, z1.h, #1 (all lanes of z1 shifted by #1, stored in z0) - Predicated shift by vector - ASR, LSL, LSR e.g. asr z0.h, p0/m, z0.h, z1.h (active lanes of z0 shifted by z1, stored in z0) - Predicated shift by vector, reversed form - ASRR, LSLR, LSRR e.g. lslr z0.h, p0/m, z0.h, z1.h (active lanes of z1 shifted by z0, stored in z0) - Predicated shift left/right by wide vector - ASR, LSL, LSR e.g. lsl z0.h, p0/m, z0.h, z1.d (active lanes of z0 shifted by wide elements of vector z1) - Unpredicated shift left/right by wide vector - ASR, LSL, LSR e.g. lsl z0.h, z1.h, z2.d (all lanes of z1 shifted by wide elements of z2, stored in z0) *Variants added in previous patches. llvm-svn: 336547	2018-07-09 13:23:41 +00:00
Sander de Smalen	54077dcfcb	[AArch64][SVE] Asm: Support for TBL instruction. Support for SVE's TBL instruction for programmable table lookup/permute using vector of element indices, e.g. tbl z0.d, { z1.d }, z2.d stores elements from z1, indexed by elements from z2, into z0. llvm-svn: 336544	2018-07-09 12:32:56 +00:00
Sander de Smalen	c69944c6b0	[AArch64][SVE] Asm: Support for ADR instruction. Supporting various addressing modes: - adr z0.s, [z0.s, z0.s] - adr z0.s, [z0.s, z0.s, lsl #<shift>] - adr z0.d, [z0.d, z0.d] - adr z0.d, [z0.d, z0.d, lsl #<shift>] - adr z0.d, [z0.d, z0.d, uxtw #<shift>] - adr z0.d, [z0.d, z0.d, sxtw #<shift>] Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D48870 llvm-svn: 336533	2018-07-09 09:58:24 +00:00
Sander de Smalen	bd513b42a1	[AArch64][SVE] Asm: Support for UZP and TRN instructions. This patch adds support for: UZP1 Concatenate even elements from two vectors UZP2 Concatenate odd elements from two vectors TRN1 Interleave even elements from two vectors TRN2 Interleave odd elements from two vectors With variants for both data and predicate vectors, e.g. uzp1 z0.b, z1.b, z2.b trn2 p0.s, p1.s, p2.s llvm-svn: 336531	2018-07-09 09:12:17 +00:00
Yvan Roux	a96a04558d	[MachineOutliner] Assert that Liveness tracking is accurate (NFC) The checking is done deeper inside MachineBasicBlock, but this will hopefully help to find issues when porting the machine outliner to a target where Liveness tracking is broken (like ARM). Differential Revision: https://reviews.llvm.org/D49023 llvm-svn: 336481	2018-07-07 08:02:19 +00:00
Sjoerd Meijer	35bd8f5d1e	[AArch64] Armv8.4-A: TLB support This adds: - outer shareable TLB Maintenance instructions, and - TLB range maintenance instructions. llvm-svn: 336434	2018-07-06 13:00:16 +00:00
Sjoerd Meijer	a3dad801b7	Recommit: [AArch64] Armv8.4-A: Flag manipulation instructions Now with the asm operand definition included. llvm-svn: 336432	2018-07-06 12:32:33 +00:00
Sjoerd Meijer	8203177e5e	Revert [AArch64] Armv8.4-A: Flag manipulation instructions It's causing build errors. llvm-svn: 336422	2018-07-06 08:39:43 +00:00
Sjoerd Meijer	6f5f6d5b2e	[AArch64] Armv8.4-A: Flag manipulation instructions These instructions are added to AArch64 only. Differential Revision: https://reviews.llvm.org/D48926 llvm-svn: 336421	2018-07-06 08:12:20 +00:00
Sjoerd Meijer	2a57b357a3	[AArch64][ARM] Armv8.4-A: Trace synchronization barrier instruction This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction. Differential Revision: https://reviews.llvm.org/D48918 llvm-svn: 336418	2018-07-06 08:03:12 +00:00
Sander de Smalen	e2c10f8f47	This is a recommit of r336322, previously reverted in r336324 due to a deficiency in TableGen that has been addressed in r336334. [AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336387	2018-07-05 20:21:21 +00:00
Simon Pilgrim	6dc45e6ca0	Try to fix -Wimplicit-fallthrough warning. NFCI. llvm-svn: 336331	2018-07-05 09:48:01 +00:00
Sander de Smalen	097ab704c9	Reverting r336322 for now, as it causes an assert failure in TableGen, for which there is already a patch in Phabricator (D48937) that needs to be committed first. llvm-svn: 336324	2018-07-05 08:52:03 +00:00
Sander de Smalen	ef44226c4f	[AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336322	2018-07-05 08:38:30 +00:00
Sander de Smalen	592718f906	[AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABD This patch implements the following varieties: - Unpredicated signed max, e.g. smax z0.h, z1.h, #-128 - Unpredicated signed min, e.g. smin z0.h, z1.h, #-128 - Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255 - Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255 - Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h - Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h - Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h - Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h - Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h - Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h llvm-svn: 336317	2018-07-05 07:54:10 +00:00
Yvan Roux	eaececf5e0	[MachineOutliner] Fix typo in getOutliningCandidateInfo function name getOutlininingCandidateInfo -> getOutliningCandidateInfo Differential Revision: https://reviews.llvm.org/D48867 llvm-svn: 336285	2018-07-04 15:37:08 +00:00
Sander de Smalen	1e4dc2e97d	[AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction. This patch adds both a vector and an immediate form, e.g. - Vector form: subr z0.h, p0/m, z0.h, z1.h subtract active elements of z0 from z1, and store the result in z0. - Immediate form: subr z0.h, z0.h, #255 subtract elements of z0 from #255, and store the result in z0. llvm-svn: 336274	2018-07-04 14:05:33 +00:00
Sander de Smalen	ab2b0530d9	[AArch64][SVE] Asm: Support for instructions to set/read FFR. Includes instructions to read the First-Faulting Register (FFR): - RDFFR (unpredicated) rdffr p0.b - RDFFR (predicated) rdffr p0.b, p0/z - RDFFRS (predicated, sets condition flags) rdffrs p0.b, p0/z Includes instructions to set/write the FFR: - SETFFR (no arguments, sets the FFR to all true) setffr - WRFFR (unpredicated) wrffr p0.b llvm-svn: 336267	2018-07-04 12:58:46 +00:00
Sander de Smalen	80283b2af4	[AArch64][SVE] Asm: Support for FP conversion instructions. The variants added are: - fcvt (FP convert precision) - scvtf (signed int -> FP) - ucvtf (unsigned int -> FP) - fcvtzs (FP -> signed int (round to zero)) - fcvtzu (FP -> unsigned int (round to zero)) For example: fcvt z0.h, p0/m, z0.s (single- to half-precision FP) scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP) ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP) fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int) fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int) llvm-svn: 336265	2018-07-04 12:13:17 +00:00
Sander de Smalen	e31e6d46dd	[AArch64][SVE] Asm: Support for SVE condition code aliases SVE overloads the AArch64 PSTATE condition flags and introduces a set of condition code aliases for the assembler. The details are described in section 2.2 of the architecture reference manual supplement for SVE. In short (SVE alias => AArch64 name): NONE => EQ, ANY => NE, NLAST => HS, LAST => LO, FIRST => MI, NFRST => PL, PMORE => HI, PLAST => LS, TCONT => GE, TSTOP => LT. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48869 llvm-svn: 336245	2018-07-04 08:50:49 +00:00
Fangrui Song	bc5c7f2ef0	[AArch64] Make function parameter names in declarations match those of definitions llvm-svn: 336222	2018-07-03 19:07:53 +00:00
Sander de Smalen	128fdfa23f	[AArch64][SVE] Asm: Support for FP Complex ADD/MLA. The variants added in this patch are: - Predicated Complex floating point ADD with rotate, e.g. fcadd z0.h, p0/m, z0.h, z1.h, #90 - Predicated Complex floating point MLA with rotate, e.g. fcmla z0.h, p0/m, z1.h, z2.h, #180 - Unpredicated Complex floating point MLA with rotate (indexed operand), e.g. fcmla z0.h, z1.h, z2.h[0], #180 Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48824 llvm-svn: 336210	2018-07-03 16:01:27 +00:00
Amara Emerson	d912ffaba5	[AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to unselectable stores. r336120 resulted in falling back to SelectionDAG more often due to the G_STORE MMOs not matching the vreg size. This fixes that by explicitly any-extending the value. llvm-svn: 336209	2018-07-03 15:59:26 +00:00
Sander de Smalen	8cd1f53334	[AArch64][SVE] Asm: Support for FMUL (indexed) Unpredicated FP-multiply of SVE vector with a vector-element given by vector[index], for example: fmul z0.s, z1.s, z2.s[0] which performs an unpredicated FP-multiply of all 32-bit elements in 'z1' with the first element from 'z2'. This patch adds restricted register classes for SVE vectors: ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements. ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48823 llvm-svn: 336205	2018-07-03 15:31:04 +00:00
Sander de Smalen	cbd224941f	[AArch64][SVE] Asm: Support for predicated unary operations. The patch includes support for the following instructions: ABS z0.h, p0/m, z0.h NEG z0.h, p0/m, z0.h (S\|U)XTB z0.h, p0/m, z0.h (S\|U)XTB z0.s, p0/m, z0.s (S\|U)XTB z0.d, p0/m, z0.d (S\|U)XTH z0.s, p0/m, z0.s (S\|U)XTH z0.d, p0/m, z0.d (S\|U)XTW z0.d, p0/m, z0.d llvm-svn: 336204	2018-07-03 14:57:48 +00:00
Sjoerd Meijer	173b7f0ec7	[AArch64] Armv8.4-A: system registers This adds the following system registers: - RAS registers, - MPAM registers, - Activity monitor registers, - Trace Extension registers, - Timing insensitivity of data processing instructions, - Enhanced Support for Nested Virtualization. Differential Revision: https://reviews.llvm.org/D48871 llvm-svn: 336193	2018-07-03 12:09:20 +00:00
Sander de Smalen	7fc8543208	[AArch64][SVE] Asm: Support for saturating ADD/SUB instructions. The variants added are: signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42 unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42 signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h llvm-svn: 336186	2018-07-03 09:48:22 +00:00
Sander de Smalen	8fcc3f5feb	[AArch64][SVE] Asm: Support for vector element FP compare. Contains the following variants: - Compare with (elements from) other vector instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo. aliases: fcmle, fcmlt. e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h - Compare absolute values with (absolute values from) other vector. instructions: facge, facgt. aliases: facle, faclt. e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h - Compare vector elements with #0.0 instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne. e.g. fcmle p0.h, p0/z, z0.h, #0.0 llvm-svn: 336182	2018-07-03 09:07:23 +00:00
Amara Emerson	846f2436e8	[AArch64][GlobalISel] Any-extend vararg parameters to stack slot size on Darwin. We currently don't any-extend vararg parameters before storing them to the stack locations on Darwin. However, SelectionDAG does this, and so user code is in the wild which inadvertently relies on this extension. This can manifest in cases where the value stored is (int)0, but the actual parameter is interpreted by va_arg as a pointer, and so not extending to 64 bits causes the callee to load additional undefined bits. llvm-svn: 336120	2018-07-02 16:39:09 +00:00
Sander de Smalen	8d4c01a702	[AArch64][SVE] Asm: Support for (SQ)INCP/DECP (scalar, vector) Increments/decrements the result with the number of active bits from the predicate. The inc/dec variants added are: - incp x0, p0.h (scalar) - incp z0.h, p0 (vector) The unsigned saturating inc/dec variants added are: - uqincp x0, p0.h (scalar) - uqincp w0, p0.h (scalar, 32bit) - uqincp z0.h, p0 (vector) The signed saturating inc/dec variants added are: - sqincp x0, p0.h (scalar) - sqincp x0, p0.h, w0 (scalar, 32bit) - sqincp z0.h, p0 (vector) llvm-svn: 336091	2018-07-02 10:08:36 +00:00
Sander de Smalen	c504101781	[AArch64][SVE] Asm: Support for (saturating) vector INC/DEC instructions. Increment/decrement vector by multiple of predicate constraint element count. The variants added by this patch are: - INCH, INCW, INCD and (saturating): - SQINCH, SQINCW, SQINCD - UQINCH, UQINCW, UQINCD - SQDECH, SQDECW, SQDECD - UQDECH, UQDECW, UQDECD For example: incw z0.s, all, mul #4 llvm-svn: 336090	2018-07-02 09:31:11 +00:00
Sander de Smalen	8eea4f1c7d	[AArch64][SVE] Asm: Support for vector element compares (immediate). Compare vector elements with a signed/unsigned immediate, e.g. cmpgt p0.s, p0/z, z0.s, #-16 cmphi p0.s, p0/z, z0.s, #127 llvm-svn: 336081	2018-07-02 08:20:59 +00:00
Sander de Smalen	0325e304b9	Reapply r334980 and r334983. These patches were previously reverted as they led to buildbot time-outs caused by large switch statement in printAliasInstr when using UBSan and O3. The issue has been addressed with a workaround (r335525). llvm-svn: 336079	2018-07-02 07:34:52 +00:00
Sjoerd Meijer	3b599d75d5	[AArch64] Armv8.4-A: Virtualization system registers This adds the Secure EL2 extension. Differential Revision: https://reviews.llvm.org/D48711 llvm-svn: 335962	2018-06-29 11:03:15 +00:00
Sjoerd Meijer	195e904002	[ARM][AArch64] Armv8.4-A Enablement Initial patch adding assembly support for Armv8.4-A. Besides adding v8.4 as a supported architecture to the usual places, this also adds target features for the different crypto algorithms. Armv8.4-A introduced new crypto algorithms, made them optional, and allows different combinations: - none of the v8.4 crypto functions are supported, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. - the v8.4 SM3 and SM4 support is implemented, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. The v8.4 crypto instructions are added to AArch64 only, and not AArch32, and are made optional extensions to Armv8.2-A. The user-facing Clang options will map on these new target features, their naming will be compatible with GCC and added in follow-up patches. The Armv8.4-A instruction sets can be downloaded here: https://developer.arm.com/products/architecture/a-profile/exploration-tools Differential Revision: https://reviews.llvm.org/D48625 llvm-svn: 335953	2018-06-29 08:43:19 +00:00
Jessica Paquette	dafa198c96	[MachineOutliner] Define MachineOutliner support in TargetOptions Targets should be able to define whether or not they support the outliner without the outliner being added to the pass pipeline. Before this, the outliner pass would be added, and ask the target whether or not it supports the outliner. After this, it's possible to query the target in TargetPassConfig, before the outliner pass is created. This ensures that passing -enable-machine-outliner will not modify the pass pipeline of any target that does not support it. https://reviews.llvm.org/D48683 llvm-svn: 335887	2018-06-28 17:45:43 +00:00
Matthias Braun	da5e7e11d1	SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O) Add NoTrapAfterNoreturn target option which skips emission of traps behind noreturn calls even if TrapUnreachable is enabled. Enable the feature on Mach-O to save code size; comments suggest it is not possible to enable it for the other users of TrapUnreachable. rdar://41530228 Differential Revision: https://reviews.llvm.org/D48674 llvm-svn: 335877	2018-06-28 17:00:45 +00:00
Daniel Sanders	bdeb880d14	[globalisel][legalizer] Add AtomicOrdering to LegalityQuery and use it in AArch64 Now that we have the ability to legalize based on MMO's. Add support for legalizing based on AtomicOrdering and use it to correct the legalization of the atomic instructions. Also extend all() to be a variadic template as this ruleset now requires 3 and 4 argument versions. llvm-svn: 335767	2018-06-27 19:03:21 +00:00
Jessica Paquette	f472f6159a	[MachineOutliner] Don't outline sequences where x16/x17/nzcv are live across It isn't safe to outline sequences of instructions where x16/x17/nzcv are live across the sequence. This teaches the outliner to check whether or not a specific candidate has x16/x17/nzcv live across it and discard the candidate in the case that that is true. https://bugs.llvm.org/show_bug.cgi?id=37573 https://reviews.llvm.org/D47655 llvm-svn: 335758	2018-06-27 17:43:27 +00:00
Luke Geeson	316327150b	[AArch64] Reverting FP16 vcvth_n_s64_f16 to fix llvm-svn: 335737	2018-06-27 14:34:40 +00:00
Adhemerval Zanella	cadcfed7aa	[AArch64] Add custom lowering for v4i8 trunc store This patch adds a custom trunc store lowering for v4i8 vector types. Since there is no v.4b register, the v4i8 is promoted to v4i16 (v.4h) and the default action for v4i8 is to extract each element and issue 4 byte stores. A better strategy would be to extend the promoted v4i16 to v8i16 (with undef elements) and extract and store the word lane which represents the v4i8 subvectors. The construction: define void @foo(<4 x i16> %x, i8* nocapture %p) { %0 = trunc <4 x i16> %x to <4 x i8> %1 = bitcast i8* %p to <4 x i8>* store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2 ret void } Can be optimized from: umov w8, v0.h[3] umov w9, v0.h[2] umov w10, v0.h[1] umov w11, v0.h[0] strb w8, [x0, #3] strb w9, [x0, #2] strb w10, [x0, #1] strb w11, [x0] ret To: xtn v0.8b, v0.8h str s0, [x0] ret The patch also adjusts the memory cost for autovectorization, so the C code: void foo (const int *src, int width, unsigned char *dst) { for (int i = 0; i < width; i++) *dst++ = *src++; } can be vectorized to: .LBB0_4: // %vector.body // =>This Inner Loop Header: Depth=1 ldr q0, [x0], #16 subs x12, x12, #4 // =4 xtn v0.4h, v0.4s xtn v0.8b, v0.8h st1 { v0.s }[0], [x2], #4 b.ne .LBB0_4 Instead of byte operations. llvm-svn: 335735	2018-06-27 13:58:46 +00:00
Luke Geeson	68cb233c0f	[AArch64] Remove Duplicate FP16 Patterns with same encoding, match on existing patterns llvm-svn: 335715	2018-06-27 09:20:13 +00:00
Simon Pilgrim	9c8f9374b5	[CostModel][AArch64] Add some initial costs for SK_Select and SK_PermuteSingleSrc AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 llvm-svn: 335329	2018-06-22 09:45:31 +00:00
Sirish Pande	b60acb9e48	Revert "[AArch64] Coalesce Copy Zero during instruction selection" This reverts commit d8f57105010cc7e78026e511d5def873fc91e0e7. Original Commit: Author: Haicheng Wu <haicheng@codeaurora.org> Date: Sun Feb 18 13:51:33 2018 +0000 [AArch64] Coalesce Copy Zero during instruction selection Add special case for copy of zero to avoid a double copy. Differential Revision: https://reviews.llvm.org/D36104 Author's intention is to remove a BB that has one mov instruction. In order to do that, d8f571050 pessimizes MachineSinking by introducing a copy, such that mov instruction is NOT moved to the BB. Optimization downstream gets rid of the BB with only mov instruction. This works well if we have only one fall through branch as there is only one "extra" mov instruction. If we have multiple fall throughs, we will have a lot of redundant movs. In such a case, it's better to have this BB which has one mov instruction. This is causing degradation in jpeg, fft and other codebases. I believe if we want to remove a BB with only one branch instruction, we should not pessimize Machine Sinking at all, and find some other solution. llvm-svn: 335251	2018-06-21 16:05:24 +00:00
Tim Northover	70666e7765	[AArch64] Implement FLT_ROUNDS macro. Very similar to ARM implementation, just maps to an MRS. Should fix PR25191. Patch by Michael Brase. llvm-svn: 335118	2018-06-20 12:09:01 +00:00
Vlad Tsyrklevich	98724e582e	Revert r334980 and 334983 This reverts commits r334980 and r334983 because they were causing build timeouts on the x86_64-linux-ubsan bot. llvm-svn: 335085	2018-06-20 00:02:32 +00:00
Jessica Paquette	32de26d432	[MachineOutliner] NFC: Remove insertOutlinerPrologue, rename insertOutlinerEpilogue insertOutlinerPrologue was not used by any target, and prologue-esque code was beginning to appear in insertOutlinerEpilogue. Refactor that into one function, buildOutlinedFrame. This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue. llvm-svn: 335076	2018-06-19 21:14:48 +00:00
Sander de Smalen	067eee1c13	[AArch64][SVE] Asm: Fix predicate pattern diagnostics. This patch uses the DiagnosticPredicate for SVE predicate patterns to improve their diagnostics, now giving an 'invalid operand' diagnostic if the type is not an immediate or one of the expected pattern labels. Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48220 llvm-svn: 334983	2018-06-18 21:03:02 +00:00
Sander de Smalen	7ac9e193ec	[AArch64][SVE] Asm: Support for saturating INC/DEC (32bit scalar) instructions. The variants added by this patch are: - SQINC signed increment, e.g. sqinc x0, w0, all, mul #4 - SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4 - UQINC unsigned increment, e.g. uqinc w0, all, mul #4 - UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4 This patch includes asmparser changes to parse a GPR64 as a GPR32 in order to satisfy the constraint check: x0 == GPR64(w0) in: sqinc x0, w0, all, mul #4 (the x0 and w0 operands must match) Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47716 llvm-svn: 334980	2018-06-18 20:50:33 +00:00
Sander de Smalen	13684d8400	[AArch64][SVE] Asm: Support for saturating INC/DEC (64bit scalar) instructions. Summary: The variants added by this patch are: - SQINC (signed increment) - UQINC (unsigned increment) - SQDEC (signed decrement) - UQDEC (unsigned decrement) For example: uqincw x0, all, mul #4 Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Differential Revision: https://reviews.llvm.org/D47715 llvm-svn: 334948	2018-06-18 14:47:52 +00:00
Sander de Smalen	d521c4353e	[AArch64][SVE] Asm: Support for vector element compares. This patch adds instructions for comparing elements from two vectors, e.g. cmpgt p0.s, p0/z, z0.s, z1.s and also adds support for comparing to a 64-bit wide element vector, e.g. cmpgt p0.s, p0/z, z0.s, z1.d The patch also contains aliases for certain comparisons, e.g.: cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s llvm-svn: 334931	2018-06-18 10:59:19 +00:00
Sander de Smalen	279b7e74e7	[AArch64][SVE] Asm: Support for bitwise operations on predicate vectors. This patch adds support for instructions performing bitwise operations on predicate vectors, including AND, BIC, EOR, NAND, NOR, ORN, ORR, and their status flag setting variants ANDS, BICS, EORS, NANDS, ORNS, ORRS. This patch also adds several aliases: orr p0.b, p1/z, p1.b, p1.b => mov p0.b, p1.b orrs p0.b, p1/z, p1.b, p1.b => movs p0.b, p1.b and p0.b, p1/z, p2.b, p2.b => mov p0.b, p1/z, p2.b ands p0.b, p1/z, p2.b, p2.b => movs p0.b, p1/z, p2.b eor p0.b, p1/z, p2.b, p1.b => not p0.b, p1/z, p2.b eors p0.b, p1/z, p2.b, p1.b => nots p0.b, p1/z, p2.b llvm-svn: 334906	2018-06-17 10:48:21 +00:00
Sander de Smalen	2c25b4cd36	[AArch64][SVE] Asm: Support for SEL (vector/predicate) instructions. Support for SVE's predicated select instructions to select elements from either vector, both in a data-vector and a predicate-vector variant. llvm-svn: 334905	2018-06-17 10:11:04 +00:00
Sander de Smalen	a6edca72ba	[AArch64][SVE] Asm: Support for CPY SIMD/FP and GPR instructions. Predicated splat/copy of SIMD/FP register or general purpose register to SVE vector, along with MOV-aliases. llvm-svn: 334842	2018-06-15 16:39:46 +00:00
Sander de Smalen	18ac8f9f25	[AArch64][SVE] Asm: Support for INC/DEC (scalar) instructions. Increment/decrement scalar register by (scaled) element count given by predicate pattern, e.g. 'incw x0, all, mul #4'. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47713 llvm-svn: 334838	2018-06-15 15:47:44 +00:00
Sander de Smalen	5eb51d7495	[AArch64][SVE] Asm: Support for FADD, FMUL and FMAX immediate instructions. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47712 llvm-svn: 334831	2018-06-15 13:57:51 +00:00
Sander de Smalen	3cbf171479	[AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates. Some instructions require one of a limited set of FP immediates as operands, for example '#0.5 or #1.0' for SVE's FADD instruction. This patch adds support for parsing and printing such FP immediates as exact values (e.g. #0.499999 is not accepted for #0.5). Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47711 llvm-svn: 334826	2018-06-15 13:11:49 +00:00
Clement Courbet	5eeed77f87	[TableGen] Emit a fatal error on inconsistencies in resource units vs cycles. Summary: For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files. For more obvious cases, I've ventured a fix. Some notes: - Exynos is especially fishy. - AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice. - The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit. Also see PR37310. Reviewers: RKSimon, craig.topper, javed.absar Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D46356 llvm-svn: 334586	2018-06-13 09:41:49 +00:00
Petr Hosek	7250908016	[AArch64] Support reserving x20 register Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register both to frontend and backend to make this register usable for these purposes. Differential Revision: https://reviews.llvm.org/D46552 llvm-svn: 334531	2018-06-12 20:00:50 +00:00
Luke Geeson	dc82aa44e6	[AArch64] Audit on rL333879 to fix FP16 64bit bitpatterns llvm-svn: 334488	2018-06-12 09:35:20 +00:00
Clement Courbet	f4f6899cdf	[ExynosM1][Sched] Fix resource usage in scheduling model. This is part of https://reviews.llvm.org/D46356. llvm-svn: 334391	2018-06-11 07:33:08 +00:00
Evandro Menezes	b2c8244715	[AArch64, ARM] Add support for Samsung Exynos M4 Create a separate feature set for Exynos M4 and add test cases. llvm-svn: 334115	2018-06-06 18:56:00 +00:00
Peter Smith	57f661bd7d	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred; to prevent code-generation errors, applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078	2018-06-06 09:40:06 +00:00
Jessica Paquette	aa087327ce	[MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.h This is setting up to fix bug 37573 cleanly. This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance. Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function. Also, as a side-effect, it makes the outliner a bit cleaner. https://bugs.llvm.org/show_bug.cgi?id=37573 llvm-svn: 333952	2018-06-04 21:14:16 +00:00
Nicolai Haehnle	01d261f18d	TableGen: Streamline the semantics of NAME Summary: The new rules are straightforward. The main rules to keep in mind are: 1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm. 2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended. And for some additional subtleties, consider these: 3. defm with no name generates a unique name but has no special behavior otherwise. 4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME. Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow. The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows. Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem. Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation. Change-Id: I694095231565b30f563e6fd0417b41ee01a12589 Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47430 llvm-svn: 333900	2018-06-04 14:26:05 +00:00
Luke Geeson	43e4367961	[AArch64] Audit on rL333634 to fix FP16 Disasm BitPatterns llvm-svn: 333879	2018-06-04 09:41:32 +00:00
Sander de Smalen	d0a6f6a502	[AArch64][SVE] Fix range for DUP immediates (16bit elts) For immediates used in DUP instructions that have the range -128 to 127, or a multiple of 256 in the range -32768 to 32512, one could argue that when the result element size is 16bits (.h), the value can be considered both signed and unsigned. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47619 llvm-svn: 333873	2018-06-04 07:24:23 +00:00
Sander de Smalen	fd54a781f6	[AArch64][SVE] Asm: Print indexed element 0 as FPR. Print the first indexed element as a FP register, for example: mov z0.d, z1.d[0] Is now printed as: mov z0.d, d1 Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47571 llvm-svn: 333872	2018-06-04 07:07:35 +00:00
Sander de Smalen	c33d668ab7	[AArch64][SVE] Asm: Support for indexed DUP instructions. Unpredicated copy of indexed SVE element to SVE vector, along with MOV-aliases. For example: dup z0.h, z1.h[0] duplicates the first 16-bit element from z1 to all elements in the result vector z0. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47570 llvm-svn: 333871	2018-06-04 06:40:55 +00:00
Sander de Smalen	367a53b059	[AArch64][SVE] Asm: Support for FCPY immediate instructions. Predicated copy of floating-point immediate value to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47518 llvm-svn: 333869	2018-06-04 05:58:06 +00:00
Sander de Smalen	512d57f1a5	[AArch64][SVE] Asm: Support for CPY immediate instructions Predicated copy of possibly shifted immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47517 llvm-svn: 333868	2018-06-04 05:40:46 +00:00
Amara Emerson	5a3bb68e12	[AArch64][GlobalISel] Zero-extend s1 values when returning. Before we were relying on the any extend of the s1 to s32, but for AAPCS we need to zero-extend it to at least s8. Fixes PR36719 Differential Revision: https://reviews.llvm.org/D47425 llvm-svn: 333747	2018-06-01 13:20:32 +00:00
Sander de Smalen	f95ea047e5	[AArch64][SVE] Asm: Support for FDUP_ZI (copy fp immediate) instruction. Unpredicated copy of floating-point immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47482 llvm-svn: 333744	2018-06-01 12:54:46 +00:00
Sander de Smalen	97ca6b9e09	[AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction. Unpredicated copy of repeating immediate pattern to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47328 llvm-svn: 333731	2018-06-01 07:25:46 +00:00
Francis Visoiu Mistrih	90aba024c5	[MC] Fallback on DWARF when generating compact unwind on AArch64 Instead of asserting when using the def_cfa directive with a register different from fp, fall back on DWARF. Easily triggered with: .cfi_def_cfa x1, 32; rdar://40249694 Differential Revision: https://reviews.llvm.org/D47593 llvm-svn: 333667	2018-05-31 16:33:26 +00:00
Luke Geeson	2e09995d42	[AArch64] Reverted rL333427 fixing Clang UnitTest Failure llvm-svn: 333634	2018-05-31 08:27:53 +00:00
Roman Tereshin	5a65eb75c7	[GlobalISel][AArch64] LegalizerInfo verifier: Fixing bugs exposed by LegalizerInfo::verify(...) Reviewers: aemerson, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D46339 llvm-svn: 333618	2018-05-31 01:56:05 +00:00
Roman Tereshin	8f1753e994	[GlobalISel][AArch64] LegalizerInfo verifier: Adding LegalizerInfo::verify(...) call w/o fixing bugs This is to make it clear what kind of bugs the LegalizerInfo::verifier is able to catch and test its output Reviewers: aemerson, qcolombet Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D46338 llvm-svn: 333597	2018-05-30 22:10:04 +00:00
Tim Northover	d8949f5002	AArch64: print correct annotation for ADRP addresses. The immediate on an ADRP MCInst needs to be multiplied by 0x1000 to obtain the actual PC-offset that will be calculated. llvm-svn: 333525	2018-05-30 09:54:59 +00:00
Sander de Smalen	bdf09fe7a2	[AArch64][AsmParser] Fix segfault on illegal fpimm. Floating point immediate combining a negative sign and a hexadecimal number, e.g. #-0x0 caused the compiler to crash. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47483 llvm-svn: 333524	2018-05-30 09:54:19 +00:00
Evandro Menezes	f8425340e4	[AArch64] Fix PR32384: bump up the number of stores per memset and memcpy As suggested in https://bugs.llvm.org/show_bug.cgi?id=32384#c1, this change makes the inlining of `memset()` and `memcpy()` more aggressive when compiling for speed. The tuning remains the same when optimizing for size. Patch by: Sebastian Pop <s.pop@samsung.com> Evandro Menezes <e.menezes@samsung.com> Differential revision: https://reviews.llvm.org/D45098 llvm-svn: 333429	2018-05-29 15:58:50 +00:00
Amara Emerson	d5a9e7bbc9	Revert "[AArch64] added FP16 vcvth intrinsic support" This reverts commit r333410 due to bot failures. llvm-svn: 333427	2018-05-29 15:34:22 +00:00
Sander de Smalen	8704b03c4d	[AArch64][SVE] Asm: Support for predicated LSL/LSR (vectors) Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47365 llvm-svn: 333422	2018-05-29 14:40:24 +00:00
Sander de Smalen	26b9b2a8c3	[AArch64][SVE] Asm: Support for AND, ORR, EOR and BIC instructions. This patch addresses the following variants: - bitmask immediate, e.g. 'and z0.d, z0.d, #0x6'. - unpredicated data vectors, e.g. 'and z0.d, z1.d, z2.d'. - predicated data vectors, e.g. 'and z0.d, p0/m, z0.d, z1.d'. And also several aliases, such as: - ORN, alias of ORR. - EON, alias of EOR. - BIC, alias of AND (immediate variant) - MOV, alias of ORR (if unpredicated and source register operands are the same) Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47363 llvm-svn: 333414	2018-05-29 13:08:43 +00:00
Luke Geeson	16092ab3c5	[AArch64] added FP16 vcvth intrinsic support Summary: Change-Id: I0df845749c7689dfc99150ba7c19c7d0dadbd705 Reviewers: javed.absar, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: llvm-commits, SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46311 llvm-svn: 333410	2018-05-29 11:40:33 +00:00
Sander de Smalen	98686c6b15	[AArch64][SVE] Asm: Support for ADD (immediate) instructions. This patch adds addsub_imm8_opt_lsl_(i8\|i16\|i32\|i64) operands that are unsigned values in the range 0 to 255. For element widths of 16 bits or higher it may also be an unsigned multiple of 256 in the range 0 to 65280. Note: This also does some refactoring to reuse convenience function getShiftedVal<shift>(), and now allows AArch64 scalar 'ADD #-4096' to be accepted and mapped to SUB #4096. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47310 llvm-svn: 333408	2018-05-29 10:39:49 +00:00
Sander de Smalen	6e2a5b4cf0	Fix ubsan errors introduced by r333263 re. left-shifting negative values. llvm-svn: 333270	2018-05-25 11:41:04 +00:00
Sander de Smalen	62770795a5	[AArch64][SVE] Asm: Support for DUP (immediate) instructions. Unpredicated copy of optionally-shifted immediate to SVE vector, along with MOV-aliases. This patch contains parsing and printing support for cpy_imm8_opt_lsl_(i8\|i16\|i32\|i64). This operand allows a signed value in the range -128 to +127. For element widths of 16 bits or higher it may also be a signed multiple of 256 in the range -32768 to +32512. For element-width of 8 bits a range of -128 to 255 is accepted, since a copy of a byte can be considered either signed/unsigned. Note: This patch renames tryParseAddSubImm() -> tryParseImmWithOptionalShift() and moves the behaviour of trying to shift a plain immediate by an allowed shift-value to its addImmWithOptionalShiftOperands() method, so that the parsing itself is generic and allows immediates from multiple shifted operands. This is done because an immediate can be divisible by both shifted operands. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47309 llvm-svn: 333263	2018-05-25 09:47:52 +00:00
Eli Friedman	9e177882aa	[AArch64] Improve orr+movk sequences for MOVi64imm. The existing code has three different ways to try to lower a 64-bit immediate to the sequence ORR+MOVK. The result is messy: it misses some possible sequences, and the order of the checks means we sometimes emit two MOVKs when we only need one. Instead, just use a simple loop to try all possible two-instruction ORR+MOVK sequences. Differential Revision: https://reviews.llvm.org/D47176 llvm-svn: 333218	2018-05-24 19:38:23 +00:00
Geoff Berry	98150e3a62	[AArch64] Take advantage of variable shift/rotate amount implicit mod operation. Summary: Optimize code generated for variable shifts/rotates by taking advantage of the implicit and/mod done on the variable shift amount register. Resolves bug 27582 and bug 37421. Reviewers: t.p.northover, qcolombet, MatzeB, javed.absar Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D46844 llvm-svn: 333214	2018-05-24 18:29:42 +00:00
Chad Rosier	3f66363139	[CodeGen][AArch64] Use RegUnits to track register aliases. (NFC) Use RegUnits to track register aliases in AArch64RedundantCopyElimination. Differential Revision: https://reviews.llvm.org/D47269 llvm-svn: 333107	2018-05-23 17:49:38 +00:00
Alex Bradbury	0a59f18951	[AArch64] Use addAliasForDirective to support data directives The AArch64 asm parser currently has custom parsing logic for .hword, .word, and .xword. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue. Differential Revision: https://reviews.llvm.org/D47000 llvm-svn: 333077	2018-05-23 11:17:20 +00:00
Eli Friedman	785acce51d	Delete unused variable from r333015. (The assertion suppressed the unused variable warning on Release+Asserts builds, so I didn't notice.) llvm-svn: 333018	2018-05-22 19:38:07 +00:00
Eli Friedman	042dc9e092	[MachineOutliner] Add "thunk" outlining for AArch64. When we're outlining a sequence that ends in a call, we can save up to three instructions in the outlined function by turning the call into a tail-call. I refer to this as thunk outlining because the resulting outlined function looks like a thunk; suggestions welcome for a better name. In addition to making the outlined function shorter, thunk outlining allows outlining calls which would otherwise be illegal to outline: we don't need to save/restore LR, so we don't need to prove anything about the stack access patterns of the callee. To make this work effectively, I also added MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner code; this allows treating an arbitrary instruction as a terminator in the suffix tree. Differential Revision: https://reviews.llvm.org/D47173 llvm-svn: 333015	2018-05-22 19:11:06 +00:00
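
Note on the SVE immediate ranges quoted in the DUP (immediate) and ADD (immediate) entries above: the rules can be sketched as two small predicates. This is only an illustration of the ranges as those commit messages word them, not the parser code from the patches. The function names are hypothetical, and the sketch does not model the later adjustment for 16-bit elements described in the "Fix range for DUP immediates" entry.

```python
def is_sve_cpy_imm(value: int, elt_bits: int) -> bool:
    """Hypothetical check for the DUP/CPY immediate ranges quoted in the log."""
    if elt_bits == 8:
        # A byte copy can be read as signed or unsigned, so -128..255 is accepted.
        return -128 <= value <= 255
    if -128 <= value <= 127:
        # Plain signed 8-bit immediate.
        return True
    # Wider elements also take a signed multiple of 256 (immediate shifted by 8).
    return value % 256 == 0 and -32768 <= value <= 32512


def is_sve_addsub_imm(value: int, elt_bits: int) -> bool:
    """Hypothetical check for the ADD/SUB immediate ranges quoted in the log."""
    if 0 <= value <= 255:
        # Plain unsigned 8-bit immediate.
        return True
    # Elements of 16 bits or more also take a multiple of 256 up to 65280.
    return elt_bits >= 16 and value % 256 == 0 and 0 <= value <= 65280
```

For example, `is_sve_cpy_imm(-32768, 32)` holds because -32768 is a multiple of 256 inside the shifted range, while `is_sve_addsub_imm(256, 8)` does not, since byte elements have no shifted form.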

2976 Commits
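
Note on the "print correct annotation for ADRP addresses" entry: the scaling it describes can be illustrated with a short calculation. The sketch assumes the architectural definition of ADRP (the page offset is added to the PC with its low 12 bits cleared). The function name and the 64-bit wrap-around mask are illustrative choices, not code from the commit.

```python
PAGE_SIZE = 0x1000  # ADRP works in 4 KiB pages


def adrp_target(pc: int, imm: int) -> int:
    """Address formed by ADRP at `pc` with a signed page-count immediate `imm`."""
    page_base = pc & ~(PAGE_SIZE - 1)   # clear the low 12 bits of the PC
    offset = imm * PAGE_SIZE            # the immediate counts pages, not bytes
    return (page_base + offset) & 0xFFFF_FFFF_FFFF_FFFF  # wrap to 64 bits
```

With these assumptions, `hex(adrp_target(0x400123, 1))` gives `0x401000`: the annotation has to show the immediate scaled by 0x1000 rather than the raw value 1.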