llvm-project

Commit Graph

Author	SHA1	Message	Date
Francis Visoiu Mistrih	e85b06d65f	[CodeGen] Use MIR syntax for MachineMemOperand printing Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)". rdar://38163529 Differential Revision: https://reviews.llvm.org/D42377 llvm-svn: 327580	2018-03-14 21:52:13 +00:00
Francis Visoiu Mistrih	164560bd74	[AArch64] Emit CSR loads in the same order as stores Optionally allow the order of restoring the callee-saved registers in the epilogue to be reversed. The flag -reverse-csr-restore-seq generates the following code: ``` stp x26, x25, [sp, #-64]! stp x24, x23, [sp, #16] stp x22, x21, [sp, #32] stp x20, x19, [sp, #48] ; [..] ldp x24, x23, [sp, #16] ldp x22, x21, [sp, #32] ldp x20, x19, [sp, #48] ldp x26, x25, [sp], #64 ret ``` Note how the CSRs are restored in the same order as they are saved. One exception to this rule is the last `ldp`, which allows us to merge the stack adjustment and the ldp into a post-index ldp. This is done by first generating: ldp x26, x27, [sp] add sp, sp, #64 which gets merged by the arm64 load store optimizer into ldp x26, x25, [sp], #64 The flag is disabled by default. llvm-svn: 327569	2018-03-14 20:34:03 +00:00
Francis Visoiu Mistrih	084e7d8770	[AArch64] Keep track of MIFlags in the LoadStoreOptimizer Merging: * $x26, $x25 = frame-setup LDPXi $sp, 0 * $sp = frame-destroy ADDXri $sp, 64, 0 into an LDPXpost should preserve the flags from both instructions as following: * frame-setup frame-destroy LDPXpost Differential Revision: https://reviews.llvm.org/D44446 llvm-svn: 327533	2018-03-14 17:10:58 +00:00
Martin Storsjo	bde677289a	[AArch64] Don't produce R_AARCH64_TLSLE_LDST32_TPREL_LO12_NC Support for this relocation is missing in both LLD and GNU binutils at the moment. This reverts the ELF parts of SVN r327316. llvm-svn: 327503	2018-03-14 13:09:10 +00:00
Martin Storsjo	7bc64bd889	[AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/str Differential Revision: https://reviews.llvm.org/D44355 llvm-svn: 327316	2018-03-12 18:47:43 +00:00
George Burgess IV	2282cc2a09	Revert r327199: "Clean up a temp file on the buildbots" "I'll revert this tomorrow," I said yesterday. This should've reached all the bots it can by now. llvm-svn: 327230	2018-03-10 23:22:46 +00:00
Martin Storsjo	cc24096d4d	[AArch64] Implement native TLS for Windows Differential Revision: https://reviews.llvm.org/D43971 llvm-svn: 327220	2018-03-10 19:05:21 +00:00
George Burgess IV	088809def1	Clean up a temp file on the buildbots. r327100 made us stop producing vecreduce-propagate-sd-flags.s, but it's still sticking around on some bots. This makes the bots unhappy. I'll revert this tomorrow. llvm-svn: 327199	2018-03-10 02:51:10 +00:00
Sebastian Pop	b4bd0a404f	[x86][aarch64] ask the backend whether it has a vector blend instruction The code to match and produce more x86 vector blends was enabled for all architectures even though the transform may pessimize the code for other architectures that do not provide a vector blend instruction. Added an aarch64 testcase to check that a VZIP instruction is generated instead of byte movs. Differential Revision: https://reviews.llvm.org/D44118 llvm-svn: 327132	2018-03-09 14:29:21 +00:00
Martin Storsjo	ccac66da83	[AArch64] Fix use of a regex in the win-alloca.ll test. NFC. Check that the variable actually is the same as the one previously matched. llvm-svn: 327107	2018-03-09 09:45:37 +00:00
Matt Morehouse	5a91fc6f74	Attempt to fix vecreduce-propagate-sd-flags.ll test. Buildbots have been failing since r327079. llvm-svn: 327100	2018-03-09 02:04:30 +00:00
Sameer AbuAsal	986865c090	Propagate flags to SDValue in SplitVecOp_VECREDUCE This patch is a fix for PR36642. While legalizing long vector types, make sure the smaller types get the flags of the wider type. bugzilla link: https://bugs.llvm.org/show_bug.cgi?id=36642 Change-Id: I0c2829639f094c862c10a6b51b342d4c2563e1fa llvm-svn: 327079	2018-03-08 23:41:40 +00:00
Sebastian Pop	33bdb3e0e6	[AArch64] add missing pattern for insert_subvector undef The attached testcase started failing after the patch to define isExtractSubvectorCheap with the following pattern mismatch: ISEL: Starting pattern match Initial Opcode index to 85068 Match failed at index 85076 LLVM ERROR: Cannot select: t47: v8i16 = insert_subvector undef:v8i16, t43, Constant:i64<0> The code generated from llvm/lib/Target/AArch64/AArch64InstrInfo.td def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i32 0)), (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>; is in ninja/lib/Target/AArch64/AArch64GenDAGISel.inc At the location of the error it is: /* 85076*/ OPC_CheckChild2Type, MVT::i32, And it failed to match the type of operand 2. Adding another def-pat for i64 fixes the failed def-pat error: def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i64 0)), (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>; llvm-svn: 326949	2018-03-07 22:07:13 +00:00
Sebastian Pop	41073e8046	[AArch64] define isExtractSubvectorCheap Following the ARM-neon backend, define isExtractSubvectorCheap to return true when extracting low and high part of a neon register. The patch disables a test in llvm/test/CodeGen/AArch64/arm64-ext.ll This testcase is fragile in the sense that it requires a BUILD_VECTOR to "survive" all DAG transforms until ISelLowering. The testcase is supposed to check that AArch64TargetLowering::ReconstructShuffle() works, and for that we need a BUILD_VECTOR in ISelLowering. As we now transform the BUILD_VECTOR earlier into an VEXT + vector_shuffle, we don't have the BUILD_VECTOR pattern when we get to ISelLowering. As there is no way to disable the combiner to only exercise the code in ISelLowering, the patch disables the testcase. Differential revision: https://reviews.llvm.org/D43973 llvm-svn: 326811	2018-03-06 16:54:55 +00:00
Volkan Keles	2bc42e90ed	GlobalISel: IRTranslate llvm.fabs.* intrinsic Summary: Fabs is a common floating-point operation, especially for some expansions. This patch adds a new generic opcode for llvm.fabs.* intrinsic in order to avoid building/matching this intrinsic. Reviewers: qcolombet, aditya_nandakumar, dsanders, rovka Reviewed By: aditya_nandakumar Subscribers: kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D43864 llvm-svn: 326749	2018-03-05 22:31:55 +00:00
Evandro Menezes	f29e35afff	[AArch64] Harden test case NFC llvm-svn: 326724	2018-03-05 17:42:18 +00:00
Sebastian Pop	ac0bfb5938	fix PR36582 The error occurs when reading i16 elements (as in the testcase) from a v8i8 with a pattern of <0,2,4,6>. As all the data in the vector is accessed, the operation is not a VUZP. The patch stops the pattern recognition of VUZP when EXTRACT_VECTOR_ELT has a different element type than BUILD_VECTOR. llvm-svn: 326722	2018-03-05 17:35:49 +00:00
Evandro Menezes	cd855f70c5	[AArch64] Improve code generation of constant vectors Use the whole gammut of constant immediates available to set up a vector. Instead of using, for example, `mov w0, #0xffff; dup v0.4s, w0`, which transfers between register files, use the more efficient `movi v0.4s, #-1` instead. Not limited to just a few values, but any immediate value that can be encoded by all the variants of `FMOV`, `MOVI`, `MVNI`, thus eliminating the need to there be patterns to optimize special cases. Differential revision: https://reviews.llvm.org/D42133 llvm-svn: 326718	2018-03-05 17:02:47 +00:00
Craig Topper	e7ca6f5456	[DAGCombiner] When combining zero_extend of a truncate, only mask before extending for vectors. Masking first, prevents the extend from being combine with loads. Its also interfering with some vXi1 extraction code. Differential Revision: https://reviews.llvm.org/D42679 llvm-svn: 326500	2018-03-01 22:32:25 +00:00
Sebastian Pop	c33af715d7	[AArch64] generate vuzp instead of mov when a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the BUILD_VECTOR with either vuzp1 or vuzp2. With this patch LLVM generates the following code for the first function fun1 in the testcase: adrp x8, .LCPI0_0 ldr q0, [x8, :lo12:.LCPI0_0] tbl v0.16b, { v0.16b }, v0.16b ext v1.16b, v0.16b, v0.16b, #8 uzp1 v0.8b, v0.8b, v1.8b str d0, [x8] ret Without this patch LLVM currently generates this code: adrp x8, .LCPI0_0 ldr q0, [x8, :lo12:.LCPI0_0] tbl v0.16b, { v0.16b }, v0.16b mov v1.16b, v0.16b mov v1.b[1], v0.b[2] mov v1.b[2], v0.b[4] mov v1.b[3], v0.b[6] mov v1.b[4], v0.b[8] mov v1.b[5], v0.b[10] mov v1.b[6], v0.b[12] mov v1.b[7], v0.b[14] str d1, [x8] ret llvm-svn: 326443	2018-03-01 15:47:39 +00:00
Roman Tereshin	2b94972eb9	[GlobalISel][AArch64] Adding -disable-gisel-legality-check CL option Currently it's impossible to test InstructionSelect pass with MIR which is considered illegal by the Legalizer in Assert builds. In early stages of porting an existing backend from SelectionDAG ISel to GlobalISel, however, we would have very basic CallLowering, Legalizer, and RegBankSelect implementations, but rather functional Instruction Select with quite a few patterns selectable due to the semi-automatic porting process borrowing them from SelectionDAG ISel. As we are trying to define legality as a property of being selectable by the instruction selector, it would be nice to be able to easily check what the selector can do in its current state w/o the legality check provided by the Legalizer getting in the way. It also seems beneficial to have a regression testing set up that would not allow the selector to silently regress in its support of the MIR not supported yet by the previous passes in the GlobalISel pipeline. This commit adds -disable-gisel-legality-check command line option to llc that disables those legality checks in RegBankSelect and InstructionSelect passes. It also adds quite a few MIR test cases for AArch64's Instruction Selector. Every one of them would fail on the legality check at the moment, but will select just fine if the check is disabled. Every test MachineFunction is intended to exercise a specific selection rule and that rule only, encoded in the MachineFunction's name by the rule's number, ID, and index of its GIM_Try opcode in TableGen'erated MatchTable (-optimize-match-table=false). Reviewers: ab, dsanders, qcolombet, rovka Reviewed By: bogner Subscribers: kristof.beyls, volkan, aditya_nandakumar, aemerson, rengolin, t.p.northover, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42886 llvm-svn: 326396	2018-03-01 00:27:48 +00:00
Chih-Hung Hsieh	9f9e4681ac	[TLS] use emulated TLS if the target supports only this mode Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341	2018-02-28 17:48:55 +00:00
Geoff Berry	a2b9011290	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Re-enable commit r323991 now that r325931 has been committed to make MachineOperand::isRenamable() check more conservative w.r.t. code changes and opt-in on a per-target basis. llvm-svn: 326208	2018-02-27 16:59:10 +00:00
Evandro Menezes	717941e132	[AArch64] Harden test cases NFC llvm-svn: 326147	2018-02-26 23:19:25 +00:00
Francis Visoiu Mistrih	e4fae4d5b6	[CodeGen] Don't omit any redundant information in -debug output In r322867, we introduced IsStandalone when printing MIR in -debug output. The default behaviour for that was: 1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any redundant information. 2) When -debug-printing a MF entirely, don't print any redundant information. 3) When printing MIR, don't print any redundant information. I'd like to change 2) to: 2) When -debug-printing a MF entirely, don't omit any redundant information. Differential Revision: https://reviews.llvm.org/D43337 llvm-svn: 326094	2018-02-26 15:23:42 +00:00
Evandro Menezes	1afffac05b	[PATCH] [AArch64] Add new target feature to fuse conditional select This feature enables the fusion of the comparison and the conditional select instructions together. Differential revision: https://reviews.llvm.org/D42392 llvm-svn: 325939	2018-02-23 19:27:43 +00:00
Evandro Menezes	c0571bd065	[AArch64] Improve macro fusion test case Improve a vector in the test case for the fusion of address generation and loads or stores. Otherwise, NFC. llvm-svn: 325839	2018-02-22 23:32:06 +00:00
Sander de Smalen	a86f3cfb49	Revert "[DebugInfo][FastISel] Fix dropping dbg.value()" This patch reverts r325440 and r325438 because it triggers an assertion in SelectionDAGBuilder.cpp. Also having debug enabled may unintentionally affect code-gen. The patch is reverted until we find a better solution. llvm-svn: 325825	2018-02-22 19:53:59 +00:00
Evandro Menezes	72f3983633	[AArch64] Refactor instructions using SIMD immediates Get rid of icky goto loops and make the code easier to maintain. Otherwise, NFC. Restore r324903 and fix PR36369. Differentail revision: https://reviews.llvm.org/D43364 llvm-svn: 325621	2018-02-20 20:31:45 +00:00
Amara Emerson	db211892ed	[AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first. This is a follow on commit to r[x] where we fix the other direction of copy. For this case, after converting the source from gpr32 -> fpr32, we use a subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land. https://reviews.llvm.org/D43444 llvm-svn: 325550	2018-02-20 05:11:57 +00:00
Amara Emerson	7e9f348b2d	[AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied to gpr register banks. PR36345. rdar://36478867 Differential Revision: https://reviews.llvm.org/D43310 llvm-svn: 325463	2018-02-18 17:10:49 +00:00
Amara Emerson	bc03baef77	[AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits. These are needed for operations on fp16 types in a later patch. llvm-svn: 325462	2018-02-18 17:03:02 +00:00
Haicheng Wu	aed6e52b3c	[AArch64] Coalesce Copy Zero during instruction selection Add special case for copy of zero to avoid a double copy. Differential Revision: https://reviews.llvm.org/D36104 llvm-svn: 325459	2018-02-18 13:51:33 +00:00
Sander de Smalen	d01cb72f7e	Made test dbg_value_fastisel.ll specific to AArch64 fast-isel. Some buildbots failed on this test (rL325438) because they don't build all targets. I set the triple to aarch64 and moved the test to test/CodeGen/AArch64/fast-isel-dbg-value.ll. llvm-svn: 325440	2018-02-17 17:43:24 +00:00
Martin Storsjo	a63a5b993e	[AArch64] Implement dynamic stack probing for windows This makes sure that alloca() function calls properly probe the stack as needed. Differential Revision: https://reviews.llvm.org/D42356 llvm-svn: 325433	2018-02-17 14:26:32 +00:00
Quentin Colombet	48abac82b8	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" This reverts commit r323991. This commit breaks target that don't model all the register constraints in TableGen. So far the workaround was to set the hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the cases. For instance, when mutating an instruction (like in the lowering of COPYs) the isRenamable flag is not properly updated. The same problem will happen when attaching machine operand from one instruction to another. Geoff Berry is working on a fix in https://reviews.llvm.org/D43042. llvm-svn: 325421	2018-02-17 03:05:33 +00:00
Evandro Menezes	10ae20d80c	[AArch64] Fix BITCAST lowering crash The data type is assumed to be a vector, but sometimes it is not, leading to an assertion. Add simple test-case to verify this. Differential revision: https://reviews.llvm.org/D42599 llvm-svn: 325378	2018-02-16 20:00:57 +00:00
Volkan Keles	9283763865	GlobalISel: IRTranslate llvm.fmuladd.* intrinsic Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, bogner Reviewed By: qcolombet Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D43090 llvm-svn: 324971	2018-02-13 00:47:46 +00:00
Abderrazek Zaafrani	e72d99261f	[AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic - llvm portion https://reviews.llvm.org/D42993 llvm-svn: 324912	2018-02-12 17:35:42 +00:00
Oliver Stannard	02f08c9d1f	[AArch64] Improve v8.1-A code-gen for atomic load-and Armv8.1-A added an atomic load-clear instruction (which performs bitwise and with the complement of it's operand), but not a load-and instruction. Our current code-generation for atomic load-and always inserts an MVN instruction to invert its argument, even if it could be folded into a constant or another instruction. This adds lowering early in selection DAG to convert a load-and operation into an xor with -1 and a load-clear, allowing the normal DAG optimisations to work on it. To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't see any easy way to do this with an AArch64-specific ISD node, because the code-generation for atomic operations assumes the SDNodes are of type AtomicSDNode. I've left the old tablegen patterns in because they are still needed for global isel. Differential revision: https://reviews.llvm.org/D42478 llvm-svn: 324908	2018-02-12 17:03:11 +00:00
Oliver Stannard	4269917304	[AArch64] Improve v8.1-A code-gen for atomic load-subtract Armv8.1-A added an atomic load-add instruction, but not a load-subtract instruction. Our current code-generation for atomic load-subtract always inserts a NEG instruction to negate it's argument, even if it could be folded into a constant or another instruction. This adds lowering early in selection DAG to convert a load-subtract operation into a subtract and a load-add, allowing the normal DAG optimisations to work on it. I've left the old tablegen patterns in because they are still needed for global isel. Some of the tests in this patch are copied from D35375 by Chad Rosier (which was abandoned). Differential revision: https://reviews.llvm.org/D42477 llvm-svn: 324892	2018-02-12 14:22:03 +00:00
Sanjay Patel	eb8c408e50	[TargetLowering] try to create -1 constant operand for math ops via demanded bits This reverses instcombine's demanded bits' transform which always tries to clear bits in constants. As noted in PR35792 and shown in the test diffs: https://bugs.llvm.org/show_bug.cgi?id=35792 ...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity. I did investigate changing instcombine's behavior, but it would be more work to change canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, so we'd have to figure out how to not regress those cases. Differential Revision: https://reviews.llvm.org/D42986 llvm-svn: 324839	2018-02-11 14:38:23 +00:00
Jonas Paulsson	7850601fa3	[AArch64] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Martin Storsjö llvm-svn: 324720	2018-02-09 09:22:20 +00:00
Aditya Nandakumar	b14fd2608c	[GISel]: Verify COPIES involving generic registers. Add verification for copies involving generic registers if they are compatible - ie if it is a generic copy, then the types are the same, and if a COPY b/w generic and target virtual register, then the sizes should be the same. Only checks if there are no sub registers involved for now. https://reviews.llvm.org/D37775 llvm-svn: 324696	2018-02-09 01:27:23 +00:00
Sjoerd Meijer	5ea465ded7	[AArch64] Don't materialize 0 with "fmov h0, .." when FullFP16 is not supported We were generating "fmov h0, wzr" instructions when FullFP16 is not enabled. I've not added any tests, because the problem was visible in: test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll, which I had to change: I don't think Cyclone has FullFP16 enabled by default, so it shouldn't be using this v8.2a instruction. I've also removed these rdar tags, please shout if there are any objections. Differential Revision: https://reviews.llvm.org/D43020 llvm-svn: 324581	2018-02-08 08:39:05 +00:00
Eugene Leviant	25347ea895	[LegalizeDAG] Truncate condition operand of ISD::SELECT Differential revision: https://reviews.llvm.org/D42737 llvm-svn: 324447	2018-02-07 05:38:29 +00:00
Craig Topper	6d9e090a64	[Mips][AMDGPU] Update test cases to not use vector lt/gt compares that can be simplified to an equality/inequality or to always true/false. For example 'ugt X, 0' can be simplified to 'ne X, 0'. Or 'uge X, 0' is always true. We already simplify this for scalars in SimplifySetCC, but we don't currently for vectors in SimplifySetCC. D42948 proposes to change that. llvm-svn: 324436	2018-02-07 00:51:37 +00:00
Sanjay Patel	ffe72034a6	[AArch64] add test to show sub-optimal isel; NFC llvm-svn: 324404	2018-02-06 21:25:02 +00:00
Amara Emerson	572f6cecf1	[AArch64][GlobalISel] Fix old use of % sigil in test. My rebase had missed the new $ sigil we're using. llvm-svn: 324051	2018-02-02 02:14:42 +00:00
Amara Emerson	58aea52bc4	[GlobalISel] Constrain the dest reg of IMPLICT_DEF. This fixes a crash where the user is a COPY, which deliberately does not constrain its source operands, resulting in a vreg without a reg class escaping selection. Differential Revision: https://reviews.llvm.org/D42697 llvm-svn: 324047	2018-02-02 01:44:43 +00:00

1 2 3 4 5 ...

2001 Commits