llvm-project

Author	SHA1	Message	Date
Amara Emerson	2b523f8162	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able to truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032	2019-04-09 21:22:33 +00:00
Simon Pilgrim	17586cda4a	[SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.). This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef. Differential Revision: https://reviews.llvm.org/D60006 llvm-svn: 357765	2019-04-05 14:56:21 +00:00
Piotr Sobczak	0376ac1d94	[SelectionDAG] Compute known bits of CopyFromReg Summary: Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if the virtual reg used has one def only. This can be particularly useful when calling isBaseWithConstantOffset() with the ISD::CopyFromReg argument, as more optimizations may get enabled in the result. Also add a missing truncation on X86, found by testing of this patch. Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa Reviewers: bogner, craig.topper, RKSimon Reviewed By: RKSimon Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59535 llvm-svn: 357745	2019-04-05 07:44:09 +00:00
Diana Picus	153c3887e4	[ARM GlobalISel] Support DBG_VALUE Make sure we can map and select DBG_VALUE. llvm-svn: 357681	2019-04-04 10:24:51 +00:00
David L. Jones	8b8a02175a	Revert r357452 - 'SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)' This revision causes tests to fail under ASAN. Since the cause of the failures is not clear (could be ASAN, could be a Clang bug, could be a bug in this revision), the safest course of action seems to be to revert while investigating. llvm-svn: 357667	2019-04-04 02:27:57 +00:00
Sanjay Patel	00dae6b22d	[DAGCombiner] loosen restrictions for moving shuffles after vector binop There are 3 changes to make this correspond to the same transform in instcombine: 1. Remove the legality check - we can't create anything less legal than we started with. 2. Ease the use restriction, so we only bail out if both operands have >1 use. 3. Ease the use restriction for binops with a repeated operand (eg, mul x, x). As discussed in D60150, there's a scalarization opportunity that will be made easier by allowing this transform more generally. llvm-svn: 357580	2019-04-03 13:42:06 +00:00
Hans Wennborg	b669fea42f	SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The code was previously checking that candidates for sinking had exactly one use or were a store instruction (which can't have uses). This meant we could sink call instructions only if they had a use. That limitation seemed a bit arbitrary, so this patch changes it to "instruction has zero or one use" which seems more natural and removes the need to special-case stores. Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 357452	2019-04-02 08:01:38 +00:00
Eli Friedman	3813fe0bda	[ARM] Optimize expressions like "return x != 0;" for Thumb1. There's an existing optimization for x != C, but somehow it was missing a special case for 0. While I'm here, also cleaned up the code/comments a bit: the second value produced by the MERGE_VALUES was actually dead, since a CMOV only produces one result. Differential Revision: https://reviews.llvm.org/D59616 llvm-svn: 357437	2019-04-02 00:01:23 +00:00
Eli Friedman	73af6ef2e7	[ARM] Don't try to create "push {r12, lr}" in Thumb1 at -Oz. It's a little tricky to make this issue show up because prologue/epilogue emission normally likes to push at least two registers... but it doesn't when lr is force-spilled due to function length. Not sure if that really makes sense, but I decided not to touch it for now. Differential Revision: https://reviews.llvm.org/D59385 llvm-svn: 357436	2019-04-01 23:55:57 +00:00
Simon Pilgrim	a3fb3d5583	[ARM] Regenerate execute-only float comparison tests Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) llvm-svn: 357293	2019-03-29 18:21:19 +00:00
Nirav Dave	fe59e14031	[DAGCombine] Prune unused nodes. Summary: Nodes that have no uses are eventually pruned when they are selected from the worklist. Record nodes newly added to the worklist or DAG and perform pruning after every combine attempt. Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight Reviewed By: jyknight Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58070 llvm-svn: 357283	2019-03-29 17:35:56 +00:00
Simon Pilgrim	b4b98a528b	[ARM] Regenerate vector comparison tests Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) llvm-svn: 357281	2019-03-29 17:35:11 +00:00
Diana Picus	13ef0c5309	[ARM GlobalISel] Run regbankselect test for Thumb. NFCI This should just work, since ARM mode and Thumb2 mode are at the same level of support now and should map the same to GPR and FPR. llvm-svn: 357159	2019-03-28 10:57:29 +00:00
Diana Picus	52495c472f	[ARM GlobalISel] Fix G_STORE with s1 G_STORE for 1-bit values uses a STRBi12, which stores the whole byte. Zero out the undefined bits before writing. llvm-svn: 357154	2019-03-28 09:09:36 +00:00
Diana Picus	4d512df300	[ARM GlobalISel] Fix selection of G_SELECT G_SELECT uses a 1-bit scalar for the condition, and is currently implemented with a plain CMPri against 0. This means that values such as 0x1110 are interpreted as true, when instead the higher bits should be treated as undefined and therefore ignored. Replace the CMPri with a TSTri against 0x1, which performs an implicit AND, yielding the expected result. llvm-svn: 357153	2019-03-28 09:09:27 +00:00
Nirav Dave	c6dfaa0e83	Revert r356996 "[DAG] Avoid smart constructor-based dangling nodes." This patch appears to trigger very large compile time increases in halide builds. llvm-svn: 357116	2019-03-27 19:54:41 +00:00
Eli Friedman	c388bfa230	[ARM] Don't confuse the scheduler for very large VLDMDIA etc. ARMBaseInstrInfo::getNumLDMAddresses is making bad assumptions about the memory operands of load and store-multiple operations. This doesn't really fix the problem properly, but it's enough to prevent crashing, at least. Fixes https://bugs.llvm.org/show_bug.cgi?id=41231 . Differential Revision: https://reviews.llvm.org/D59834 llvm-svn: 357109	2019-03-27 18:33:30 +00:00
Nirav Dave	a28c514581	[DAG] Avoid smart constructor-based dangling nodes. Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations or not fully pruning unused result values. This can result in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter as the current node deleter to make sure such nodes have the chance of being pruned. Many minor changes, mostly positive. llvm-svn: 356996	2019-03-26 15:08:14 +00:00
Diana Picus	254b11a0fd	[ARM GlobalISel] 64-bit memops should be aligned We currently use only VLDR/VSTR for all 64-bit loads/stores, so the memory operands must be word-aligned. Mark aligned operations as legal and narrow non-aligned ones to 32 bits. While we're here, also mark non-power-of-2 loads/stores as unsupported. llvm-svn: 356872	2019-03-25 08:54:29 +00:00
Eli Friedman	b906bba576	[ARM] Don't form "ands" when it isn't scheduled correctly. In r322972/r323136, the iteration here was changed to catch cases at the beginning of a basic block... but we accidentally deleted an important safety check. Restore that check to the way it was. Fixes https://bugs.llvm.org/show_bug.cgi?id=41116 Differential Revision: https://reviews.llvm.org/D59680 llvm-svn: 356809	2019-03-22 20:49:15 +00:00
Evandro Menezes	4a7739b681	[AArch64, ARM] Add support for Exynos M5 Add Exynos M5 support and test cases. llvm-svn: 356793	2019-03-22 18:42:14 +00:00
Eli Friedman	638be660d7	[ARM] Eliminate redundant "mov rN, sp" instructions in Thumb1. This takes sequences like "mov r4, sp; str r0, [r4]", and optimizes them to something like "str r0, [sp]". For regular stack variables, this optimization was already implemented: we lower loads and stores using frame indexes, which are expanded later. However, when constructing a call frame for a call with more than four arguments, the existing optimization doesn't apply. We need to use stores which are actually relative to the current value of sp, and don't have an associated frame index. This patch adds a special case to handle that construct. At the DAG level, this is an ISD::STORE where the address is a CopyFromReg from SP (plus a small constant offset). This applies only to Thumb1: in Thumb2 or ARM mode, a regular store instruction can access SP directly, so the COPY gets eliminated by existing code. The change to ARMDAGToDAGISel::SelectThumbAddrModeSP is a related cleanup: we shouldn't pretend that it can select anything other than frame indexes. Differential Revision: https://reviews.llvm.org/D59568 llvm-svn: 356601	2019-03-20 19:40:45 +00:00
Matt Arsenault	c2e35a6f32	RegAllocFast: Remove early selection loop, the spill calculation will report cost 0 anyway for free regs The 2nd loop calculates spill costs but reports free registers as cost 0 anyway, so there is little benefit from having a separate early loop. Surprisingly this is not NFC, as many registers are marked regDisabled so the first loop often picks up later registers unnecessarily instead of the first one available in the allocation order... Patch by Matthias Braun llvm-svn: 356499	2019-03-19 19:01:34 +00:00
Eli Friedman	68d9a60573	[ARM] Add MachineVerifier logic for some Thumb1 instructions. tMOVr and tPUSH/tPOP/tPOP_RET have register constraints which can't be expressed in TableGen, so check them explicitly. I've unfortunately run into issues with both of these recently; hopefully this saves some time for someone else in the future. Differential Revision: https://reviews.llvm.org/D59383 llvm-svn: 356303	2019-03-15 21:44:49 +00:00
Sam Parker	f82d4ed771	[ARM] Remove EarlyCSE from backend There is an issue with early CSE hitting an assert, so temporarily remove the pass from the Arm backend. Bug: https://bugs.llvm.org/show_bug.cgi?id=41081 Differential Revision: https://reviews.llvm.org/D59410 llvm-svn: 356259	2019-03-15 13:36:37 +00:00
Simon Pilgrim	22bebcbbbf	[ARM] Remove icmp undef from reduced tests Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @efriedma (Eli Friedman) llvm-svn: 356252	2019-03-15 11:14:59 +00:00
Sam Parker	9e73020bfa	[ARM][ParallelDSP] Disable for big-endian Bail early when we don't have a preheader and also if the target is big endian because it's written with only little endian in mind! Differential Revision: https://reviews.llvm.org/D59368 llvm-svn: 356243	2019-03-15 10:19:32 +00:00
Sam Parker	0a833d0ad2	[NFC][ARM] Update test Change some regex to handle commutable instructions. llvm-svn: 356159	2019-03-14 15:36:54 +00:00
Matt Arsenault	4e3e4016bf	ARM: Add ImmArg to intrinsics I found these by asserting in clang for any GCCBuiltin that doesn't require mangling and requires a constant for the builtin. This means that intrinsics are missing which don't use GCCBuiltin, don't have builtins defined in clang, or were missing the constant annotation in the builtin definition. llvm-svn: 356144	2019-03-14 13:46:14 +00:00
Sam Parker	4c4ff13d3c	[ARM][ParallelDSP] Enable multiple uses of loads When choosing whether a pair of loads can be combined into a single wide load, we check that the load only has a sext user and that sext also only has one user. But this can prevent the transformation in the cases when parallel macs use the same loaded data multiple times. To enable this, we need to fix up any other uses after creating the wide load: generating a trunc and a shift + trunc pair to recreate the narrow values. We also need to keep a record of which loads have already been widened. Differential Revision: https://reviews.llvm.org/D59215 llvm-svn: 356132	2019-03-14 11:14:13 +00:00
Sam Parker	3b2ba20afd	[ARM] Run ARMParallelDSP in the IRPasses phase Run EarlyCSE before ParallelDSP and do this in the backend IR opt phase. Differential Revision: https://reviews.llvm.org/D59257 llvm-svn: 356130	2019-03-14 10:57:40 +00:00
Nirav Dave	d6351340bb	[DAGCombiner] If a TokenFactor would be merged into its user, consider the user later. Summary: A number of optimizations are inhibited by single-use TokenFactors not being merged into the TokenFactor using it. This makes us consider whether we can do the merge immediately. Most test changes here are due to the change in visitation causing minor reorderings and associated reassociation of paired memory operations. CodeGen tests with non-reordering changes: X86/aligned-variadic.ll -- memory-based add folded into stored leaq value. X86/constant-combiners.ll -- Optimizes out overlap between stores. X86/pr40631_deadstore_elision -- folds constant byte store into preceding quad word constant store. Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet Reviewed By: courbet Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59260 llvm-svn: 356068	2019-03-13 17:07:09 +00:00
Nikita Popov	149bc099f6	[SDAG] Expand pow2 mulo using shifts Expand MULO with constant power of two operand into a shift. The overflow is checked with (x << shift) >> shift == x, where the right shift will be logical for umulo and arithmetic for smulo (with exception for multiplications by signed_min). Differential Revision: https://reviews.llvm.org/D59041 llvm-svn: 355937	2019-03-12 16:57:25 +00:00
Sam Parker	a7ae60ac93	[ARM][NFC] Delete original smlad tests Because I don't understand svn. llvm-svn: 355908	2019-03-12 11:06:15 +00:00
Sam Parker	28e46e58db	[ARM][NFC] Move smlad tests Created a test/CodeGen/ARM/ParallelDSP folder. llvm-svn: 355907	2019-03-12 11:01:11 +00:00
Nikita Popov	506c1aba4d	[ARM] Use non-constant operand in umulo-32.ll; NFC Currently the store+load is folded and both operands of the umulo end up being constants. To avoid this getting folded away entirely, make sure at least one operand is non-constant. Also remove some allocas which don't seem relevant to the test. llvm-svn: 355776	2019-03-09 13:43:21 +00:00
Nikita Popov	74dde7e5a1	[ARM] Generate test checks for umulo-32.ll; NFC The second test case is going to be changed by D59041, so generate full baseline checks. llvm-svn: 355775	2019-03-09 13:21:15 +00:00
David Green	ffc922ec35	[LSR] Attempt to increase the accuracy of LSR's setup cost In some loops, we end up generating loop induction variables that look like: {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1} As opposed to the simpler: {(zext i16 (%i0 * %i1) to i32),+,-1} i.e. we count up from -limit to 0, not the simpler counting down from limit to 0. This is because the scores, as LSR calculates them, are the same and the second is filtered in place of the first. We end up with a redundant SUB from 0 in the code. This patch tries to make the calculation of the setup cost a little more thorough, recursing into the scev members to better approximate the setup required. The cost function for comparing LSR costs is: return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds, C1.ScaleCost, C1.ImmCost, C1.SetupCost) < std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds, C2.ScaleCost, C2.ImmCost, C2.SetupCost); So this will only alter results if none of the other variables turn out to be different. Differential Revision: https://reviews.llvm.org/D58770 llvm-svn: 355597	2019-03-07 13:44:40 +00:00
Oliver Stannard	4a9086b537	[ARM] Fix select_cc lowering for fp16 When lowering a select_cc node where the true and false values are of type f16, we can't use a general conditional move because the FP16 instructions do not support conditional execution. Instead, we must ensure that the condition code is one of the four supported by the VSEL instruction. Differential revision: https://reviews.llvm.org/D58813 llvm-svn: 355385	2019-03-05 10:42:34 +00:00
Oliver Stannard	181afc7f3b	[ARM] Fix selection of VLDR.16 instruction with imm offset The isScaledConstantInRange function takes upper and lower bounds which are checked after dividing by the scale, so the bounds checks for half, single and double precision should all be the same. Previously, we had wrong bounds checks for half precision, so selected an immediate the instructions can't actually represent. Differential revision: https://reviews.llvm.org/D58822 llvm-svn: 355305	2019-03-04 09:17:38 +00:00
Oliver Stannard	82fbbc21fd	[ARM] Fix FP16 stack loads/stores for Thumb2 with frame pointer The new addressing mode added for the v8.2A FP16 instructions uses bit 8 of the immediate to encode the sign of the offset, like the other FP loads/stores, so it needs to be treated the same way. Differential revision: https://reviews.llvm.org/D58816 llvm-svn: 355201	2019-03-01 14:20:28 +00:00
Oliver Stannard	e019e6223b	[ARM] Consider undefined-on-NaN conditions in checkVSELConstraints This function was not checking for the condition code variants which are undefined if either input is NaN, so we were missing selection of the VSEL instruction in some cases when using -fno-honor-nans or -ffast-math. Differential revision: https://reviews.llvm.org/D58812 llvm-svn: 355199	2019-03-01 13:58:25 +00:00
Diana Picus	54829ec5d0	[ARM GlobalISel] Support G_CTLZ for Thumb2 Same as ARM mode but with different opcode. llvm-svn: 355191	2019-03-01 10:12:28 +00:00
Diana Picus	afb3398da0	[ARM GlobalISel] Check target flags in test. NFCI There was a time when we couldn't dump target-specific flags such as arm-sbrel etc, so the tests didn't check for them. We can now be more specific in our tests. llvm-svn: 355189	2019-03-01 10:01:22 +00:00
Diana Picus	3b7beafc77	[ARM GlobalISel] Support global variables for Thumb2 Add the same level of support as for ARM mode (i.e. still no TLS support). In most cases, it is sufficient to replace the opcodes with the t2-equivalent, but there are some idiosyncrasies that I decided to preserve because I don't understand the full implications: * For ARM we use LDRi12 to load from constant pools, but for Thumb we use t2LDRpci (I'm not sure if the ideal would be to use t2LDRi12 for Thumb as well, or to use LDRcp for ARM). * For Thumb we don't have an equivalent for MOV|LDRLIT_ga_pcrel_ldr, so we have to generate MOV|LDRLIT_ga_pcrel plus a load from GOT. The tests are in separate files because they're hard enough to read even without doubling the number of checks. llvm-svn: 355077	2019-02-28 10:42:47 +00:00
Luke Cheeseman	9e285bef2b	[ARM] Add Cortex-M35P - Add LLVM backend support for Cortex-M35P - Documentation can be found at https://developer.arm.com/products/processors/cortex-m/cortex-m35p Differential Revision: https://reviews.llvm.org/D57763 llvm-svn: 354868	2019-02-26 12:02:12 +00:00
David Green	b504f104b2	[ARM] Add some more missing T1 opcodes for the peephole optimiser This adds a few extra Thumb1 opcodes to improve the peephole optimiser's ability to remove redundant cmp instructions. tADC and tSBC require a small fixup to prevent MOVS being moved past the instruction, giving the wrong flags. Differential Revision: https://reviews.llvm.org/D58281 llvm-svn: 354791	2019-02-25 15:50:54 +00:00
Dmitri Gribenko	a3a3964f98	Fixed typos in tests: s/CHEKC/CHECK/ Reviewers: ilya-biryukov Subscribers: nemanjai, javed.absar, jsji, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D58611 llvm-svn: 354785	2019-02-25 13:41:59 +00:00
Dmitri Gribenko	751c5fbf6a	Fixed typos in tests: s/CEHCK/CHECK/ Reviewers: ilya-biryukov Subscribers: sanjoy, sdardis, javed.absar, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58608 llvm-svn: 354781	2019-02-25 13:12:33 +00:00
Simon Tatham	b70fc0c5fd	[ARM] Make fullfp16 instructions not conditionalisable. More or less all the instructions defined in the v8.2a full-fp16 extension are defined as UNPREDICTABLE if you put them in an IT block (Thumb) or use with any condition other than AL (ARM). LLVM didn't know that, and was happy to conditionalise them. In order to force these instructions to count as not predicable, I had to make a small Tablegen change. The code generation back end mostly decides if an instruction was predicable by looking for something it can identify as a predicate operand; there's an isPredicable bit flag that overrides that check in the positive direction, but nothing that overrides it in the negative direction. (I considered the alternative approach of actually removing the predicate operand from those instructions, but thought that it would be more painful overall for instructions differing only in data type to have different shapes of operand list. This way, the only code that has to notice the difference is the if-converter.) So I've added an isUnpredicable bit alongside isPredicable, and set that bit on the right subset of FP16 instructions, and also on the VSEL, VMAXNM/VMINNM and VRINT[ANPM] families which should be unpredicable for all data types. I've included a couple of representative regression tests, both of which previously caused an fp16 instruction to be conditionalised in ARM state and (with -arm-no-restrict-it) to be put in an IT block in Thumb. Reviewers: SjoerdMeijer, t.p.northover, efriedma Reviewed By: efriedma Subscribers: jdoerfert, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57823 llvm-svn: 354768	2019-02-25 10:39:53 +00:00
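
Note on commit 149bc099f6 ([SDAG] Expand pow2 mulo using shifts): the overflow check quoted in that message, (x << shift) >> shift == x, can be illustrated with a small model. The following is only a sketch in Python, not the SelectionDAG code; the 32-bit width and the names BITS, MASK, to_signed, umulo_pow2 and smulo_pow2 are assumptions made for illustration. The signed variant deliberately leaves out multiplication by signed_min, which the commit message lists as an exception.

```python
BITS = 32                      # assumed operand width for this sketch
MASK = (1 << BITS) - 1


def to_signed(v):
    """Reinterpret the low BITS bits of v as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def umulo_pow2(x, shift):
    """Unsigned x * 2**shift; returns (wrapped result, overflow flag)."""
    x &= MASK
    res = (x << shift) & MASK
    # Logical right shift: overflow occurred if shifting back loses bits.
    return res, (res >> shift) != x


def smulo_pow2(x, shift):
    """Signed x * 2**shift for 0 <= shift < BITS - 1 (signed_min excluded)."""
    res = to_signed(x << shift)
    # Python's >> on a negative int is an arithmetic shift.
    return res, (res >> shift) != x
```

For example, umulo_pow2(0x80000000, 1) wraps to 0 and reports overflow, while smulo_pow2(-1, 1) yields -2 without overflow.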

3673 Commits