llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	ea25eca04a	[AArch64] Extend redundant copy elimination pass to handle non-zero stores. This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle non-zero cases such as: BB#0: cmp x0, #1 b.eq .LBB0_1 .LBB0_1: orr x0, xzr, #0x1 ; <-- redundant copy; x0 known to hold #1. Differential Revision: https://reviews.llvm.org/D29344 llvm-svn: 296809	2017-03-02 20:48:11 +00:00
Vadzim Dambrouski	eafb805506	[MSP430] Add SRet support to MSP430 target This patch adds support for struct return values to the MSP430 target backend. It also reverses the order of argument and return registers in the calling convention to bring it into closer alignment with the published EABI from TI. Patch by Andrew Wygle (awygle). Differential Revision: https://reviews.llvm.org/D29069 llvm-svn: 296807	2017-03-02 20:25:10 +00:00
Evgeny Stupachenko	d655ec56c3	The patch fixes r296770 Summary: Extend -unroll-partial-threshold to 200 for runtime-loop3.ll test as epilogue unroll initially add 1 more IV to the loop. From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 296803	2017-03-02 19:41:38 +00:00
Simon Pilgrim	b3067dc374	[X86][MMX] Fixed i32 extraction on 32-bit targets MMX extraction often ends up as extract_i32(bitcast_v2i32(extract_i64(bitcast_v1i64(x86mmx v), 0)), 0) which fails to simplify on 32-bit targets as i64 isn't legal llvm-svn: 296782	2017-03-02 18:56:06 +00:00
Krzysztof Parzyszek	056c945a5d	[Hexagon] Skip blocks that define vector predicate registers in early-if llvm-svn: 296777	2017-03-02 18:10:59 +00:00
Krzysztof Parzyszek	fcbb7d10fe	[Hexagon] Properly handle 'q' constraint in 128-byte vector mode llvm-svn: 296772	2017-03-02 17:50:24 +00:00
Nemanja Ivanovic	db8425eff0	[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size This patch reduces the stack frame size by not allocating the parameter area if it is not required. In the current implementation LowerFormalArguments_64SVR4 already handles the parameter area, but LowerCall_64SVR4 does not (when calculating the stack frame size). What this patch does is make LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4. Committing on behalf of Hiroshi Inoue. Differential Revision: https://reviews.llvm.org/D29881 llvm-svn: 296771	2017-03-02 17:38:59 +00:00
Evgeny Stupachenko	21bef2cb3c	The patch turns on epilogue unroll for loops with constant recurrence start. Summary: Set unroll remainder to epilog if a loop contains a phi with constant parameter: loop: pn = phi [Const, PreHeader], [pn.next, Latch] ... Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D27004 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 296770	2017-03-02 17:38:46 +00:00
Sanjay Patel	fffa179837	[DAGCombiner] avoid assertion when folding binops with opaque constants This bug was introduced with: https://reviews.llvm.org/rL296699 There may be a way to loosen the restriction, but for now just bail out on any opaque constant. The tests show that opacity is target-specific. This goes back to cost calculations in ConstantHoisting based on TTI->getIntImmCost(). llvm-svn: 296768	2017-03-02 17:18:56 +00:00
Geoff Berry	484d756583	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline." This re-applies r289696, which caused TSan perf regression, which has since been addressed in separate changes (see PR for details). See PR31382. llvm-svn: 296759	2017-03-02 16:16:47 +00:00
Tim Northover	e80d6d1360	GlobalISel: record correct stack usage for signext parameters. The CallingConv.td rules allocate 8 bytes for these kinds of arguments on AAPCS targets, but we were only recording the smaller amount. The difference is theoretical on AArch64 because we don't actually store more than the smaller amount, but it's still much better to have these two components in agreement. Based on Diana Picus's ARM equivalent patch (where it matters a lot more). llvm-svn: 296754	2017-03-02 15:34:18 +00:00
Bjorn Pettersson	e5027cfbcc	[InstCombine] Avoid faulty combines of select-cmp-br Summary: When InstCombine is optimizing certain select-cmp-br patterns it replaces the result of the select in uses outside of the basic block containing the select. This is only legal if the path from the select to the outside use is disjoint from all other paths out from the originating basic block. The problem found was that InstCombiner::replacedSelectWithOperand did not consider the case when both edges out from the br pointed to the same label. In that case the paths aren't disjoint and the transformation is illegal. This patch avoids the faulty rewrites by verifying that there is a single flow to the successor where we want to replace uses. Reviewers: llvm-commits, spatel, majnemer Differential Revision: https://reviews.llvm.org/D30455 llvm-svn: 296752	2017-03-02 15:18:58 +00:00
Matthew Simpson	aee9771ae2	[ARM/AArch64] Update costs for interleaved accesses with wide types After r296750, we're able to match interleaved accesses having types wider than 128 bits. This patch updates the associated TTI costs. Differential Revision: https://reviews.llvm.org/D29675 llvm-svn: 296751	2017-03-02 15:15:35 +00:00
Matthew Simpson	1bfa159db9	[ARM/AArch64] Support wide interleaved accesses This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types to interleaved access intrinsics as long as the types are multiples of the vector register width. A "wide" access will now be mapped to multiple interleave intrinsics similar to the way in which non-interleaved accesses with illegal types are legalized into multiple accesses. I'll update the associated TTI costs (in getInterleavedMemoryOpCost) as a follow-on. Differential Revision: https://reviews.llvm.org/D29466 llvm-svn: 296750	2017-03-02 15:11:20 +00:00
Matthew Simpson	455c2ee394	[LV] Consider non-consecutive but vectorizable accesses for VF selection When computing the smallest and largest types for selecting the maximum vectorization factor, we currently ignore loads and stores of pointer types if the memory access is non-consecutive. We do this because such accesses must be scalarized regardless of vectorization factor, and thus shouldn't be considered when determining the factor. This patch makes this check less aggressive by also considering non-consecutive accesses that may be vectorized, such as interleaved accesses. Because we don't know at the time of the check if an access will certainly be vectorized (this is a cost model decision given a particular VF), we consider all accesses that can potentially be vectorized. Differential Revision: https://reviews.llvm.org/D30305 llvm-svn: 296747	2017-03-02 13:55:05 +00:00
Andrew V. Tischenko	2855dc7ddc	Added special test covering a problem with PIC relocation model on SLM architecture. The fix will come in D26855. llvm-svn: 296746	2017-03-02 13:47:03 +00:00
Serge Pavlov	e2bf69715f	Do not verify MachineDominatorTree if it is not calculated If dominator tree is not calculated or is invalidated, set corresponding pointer in the pass state to nullptr. Such pointer value will indicate that operations with dominator tree are not allowed. In particular, it allows skipping verification for such pass state. The dominator tree is not calculated if the machine dominator pass was skipped; this occurs in the case of entities with linkage available_externally. The change fixes some test failures observed when expensive checks are enabled. Differential Revision: https://reviews.llvm.org/D29280 llvm-svn: 296742	2017-03-02 12:00:10 +00:00
Peter Collingbourne	ab76a19afb	LTO: When creating a local cache, create the cache directory if it does not already exist. Differential Revision: https://reviews.llvm.org/D30519 llvm-svn: 296726	2017-03-02 02:02:38 +00:00
Matthias Braun	dbcf9e2ee4	LiveRegMatrix: Fix some subreg interference checks Surprisingly, one of the three interference checks in LiveRegMatrix was using the main live range instead of the appropriate subregister range resulting in unnecessarily conservative results. llvm-svn: 296722	2017-03-02 00:35:08 +00:00
Eli Friedman	933863ce61	Revert r296708; causing test failures on ARM hosts. Original commit message: [ARM] Fix insert point for store rescheduling. In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would sink stores for no reason. llvm-svn: 296718	2017-03-02 00:08:50 +00:00
Amaury Sechet	71f511fd1e	[DAGCombiner] mulhi + 1 never overflows. Summary: This can be used to optimize large multiplications after legalization. Depends on D29565 Reviewers: mkuper, spatel, RKSimon, zvi, bkramer, aaboud, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29587 llvm-svn: 296711	2017-03-01 23:44:17 +00:00
Ahmed Bougacha	120ae22d70	[GlobalISel] Add a way for targets to enable GISel. Until now, we've had to use -global-isel to enable GISel. But using that on other targets that don't support it will result in an abort, as we can't build a full pipeline. Additionally, we want to experiment with enabling GISel by default for some targets: we can't just enable GISel by default, even among those target that do have some support, because the level of support varies. This first step adds an override for the target to explicitly define its level of support. For AArch64, do that using a new command-line option (I know..): -aarch64-enable-global-isel-at-O=<N> Where N is the opt-level below which GISel should be used. Default that to -1, so that we still don't enable GISel anywhere. We're not there yet! While there, remove a couple LLVM_UNLIKELYs. Building the pipeline is such a cold path that in practice that shouldn't matter at all. llvm-svn: 296710	2017-03-01 23:33:08 +00:00
Amaury Sechet	683f5743f6	Improve mulhi overflow test. NFC llvm-svn: 296709	2017-03-01 23:31:19 +00:00
Eli Friedman	1c9216b003	[ARM] Fix insert point for store rescheduling. In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would sink stores for no reason. Differential Revision: https://reviews.llvm.org/D30124 llvm-svn: 296708	2017-03-01 23:20:29 +00:00
Eli Friedman	28c2c0e311	[ARM] Check correct instructions for load/store rescheduling. This code starts from the high end of the sorted vector of offsets, and works backwards: it tries to find contiguous offsets, process them, then pops them from the end of the vector. Most of the code agrees with this order of processing, but one loop doesn't: it instead processes elements from the low end of the vector (which are nodes with unrelated offsets). Fix that loop to process the correct elements. This has a few implications. One, we don't incorrectly return early when processing multiple groups of offsets in the same block (which allows rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert point for loads, so they're correctly sorted (which affects the scheduling of vldm-liveness.ll). I think it might also impact some of the heuristics slightly. Differential Revision: https://reviews.llvm.org/D30368 llvm-svn: 296701	2017-03-01 22:56:20 +00:00
Sanjay Patel	92938657a0	[DAGCombiner] fold binops with constant into select-of-constants This is part of the ongoing attempt to improve select codegen for all targets and select canonicalization in IR (see D24480 for more background). The transform is a subset of what is done in InstCombine's FoldOpIntoSelect(). I first noticed a regression in the x86 avx512-insert-extract.ll tests with a patch that hopes to convert more selects to basic math ops. This appears to be a general missing DAG transform though, so I added tests for all standard binops in rL296621 (PowerPC was chosen semi-randomly; it has scripted FileCheck support, but so do ARM and x86). The poor output for "sel_constants_shl_constant" is tracked with: https://bugs.llvm.org/show_bug.cgi?id=32105 Differential Revision: https://reviews.llvm.org/D30502 llvm-svn: 296699	2017-03-01 22:51:31 +00:00
Reid Kleckner	d80b69fa3b	[Constant Hoisting] Avoid inserting instructions before EH pads Now that terminators can be EH pads, this code needs to iterate over the immediate dominators of the EH pad to find a valid insertion point. Fix for PR32107 Patch by Robert Olliff! Differential Revision: https://reviews.llvm.org/D30511 llvm-svn: 296698	2017-03-01 22:41:12 +00:00
Amaury Sechet	250b4a7491	Add test case for mulhi's overflow. NFC llvm-svn: 296696	2017-03-01 22:27:21 +00:00
Reid Kleckner	f7c0980c10	Elide argument copies during instruction selection Summary: Avoids tons of prologue boilerplate when arguments are passed in memory and left in memory. This can happen in a debug build or in a release build when an argument alloca is escaped. This will dramatically affect the code size of x86 debug builds, because X86 fast isel doesn't handle arguments passed in memory at all. It only handles the x86_64 case of up to 6 basic register parameters. This is implemented by analyzing the entry block before ISel to identify copy elision candidates. A copy elision candidate is an argument that is used to fully initialize an alloca before any other possibly escaping uses of that alloca. If an argument is a copy elision candidate, we set a flag on the InputArg. If the target generates loads from a fixed stack object that matches the size and alignment requirements of the alloca, the SelectionDAG builder will delete the stack object created for the alloca and replace it with the fixed stack object. The load is left behind to satisfy any remaining uses of the argument value. The store is now dead and is therefore elided. The fixed stack object is also marked as mutable, as it may now be modified by the user, and it would be invalid to rematerialize the initial load from it. Supersedes D28388 Fixes PR26328 Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29668 llvm-svn: 296683	2017-03-01 21:42:00 +00:00
Sanjay Patel	3063affbed	[InstCombine] use -instnamer and auto-generate complete checks; NFC llvm-svn: 296673	2017-03-01 20:59:56 +00:00
Sanjay Patel	f8edc3e870	[x86] add vector tests for more coverage of D30502; NFC llvm-svn: 296671	2017-03-01 20:31:23 +00:00
Nemanja Ivanovic	b223cfabcc	Improve scheduling with branch coalescing This patch adds a MachineSSA pass that coalesces blocks that branch on the same condition. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D28249 llvm-svn: 296670	2017-03-01 20:29:34 +00:00
Nirav Dave	0a4703b5ec	[DAG] Prevent Stale nodes from entering worklist Add check that deleted nodes do not get added to worklist. This can occur when a node's operand is simplified to an existing node. This fixes PR32108. Reviewers: jyknight, hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30506 llvm-svn: 296668	2017-03-01 20:19:38 +00:00
Nirav Dave	3de7fce3ac	Add test cases for merging stores of multiply used stores llvm-svn: 296667	2017-03-01 20:18:14 +00:00
Daniel Berlin	283a60875e	NewGVN: Add debug counter for value numbering llvm-svn: 296665	2017-03-01 19:59:26 +00:00
Paul Robinson	8932d64891	[DWARF] Print leading zeros in type signature llvm-svn: 296663	2017-03-01 19:43:29 +00:00
Hans Wennborg	cc4ff78c9d	Revert r296575 "[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available" It caused miscompiles, e.g. in Chromium (PR32109). llvm-svn: 296654	2017-03-01 18:57:16 +00:00
Paul Robinson	91d74813a6	[DWARF] Default lower bound should respect requested DWARF version. DWARF may define a default lower-bound for arrays in languages defined in a particular DWARF version. But the logic to suppress an unnecessary lower-bound attribute was looking at the hard-coded default DWARF version, not the version that had been requested. Also updated the list with all languages defined in DWARF v5. Differential Revision: http://reviews.llvm.org/D30484 llvm-svn: 296652	2017-03-01 18:32:37 +00:00
Artur Pilipenko	e1b2d31468	[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336). Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 296651	2017-03-01 18:12:29 +00:00
Krzysztof Parzyszek	5f4dedffd4	[Hexagon] Fix testcase accidentally broken by r296645 llvm-svn: 296647	2017-03-01 17:53:42 +00:00
Krzysztof Parzyszek	8f23dd6d68	[Hexagon] Fix lowering of formal arguments of type i1 On Hexagon, values of type i1 are passed in registers of type i32, even though i1 is not a legal value for these registers. This is a special case and needs special handling to maintain consistency of the lowering information. This fixes PR32089. llvm-svn: 296645	2017-03-01 17:30:10 +00:00
Hans Wennborg	19c0be90f9	[GVNHoist] Don't hoist unsafe scalars at -Oz (PR31729) Based on Aditya Kumar's patch: Differential Revision: https://reviews.llvm.org/D29092 llvm-svn: 296642	2017-03-01 17:15:08 +00:00
Diana Picus	9c52309b37	[ARM] GlobalISel: Lower call params that need extensions Lower i1, i8 and i16 call parameters by extending them before storing them on the stack. Also make sure we encode the correct, extended size in the corresponding memory operand, and that we compute the correct stack size in the end. The latter is a bit more complicated because we used to compute the stack size in the getStackAddress method, based on the Size and Offset of the parameters. However, if the last parameter is sign extended, we'd be using the wrong, non-extended size, and we'd end up with a smaller stack than we need to hold the extended value. Instead of hacking this up based on the value of Size in getStackAddress, we move our stack size handling logic to assignArg, where we have access to the CCState which knows everything we could possibly want to know about the stack. This way we don't need to duplicate any knowledge or resort to any ugly hacks. On this same occasion, update the IRTranslator test to check the sizes of the stores everywhere, not just for sign extended parameters. llvm-svn: 296631	2017-03-01 15:35:14 +00:00
Sanjay Patel	88a1b8b466	[x86] auto-generate checks; NFC llvm-svn: 296629	2017-03-01 14:46:59 +00:00
Sanjay Patel	f0496a6a5c	[x86] regenerate checks; NFC llvm-svn: 296628	2017-03-01 14:41:57 +00:00
Igor Laevsky	b40152d5d1	[DeadStoreElimination] Check function modref behavior before considering memory clobbered Differential Revision: https://reviews.llvm.org/D29996 llvm-svn: 296625	2017-03-01 14:38:29 +00:00
Simon Dardis	fc261240b2	[mips] Drop unneeded REQUIRES line in test. NFCI rL296111 provides the proper fix. llvm-svn: 296622	2017-03-01 14:31:09 +00:00
Sanjay Patel	ffc6943011	[PPC] add tests for select-of-constants with binop; NFC llvm-svn: 296621	2017-03-01 14:26:49 +00:00
Igor Laevsky	37cba43604	[BasicAA] Take attributes into account when requesting modref info for a call site Differential Revision: https://reviews.llvm.org/D29989 llvm-svn: 296617	2017-03-01 13:19:51 +00:00
Alexey Bataev	4a45efa431	[SLP] Preserve IR flags when vectorizing horizontal reductions. Summary: The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar horizontal reduction instructions into vectors, just like for other vectorized instructions. It does not include IR propagation for extra arguments, we need to handle original scalar operations for extra args to propagate correct flags. Reviewers: mkuper, mzolotukhin, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30418 llvm-svn: 296614	2017-03-01 12:43:39 +00:00

43294 Commits