llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	0bf086f80f	LoopUnrollRuntime: Check for overflow in the trip count calculation. Fixes PR19823. llvm-svn: 211436	2014-06-21 13:46:25 +00:00
Benjamin Kramer	8dd637aa04	SCEVExpander: Fold constant PHIs harder. The logic below only understands proper IVs. PR20093. llvm-svn: 211433	2014-06-21 11:47:18 +00:00
Stepan Dyatkovskiy	6baeb8805c	Commited patch from Björn Steinbrink: Summary: Different range metadata can lead to different optimizations in later passes, possibly breaking the semantics of the merged function. So range metadata must be taken into consideration when comparing Load instructions. Thanks! llvm-svn: 211391	2014-06-20 19:11:56 +00:00
Karthik Bhat	e03a25da70	Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer. This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and vectorizes them as vector shuffles if they are profitable. These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86. Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 llvm-svn: 211339	2014-06-20 04:32:48 +00:00
Hans Wennborg	4dc895164a	Don't build switch lookup tables for dllimport or TLS variables We would previously put dllimport variables in switch lookup tables, which doesn't work because the address cannot be used in a constant initializer. This is basically the same problem that we have in PR19955. Putting TLS variables in switch tables also desn't work, because the address of such a variable is not constant. Differential Revision: http://reviews.llvm.org/D4220 llvm-svn: 211331	2014-06-20 00:38:12 +00:00
Jingyue Wu	37fcb5919d	[ValueTracking] Extend range metadata to call/invoke Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281	2014-06-19 16:50:16 +00:00
Dinesh Dwivedi	562fd7534c	Added instruction combine to transform few more negative values addition to subtraction (Part 1) This patch enables transforms for following patterns. (x + (~(y & c) + 1) --> x - (y & c) (x + (~((y >> z) & c) + 1) --> x - ((y>>z) & c) Differential Revision: http://reviews.llvm.org/D3733 llvm-svn: 211266	2014-06-19 10:36:52 +00:00
Dinesh Dwivedi	b62e52e1b5	Refactored and updated SimplifyUsingDistributiveLaws() to * Find factorization opportunities using identity values. * Find factorization opportunities by treating shl(X, C) as mul (X, shl(C)) * Keep NSW flag while simplifying instruction using factorization. This fixes PR19263. Differential Revision: http://reviews.llvm.org/D3799 llvm-svn: 211261	2014-06-19 08:29:18 +00:00
David Majnemer	6cf6c05322	InstCombine: Stop two transforms dueling InstCombineMulDivRem has: // Canonicalize (X+C1)CI -> XCI+C1CI. InstCombineAddSub has: // WX + YZ --> W (X+Z) iff W == Y These two transforms could fight with each other if C1CI would not fold away to something simpler than a ConstantExpr mul. The InstCombineMulDivRem transform only acted on ConstantInts until r199602 when it was changed to operate on all Constants in order to let it fire on ConstantVectors. To fix this, make this transform more careful by checking to see if we actually folded away C1CI. This fixes PR20079. llvm-svn: 211258	2014-06-19 07:14:33 +00:00
Nick Lewycky	8561a49c27	Move optimization of some cases of (A & C1)\|(B & C2) from instcombine to instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs. llvm-svn: 211252	2014-06-19 03:51:46 +00:00
Nick Lewycky	c961030ac2	Make instsimplify's analysis of icmp eq/ne use computeKnownBits to determine whether the icmp is always true or false. Patch by Suyog Sarda! llvm-svn: 211251	2014-06-19 03:35:49 +00:00
Matt Arsenault	a0050b0961	R600/SI: Add intrinsics for various math instructions. These will be used for custom lowering and for library implementations of various math functions, so it's useful to expose these as builtins. llvm-svn: 211247	2014-06-19 01:19:19 +00:00
Dinesh Dwivedi	657105e582	Fixed jump threading going to infinite loop. This patch add code to remove unreachable blocks from function as they may cause jump threading to stuck in infinite loop. Differential Revision: http://reviews.llvm.org/D3991 llvm-svn: 211103	2014-06-17 14:34:19 +00:00
Jingyue Wu	33bd53df7f	[InstCombine] mark ADD with nuw if no unsigned overflow Summary: As a starting step, we only use one simple heuristic: if the sign bits of both a and b are zero, we can prove "add a, b" do not unsigned overflow, and thus convert it to "add nuw a, b". Updated all affected tests and added two new tests (@zero_sign_bit and @zero_sign_bit2) in AddOverflow.ll Test Plan: make check-all Reviewers: eliben, rafael, meheff, chandlerc Reviewed By: chandlerc Subscribers: chandlerc, llvm-commits Differential Revision: http://reviews.llvm.org/D4144 llvm-svn: 211084	2014-06-17 00:42:07 +00:00
Duncan P. N. Exon Smith	73686d305a	SROA: Only split loads on byte boundaries r199771 accidently broke the logic that makes sure that SROA only splits load on byte boundaries. If such a split happens, some bits get lost when reassembling loads of wider types, causing data corruption. Move the width check up to reject such splits early, avoiding the corruption. Fixes PR19250. Patch by: Björn Steinbrink <bsteinbr@gmail.com> llvm-svn: 211082	2014-06-17 00:19:35 +00:00
Eli Bendersky	ff90324599	Teach LoopUnrollPass to respect loop unrolling hints in metadata. [This is resubmitting r210721, which was reverted due to suspected breakage which turned out to be unrelated]. Some extra review comments were addressed. See D4090 and D4147 for more details. The Clang change that produces this metadata was committed in r210667 Patch by Mark Heffernan. llvm-svn: 211076	2014-06-16 23:53:02 +00:00
Jim Grosbach	fff5663d48	LowerSwitch: track bounding range for the condition tree. When LowerSwitch transforms a switch instruction into a tree of ifs it is actually performing a binary search into the various case ranges, to see if the current value falls into one cases range of values. So, if we have a program with something like this: switch (a) { case 0: do0(); break; case 1: do1(); break; case 2: do2(); break; default: break; } the code produced is something like this: if (a < 1) { if (a == 0) { do0(); } } else { if (a < 2) { if (a == 1) { do1(); } } else { if (a == 2) { do2(); } } } This code is inefficient because the check (a == 1) to execute do1() is not needed. The reason is that because we already checked that (a >= 1) initially by checking that also (a < 2) we basically already inferred that (a == 1) without the need of an extra basic block spawned to check if actually (a == 1). The patch addresses this problem by keeping track of already checked bounds in the LowerSwitch algorithm, so that when the time arrives to produce a Leaf Block that checks the equality with the case value / range the algorithm can decide if that block is really needed depending on the already checked bounds . For example, the above with "a = 1" would work like this: the bounds start as LB: NONE , UB: NONE as (a < 1) is emitted the bounds for the else path become LB: 1 UB: NONE. This happens because by failing the test (a < 1) we know that the value "a" cannot be smaller than 1 if we enter the else branch. After the emitting the check (a < 2) the bounds in the if branch become LB: 1 UB: 1. This is because by checking that "a" is smaller than 2 then the upper bound becomes 2 - 1 = 1. When it is time to emit the leaf block for "case 1:" we notice that 1 can be squeezed exactly in between the LB and UB, which means that if we arrived to that block there is no need to emit a block that checks if (a == 1). Patch by: Marcello Maggioni <hayarms@gmail.com> llvm-svn: 211038	2014-06-16 16:55:20 +00:00
Jingyue Wu	baabe5091c	Canonicalize addrspacecast ConstExpr between different pointer types As a follow-up to r210375 which canonicalizes addrspacecast instructions, this patch canonicalizes addrspacecast constant expressions. Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast cosntant expressions, this patch is also a step towards having the frontend emit canonicalized addrspacecasts. Piggyback a minor refactor in InstCombineCasts.cpp Update three affected tests in addrspacecast-alias.ll, access-non-generic.ll and constant-fold-gep.ll and added one new test in constant-fold-address-space-pointer.ll llvm-svn: 211004	2014-06-15 21:40:57 +00:00
Jiangning Liu	96e92c1d75	Move GlobalMerge from Transform to CodeGen. This patch is to move GlobalMerge pass from Transform/Scalar to CodeGen, because GlobalMerge depends on TargetMachine. In the mean time, the macro INITIALIZE_TM_PASS is also moved to CodeGen/Passes.h. With this fix we can avoid making libScalarOpts depend on libCodeGen. llvm-svn: 210951	2014-06-13 22:57:59 +00:00
Tim Northover	20b9f739eb	Atomics: make use of the "cmpxchg weak" instruction. This also simplifies the IR we create slightly: instead of working out where success & failure should go manually, it turns out we can just always jump to a success/failure block created for the purpose. Later phases will sort out the mess without much difficulty. llvm-svn: 210917	2014-06-13 16:45:52 +00:00
Tim Northover	d039abdeeb	Atomics: switch direction of cmpxchg comparison This has two benefits: it makes the result more suitable for direct insertaion into the struct to emulate the new cmpxchg, and it means the name we give the instruction matches its actual effect better. llvm-svn: 210916	2014-06-13 16:45:36 +00:00
Tim Northover	6bf04e4512	SCCP: update for cmpxchg returning { iN, i1 } now. I accidentally missed this one since its use looked OK locally. llvm-svn: 210909	2014-06-13 14:54:09 +00:00
Tim Northover	420a216817	IR: add "cmpxchg weak" variant to support permitted failure. This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903	2014-06-13 14:24:07 +00:00
Duncan P. N. Exon Smith	fd5c553f54	GVN: Enable value forwarding for calloc Enable value forwarding for loads from `calloc()` without an intervening store. This change extends GVN to handle the following case: %1 = tail call noalias i8* @calloc(i64 1, i64 4) %2 = bitcast i8* %1 to i32* ; This load is trivially constant zero %3 = load i32* %2, align 4 This is analogous to the handling for `malloc()` in the same places. `malloc()` returns `undef`; `calloc()` returns a zero value. Note that it is correct to return zero even for out of bounds GEPs since the result of such a GEP would be undefined. Patch by Philip Reames! llvm-svn: 210828	2014-06-12 21:16:19 +00:00
Eli Bendersky	dc6de2ce29	Revert r210721 as it causes breakage in internal builds (and possibly GDB). llvm-svn: 210807	2014-06-12 18:05:39 +00:00
Dinesh Dwivedi	95f0d51bd3	This removes TODO added in http://reviews.llvm.org/D3658 The patch transforms ABS(NABS(X)) -> ABS(X) NABS(ABS(X)) -> NABS(X) Differential Revision: http://reviews.llvm.org/D4040 llvm-svn: 210782	2014-06-12 14:06:00 +00:00
Eli Bendersky	899bef099f	Teach LoopUnrollPass to respect loop unrolling hints in metadata. See http://reviews.llvm.org/D4090 for more details. The Clang change that produces this metadata was committed in r210667 Patch by Mark Heffernan. llvm-svn: 210721	2014-06-11 23:15:35 +00:00
Chad Rosier	5ea14e09e9	[Reassociate] FileCheckize and cleanup a few testcases. No functional change intended. llvm-svn: 210685	2014-06-11 18:28:45 +00:00
Jiangning Liu	b2ae37fb67	Global merge for global symbols. This commit is to improve global merge pass and support global symbol merge. The global symbol merge is not enabled by default. For aarch64, we need some more back-end fix to make it really benifit ADRP CSE. llvm-svn: 210640	2014-06-11 06:44:53 +00:00
Jiangning Liu	3e5b855a51	Rename global-merge to enable-global-merge. llvm-svn: 210639	2014-06-11 06:35:26 +00:00
Juergen Ributzka	b2e4edb5c8	[ConstantHoisting][X86] Improve the cost model for small constants with large types (i64 and above). This improves the X86 cost model for small constants with large types. Before this commit we would even hoist trivial constants such as i96 2. This is related to <rdar://problem/17070936> llvm-svn: 210504	2014-06-10 00:32:29 +00:00
Alp Toker	d3d017cf00	Reduce verbiage of lit.local.cfg files We can just split targets_to_build in one place and make it immutable. llvm-svn: 210496	2014-06-09 22:42:55 +00:00
Matt Arsenault	44f60d0a60	Look through addrspacecasts when turning ptr comparisons into index comparisons. llvm-svn: 210488	2014-06-09 19:20:29 +00:00
Stepan Dyatkovskiy	297c826385	Added functions cross-reference test. Originally this similar was initiated by Björn Steinbrink here: http://reviews.llvm.org/D3437 Bug itself has been fixed by principal changes in MergeFunctions. Though special checks for functions merging are still actual. And the test has been accepted with slight modifications. llvm-svn: 210486	2014-06-09 19:03:02 +00:00
Jingyue Wu	5c7b1aed5d	[SeparateConstOffsetFromGEP] inbounds zext => sext for better splitting For each array index that is in the form of zext(a), convert it to sext(a) if we can prove zext(a) <= max signed value of typeof(a). The conversion helps to split zext(x + y) into sext(x) + sext(y). Reviewed in http://reviews.llvm.org/D4060 llvm-svn: 210444	2014-06-08 23:49:34 +00:00
Jingyue Wu	f2b85881d4	[SeparateConstOffsetFromGEP] make two tests more strict inbounds are not necessary in these two tests. zext(a +nuw b) = zext(a) + zext(b) should hold with or without inbounds. llvm-svn: 210437	2014-06-08 20:01:42 +00:00
Rafael Espindola	4ba22f0813	Revert 209903 and 210040. The messages were "PR19753: Optimize comparisons with "ashr exact" of a constanst." "Added support to optimize comparisons with "lshr exact" of a constant." They were not correctly handling signed/unsigned operation differences, causing pr19958. llvm-svn: 210393	2014-06-07 04:12:35 +00:00
Jingyue Wu	77145d9410	InstCombine: Canonicalize addrspacecast between different element types addrspacecast X addrspace(M)* to Y addrspace(N)* --> bitcast X addrspace(M)* to Y addrspace(M)* addrspacecast Y addrspace(M)* to Y addrspace(N)* Updat all affected tests and add several new tests in addrspacecast.ll. This patch is based on http://reviews.llvm.org/D2186 (authored by Matt Arsenault) with fixes and more tests. llvm-svn: 210375	2014-06-06 21:52:55 +00:00
Michael Zolotukhin	07ee478829	Fix typo in a test from r210342. llvm-svn: 210343	2014-06-06 15:49:47 +00:00
Michael Zolotukhin	1c51612d42	[SLP] Enable vectorization of GEP expressions. The use cases look like the following: x->a = y->a + 10 x->b = y->b + 12 llvm-svn: 210342	2014-06-06 15:34:24 +00:00
Dinesh Dwivedi	3217b6c661	Added select flavour for ABS and NEG(ABS) This patch can identify ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X and can transform ABS(ABS(X)) -> ABS(X) NABS(NABS(X)) -> NABS(X) Differential Revision: http://reviews.llvm.org/D3658 llvm-svn: 210312	2014-06-06 06:54:45 +00:00
Karthik Bhat	bf56d44cab	Fix PR19657 (scalar loads not combined into vector load) If we have common uses on separate paths in the tree; process the one with greater common depth first. This makes sure that we do not assume we need to extract a load when it is actually going to be part of a vectorized tree. Review: http://reviews.llvm.org/D3800 llvm-svn: 210310	2014-06-06 06:20:08 +00:00
Rafael Espindola	42a4c9f9e0	Allow aliases to be unnamed_addr. Alias with unnamed_addr were in a strange state. It is stored in GlobalValue, the language reference talks about "unnamed_addr aliases" but the verifier was rejecting them. It seems natural to allow unnamed_addr in aliases: * It is a property of how it is accessed, not of the data itself. * It is perfectly possible to write code that depends on the address of an alias. This patch then makes unname_addr legal for aliases. One side effect is that the syntax changes for a corner case: In globals, unnamed_addr is now printed before the address space. llvm-svn: 210302	2014-06-06 01:20:28 +00:00
Jingyue Wu	84465473e7	Fixed several correctness issues in SeparateConstOffsetFromGEP Most issues are on mishandling s/zext. Fixes: 1. When rebuilding new indices, s/zext should be distributed to sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5 but not sext(a + b) + 5. This also affects the logic of recursively looking for a constant offset, we need to include s/zext into the context of the searching. 2. Function find should return the bitwidth of the constant offset instead of always sign-extending it to i64. 3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP indices to pointer-size before computing the address. Therefore, gep base, zext(a + b) != gep base, a + b Improvements: 1. Add an optimization for splitting sext(a + b): if a + b is proven non-negative (e.g., used as an index of an inbound GEP) and one of a, b is non-negative, sext(a + b) = sext(a) + sext(b) 2. Function Distributable checks whether both sext and zext can be distributed to operands of a binary operator. This helps us split zext(sext(a + b)) to zext(sext(a) + zext(sext(b)) when a + b does not signed or unsigned overflow. Refactoring: Merge some common logic of handling add/sub/or in find. Testing: Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes we made. llvm-svn: 210291	2014-06-05 22:07:33 +00:00
Rafael Espindola	c286f4bf2a	Add a testcase where there is an overflow when combining two constants. I noticed that a proposed optimization would have prevented this. llvm-svn: 210287	2014-06-05 21:29:49 +00:00
Nick Lewycky	ff114dae5a	Fix coverage for files with global constructors again. Adds a testcase to the commit from r206671, as requested by David Blaikie. llvm-svn: 210239	2014-06-05 04:31:43 +00:00
Alexey Samsonov	ad81f0f419	Use AArch64 instead of now removed ARM64 in test configs llvm-svn: 210229	2014-06-05 00:25:30 +00:00
Rafael Espindola	04c2258624	InstCombine: Improvement to check if signed addition overflows. This patch implements two things: 1. If we know one number is positive and another is negative, we return true as signed addition of two opposite signed numbers will never overflow. 2. Implemented TODO : If one of the operands only has one non-zero bit, and if the other operand has a known-zero bit in a more significant place than it (not including the sign bit) the ripple may go up to and fill the zero, but won't change the sign. e.x - (x & ~4) + 1 We make sure that we are ignoring 0 at MSB. Patch by Suyog Sarda. llvm-svn: 210186	2014-06-04 15:39:14 +00:00
Nick Lewycky	5f53ddd0cc	Ignore line numbers on debug intrinsics. Add an assert to ensure that we aren't emitting line number zero, the .gcno format uses this to indicate that the next field is a filename. llvm-svn: 210068	2014-06-03 04:25:36 +00:00
Rafael Espindola	64c1e18033	Allow alias to point to an arbitrary ConstantExpr. This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is up to MC (or the system assembler) to decide if that expression is valid or not. This reduces our ability to diagnose invalid uses and how early we can spot them, but it also lets us do things like @test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32), i32 ptrtoint (i32* @bar to i32)) to i32) An important implication of this patch is that the notion of aliased global doesn't exist any more. The alias has to encode the information needed to access it in its metadata (linkage, visibility, type, etc). Another consequence to notice is that getSection has to return a "const char ". It could return a NullTerminatedStringRef if there was such a thing, but when that was proposed the decision was to just uses "const char*" for that. llvm-svn: 210062	2014-06-03 02:41:57 +00:00
Rafael Espindola	d1a2c2d905	Add back commit r210029. The code was actually correct. Sorry for the confusion. I have expanded the comment saying why the analysis is valid to avoid me misunderstaning it again in the future. llvm-svn: 210052	2014-06-02 22:01:04 +00:00
Rafael Espindola	80546be566	Convert test to FileCheck. llvm-svn: 210049	2014-06-02 21:23:54 +00:00
Rafael Espindola	582c890fbe	Revert "Add the nsw flag when we detect that an add will not signed overflow." This reverts commit r210029. It was not correctly handling cases where LHS and RHS had multiple but different sign bits. llvm-svn: 210048	2014-06-02 21:12:19 +00:00
Rafael Espindola	6b04ef785e	Added support to optimize comparisons with "lshr exact" of a constant. Patch by Rahul Jain. llvm-svn: 210040	2014-06-02 19:19:04 +00:00
Rafael Espindola	82899febf0	Add the nsw flag when we detect that an add will not signed overflow. We already had a function for checking this, we were just using it only in specialized cases. llvm-svn: 210029	2014-06-02 14:32:58 +00:00
Dinesh Dwivedi	ce5d35a9d0	Added inst combine tarnsform for (1 << X) & C pattrens where C is (some PowerOf2 - 1) This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt "((1 << X) & 7) == 0" ==> "X > 2" "((1 << X) & 7) != 0" ==> "X < 3". Differential Revision: http://reviews.llvm.org/D3678 llvm-svn: 210007	2014-06-02 07:57:24 +00:00
Dinesh Dwivedi	43e127bded	Added inst combine transforms for single bit tests from Chris's note if ((x & C) == 0) x \|= C becomes x \|= C if ((x & C) != 0) x ^= C becomes x &= ~C if ((x & C) == 0) x ^= C becomes x \|= C if ((x & C) != 0) x &= ~C becomes x &= ~C if ((x & C) == 0) x &= ~C becomes nothing Differential Revision: http://reviews.llvm.org/D3777 llvm-svn: 210006	2014-06-02 07:24:36 +00:00
Benjamin Kramer	4968944376	[Reassociate] Similar to "X + -X" -> "0", added code to handle "X + ~X" -> "-1". Handle "X + ~X" -> "-1" in the function Value Reassociate::OptimizeAdd(Instruction I, SmallVectorImpl<ValueEntry> &Ops); This patch implements: TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1". Patch by Rahul Jain! Differential Revision: http://reviews.llvm.org/D3835 llvm-svn: 209973	2014-05-31 15:01:54 +00:00
Matt Arsenault	c8fc08c31b	Make bitcast, extractelement, and insertelement considered cheap for speculation. This helps more branches into selects. On R600, vectors are cheap and anything that helps remove branches is very good. llvm-svn: 209914	2014-05-30 18:34:43 +00:00
Rafael Espindola	c323952cb4	PR19753: Optimize comparisons with "ashr exact" of a constanst. Patch by suyog sarda. llvm-svn: 209903	2014-05-30 15:54:32 +00:00
Tim Northover	b4ddc0845a	ARM & AArch64: make use of common cmpxchg idioms after expansion The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value. When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function. This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed. I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG. rdar://problem/16227836 llvm-svn: 209883	2014-05-30 10:09:59 +00:00
Karthik Bhat	5ab7795649	Allow vectorization of intrinsics such as powi,cttz and ctlz in Loop and SLP Vectorizer. This patch adds support to vectorize intrinsics such as powi, cttz and ctlz in Vectorizer. These intrinsics are different from other intrinsics as second argument to these function must be same in order to vectorize them and it should be represented as a scalar. Review: http://reviews.llvm.org/D3851#inline-32769 and http://reviews.llvm.org/D3937#inline-32857 llvm-svn: 209873	2014-05-30 04:31:24 +00:00
Nick Lewycky	59633cb478	When analyzing params/args for readnone/readonly, don't forget to consider that a pointer argument may be passed through a callsite to the return, and that we may need to analyze it. Fixes a bug reported on llvm-dev: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073098.html llvm-svn: 209870	2014-05-30 02:31:27 +00:00
Arnold Schwaighofer	e2067680a6	LoopVectorizer: Add a check that the backedge taken count + 1 does not overflow The loop vectorizer instantiates be-taken-count + 1 as the loop iteration count. If this expression overflows the generated code was invalid. In case of overflow the code now jumps to the scalar loop. Fixes PR17288. llvm-svn: 209854	2014-05-29 22:10:01 +00:00
Louis Gerbarg	c6b506a0ae	Add support for combining GEPs across PHI nodes Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ \|-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ \|-->PHI1-->GEP3 ==> \|-->PHI2->GEP12->GEP3 == > \|-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. llvm-svn: 209843	2014-05-29 20:29:47 +00:00
Rafael Espindola	a248f536b3	Revert "Revert "Revert "InstCombine: Improvement to check if signed addition overflows.""" This reverts commit r209776. It was miscompiling llvm::SelectionDAGISel::MorphNode. llvm-svn: 209817	2014-05-29 14:39:16 +00:00
Dinesh Dwivedi	d266cb1a0b	LCSSA should be performed on the outermost affected loop while unrolling loop. During loop-unroll, loop exits from the current loop may end up in in different outer loop. This requires to re-form LCSSA recursively for one level down from the outer most loop where loop exits are landed during unroll. This fixes PR18861. Differential Revision: http://reviews.llvm.org/D2976 llvm-svn: 209796	2014-05-29 06:47:23 +00:00
Michael J. Spencer	289067cc3d	Add LoadCombine pass. This pass is disabled by default. Use -combine-loads to enable in -O[1-3] Differential revision: http://reviews.llvm.org/D3580 llvm-svn: 209791	2014-05-29 01:55:07 +00:00
Rafael Espindola	6196b7430e	Revert "Revert "InstCombine: Improvement to check if signed addition overflows."" This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure llvm-svn: 209776	2014-05-28 21:43:52 +00:00
Rafael Espindola	910528a3eb	Revert "Add support for combining GEPs across PHI nodes" This reverts commit r209755. it was the real cause of the libc++ build failure. llvm-svn: 209775	2014-05-28 21:41:21 +00:00
Rafael Espindola	fb59b05ca4	Revert "InstCombine: Improvement to check if signed addition overflows." This reverts commit r209746. It looks it is causing a crash while building libcxx. I am trying to get a reduced testcase. llvm-svn: 209762	2014-05-28 18:48:10 +00:00
Louis Gerbarg	727f1cbb17	Add support for combining GEPs across PHI nodes Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ \|-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ \|-->PHI1-->GEP3 ==> \|-->PHI2->GEP12->GEP3 == > \|-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. llvm-svn: 209755	2014-05-28 17:38:31 +00:00
Rafael Espindola	085b57941f	InstCombine: Improvement to check if signed addition overflows. This patch implements two things: 1. If we know one number is positive and another is negative, we return true as signed addition of two opposite signed numbers will never overflow. 2. Implemented TODO : If one of the operands only has one non-zero bit, and if the other operand has a known-zero bit in a more significant place than it (not including the sign bit) the ripple may go up to and fill the zero, but won't change the sign. e.x - (x & ~4) + 1 We make sure that we are ignoring 0 at MSB. Patch by Suyog Sarda. llvm-svn: 209746	2014-05-28 15:30:40 +00:00
Arnaud A. de Grandmaison	de5ff26865	No need for those tests to go thru llvm-as and/or llvm-dis. opt can handle them by itself. llvm-svn: 209689	2014-05-27 22:03:28 +00:00
Jingyue Wu	76cbea6b6d	Fixed a test in r209670 The test was outdated with r209537. llvm-svn: 209671	2014-05-27 18:12:55 +00:00
Jingyue Wu	80a738dc62	Distribute sext/zext to the operands of and/or/xor This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can extract a constant offset from "s/zext and/or/xor A, B". Added a new test @ext_or to verify this enhancement. Refactoring the code, I also extracted some common logic to function Distributable. llvm-svn: 209670	2014-05-27 18:00:00 +00:00
Filipe Cabecinhas	e8d6a1e82f	Post-commit fixes for r209643 Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio Fixed the argument order to select (the mask semantics to blendv* are the inverse of select) and fixed the tests Added parenthesis to the assert condition Ran clang-format llvm-svn: 209667	2014-05-27 16:54:33 +00:00
Filipe Cabecinhas	82ac07c283	Convert some X86 blendv* intrinsics into IR. Summary: Implemented an InstCombine transformation that takes a blendv* intrinsic call and translates it into an IR select, if the mask is constant. This will eventually get lowered into blends with immediates if possible, or pblendvb (with an option to further optimize if we can transform the pblendvb into a blend+immediate instruction, depending on the selector). It will also enable optimizations by the IR passes, which give up on sight of the intrinsic. Both the transformation and the lowering of its result to asm got shiny new tests. The transformation is a bit convoluted because of blendvp[sd]'s definition: Its mask is a floating point value! This forces us to convert it and get the highest bit. I suppose this happened because the mask has type __m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin. I will send an email to llvm-dev to discuss if we want to change this or not. Reviewers: grosbach, delena, nadav Differential Revision: http://reviews.llvm.org/D3859 llvm-svn: 209643	2014-05-27 03:42:20 +00:00
Tim Northover	3b0846e8f7	AArch64/ARM64: move ARM64 into AArch64's place This commit starts with a "git mv ARM64 AArch64" and continues out from there, renaming the C++ classes, intrinsics, and other target-local objects for consistency. "ARM64" test directories are also moved, and tests that began their life in ARM64 use an arm64 triple, those from AArch64 use an aarch64 triple. Both should be equivalent though. This finishes the AArch64 merge, and everyone should feel free to continue committing as normal now. llvm-svn: 209577	2014-05-24 12:50:23 +00:00
Tim Northover	cc08e1fe1b	AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64. I'm doing this in two phases for a better "git blame" record. This commit removes the previous AArch64 backend and redirects all functionality to ARM64. It also deduplicates test-lines and removes orphaned AArch64 tests. The next step will be "git mv ARM64 AArch64" and rewire most of the tests. Hopefully LLVM is still functional, though it would be even better if no-one ever had to care because the rename happens straight afterwards. llvm-svn: 209576	2014-05-24 12:42:26 +00:00
Michael Zolotukhin	d4c724625a	Implement sext(C1 + C2X) --> sext(C1) + sext(C2X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. That helps SLP-vectorizer to recognize consecutive loads/stores. <rdar://problem/14860614> llvm-svn: 209568	2014-05-24 08:09:57 +00:00
Nico Rieck	bd15945ab2	Fix broken FileCheck prefixes llvm-svn: 209538	2014-05-23 19:06:24 +00:00
Jingyue Wu	bbb6e4a885	Add the extracted constant offset using GEP Fixed a TODO in r207783. Add the extracted constant offset using GEP instead of ugly ptrtoint+add+inttoptr. Using GEP simplifies future optimizations and makes IR easier to understand. Updated all affected tests, and added a new test in split-gep.ll to cover a corner case where emitting uglygep is necessary. llvm-svn: 209537	2014-05-23 18:39:40 +00:00
Justin Bogner	cbb8438bb3	ScalarEvolution: Fix handling of AddRecs in isKnownPredicate ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison when both the LHS and RHS are SCEVAddRecExprs. This checks that both LHS and RHS are guarded in the case when both are SCEVAddRecExprs. The test case is against indvars because I could not find a way to directly test SCEV. Patch by Sanjay Patel! llvm-svn: 209487	2014-05-23 00:06:56 +00:00
Diego Novillo	7f8af8bf91	Add support for missed and analysis optimization remarks. Summary: This adds two new diagnostics: -pass-remarks-missed and -pass-remarks-analysis. They take the same values as -pass-remarks but are intended to be triggered in different contexts. -pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed, which passes call when they tried to apply a transformation but couldn't. -pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis, which passes call when they want to inform the user about analysis results. The patch also: 1- Adds support in the inliner for the two new remarks and a test case. 2- Moves emitOptimizationRemark* functions to the llvm namespace. 3- Adds an LLVMContext argument instead of making them member functions of LLVMContext. Reviewers: qcolombet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3682 llvm-svn: 209442	2014-05-22 14:19:46 +00:00
Eli Bendersky	f13a05607c	Similar to bitcast, treat addrspacecast as a foldable operand. Added a test sink-addrspacecast.ll to verify this change. Patch by Jingyue Wu. llvm-svn: 209343	2014-05-22 00:02:52 +00:00
Nick Lewycky	ec373545b8	Teach isKnownNonNull that a nonnull return is not null. Add a test for this case as well as the case of a nonnull attribute (already handled but not tested). llvm-svn: 209193	2014-05-20 05:13:21 +00:00
Juergen Ributzka	431761771c	[ConstantHoisting][X86] Change the cost model to never hoist constants for types larger than i128. Currently the X86 backend doesn't support types larger than i128 very well. For example an i192 multiply will assert in codegen when the 2nd argument is a constant and the constant got hoisted. This fix changes the cost model to never hoist constants for types larger than i128. Once the codegen issues have been resolved, the cost model can be updated to allow also larger types. This is related to <rdar://problem/16954938> llvm-svn: 209162	2014-05-19 21:00:53 +00:00
Peter Collingbourne	68a889757d	Check the alwaysinline attribute on the call as well as on the caller. Differential Revision: http://reviews.llvm.org/D3815 llvm-svn: 209150	2014-05-19 18:25:54 +00:00
Benjamin Kramer	6dd790c617	Flip on vectorization of bswap intrinsics. The cost model conservatively assumes that it will always get scalarized and that's about as good as we can get with the generic TTI; reasoning whether a shuffle with an efficient lowering is available is hard. We can override that conservative estimate for some targets in the future. llvm-svn: 209125	2014-05-19 13:48:08 +00:00
Dinesh Dwivedi	f82f16e3e6	Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)' This removes TODO added in r208849 [http://reviews.llvm.org/D3629] MIN(MIN(A, 97), 23) -> MIN(A, 23) MAX(MAX(A, 23), 97) -> MAX(A, 97) Differential Revision: http://reviews.llvm.org/D3785 llvm-svn: 209110	2014-05-19 07:08:32 +00:00
NAKAMURA Takumi	7ef81a4f98	Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes" It broke clang selfhosting even after r209065. llvm-svn: 209067	2014-05-17 14:39:21 +00:00
Louis Gerbarg	8d2a43e9be	Add support for combining GEPs across PHI nodes Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ \|-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ \|-->PHI1-->GEP3 ==> \|-->PHI2->GEP12->GEP3 == > \|-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. rdar://15547484 llvm-svn: 209049	2014-05-16 23:47:24 +00:00
Reid Kleckner	fceb76f5f9	Add comdat key field to llvm.global_ctors and llvm.global_dtors This allows us to put dynamic initializers for weak data into the same comdat group as the data being initialized. This is necessary for MSVC ABI compatibility. Once we have comdats for guard variables, we can use the combination to help GlobalOpt fire more often for weak data with guarded initialization on other platforms. Reviewers: nlewycky Differential Revision: http://reviews.llvm.org/D3499 llvm-svn: 209015	2014-05-16 20:39:27 +00:00
Rafael Espindola	6b238633b7	Fix most of PR10367. This patch changes the design of GlobalAlias so that it doesn't take a ConstantExpr anymore. It now points directly to a GlobalObject, but its type is independent of the aliasee type. To avoid changing all alias related tests in this patches, I kept the common syntax @foo = alias i32* @bar to mean the same as now. The cases that used to use cast now use the more general syntax @foo = alias i16, i32* @bar. Note that GlobalAlias now behaves a bit more like GlobalVariable. We know that its type is always a pointer, so we omit the '*'. For the bitcode, a nice surprise is that we were writing both identical types already, so the format change is minimal. Auto upgrade is handled by looking through the casts and no new fields are needed for now. New bitcode will simply have different types for Alias and Aliasee. One last interesting point in the patch is that replaceAllUsesWith becomes smart enough to avoid putting a ConstantExpr in the aliasee. This seems better than checking and updating every caller. A followup patch will delete getAliasedGlobal now that it is redundant. Another patch will add support for an explicit offset. llvm-svn: 209007	2014-05-16 19:35:39 +00:00
David Majnemer	78910fc4da	InstSimplify: Improve handling of ashr/lshr Summary: Analyze the range of values produced by ashr/lshr cst, %V when it is being used in an icmp. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3774 llvm-svn: 209000	2014-05-16 17:14:03 +00:00
David Majnemer	ea8d5dbf24	InstSimplify: Optimize using dividend in sdiv Summary: The dividend in an sdiv tells us the largest and smallest possible results. Use this fact to optimize comparisons against an sdiv with a constant dividend. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3795 llvm-svn: 208999	2014-05-16 16:57:04 +00:00
Rafael Espindola	5a52b9f139	Revert "Implement global merge optimization for global variables." This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978	2014-05-16 13:02:18 +00:00
Jiangning Liu	932e1c3924	Implement global merge optimization for global variables. This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934	2014-05-15 23:45:42 +00:00
Reid Kleckner	900d46ff39	Don't insert lifetime.end markers between a musttail call and ret The allocas going out of scope are immediately killed by the return instruction. This is a resend of r208912, which was committed accidentally. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3792 llvm-svn: 208920	2014-05-15 21:10:46 +00:00

1 2 3 4 5 ...

4525 Commits