llvm-project

Commit Graph

Author	SHA1	Message	Date
Owen Anderson	a8d1c3e74e	Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did not properly handle the case where the predecessor block was the entry block to the function. The only in-tree client of this is JumpThreading, which worked around the issue in its own code. This patch moves the solution into the helper so that JumpThreading (and other clients) do not have to replicate the same fix everywhere. llvm-svn: 212875	2014-07-12 07:12:47 +00:00
Hal Finkel	511fea7acd	Feeding isSafeToSpeculativelyExecute its DataLayout pointer (in Sink) This is the one remaining place I see where passing isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for loads) -- I think I got the others in r212720. Most of the other remaining callers of isSafeToSpeculativelyExecute only use it for call sites (or otherwise exclude loads). llvm-svn: 212730	2014-07-10 16:07:11 +00:00
Hal Finkel	a995f92627	Feeding isSafeToSpeculativelyExecute its DataLayout pointer isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the past, this was mainly used to make better decisions regarding divisions known not to trap, and so was not all that important for users concerned with "cheap" instructions. However, now it also helps look through bitcasts for dereferencable loads, and will also be important if/when we add a dereferencable pointer attribute. This is some initial work to feed a DataLayout pointer through to callers of isSafeToSpeculativelyExecute, generally where one was already available. llvm-svn: 212720	2014-07-10 14:41:31 +00:00
Hal Finkel	2e42c34d05	Allow isDereferenceablePointer to look through some bitcasts isDereferenceablePointer should not give up upon encountering any bitcast. If we're casting from a pointer to a larger type to a pointer to a small type, we can continue by examining the bitcast's operand. This missing capability was noted in a comment in the function. In order for this to work, isDereferenceablePointer now takes an optional DataLayout pointer (essentially all callers already had such a pointer available). Most code uses isDereferenceablePointer though isSafeToSpeculativelyExecute (which already took an optional DataLayout pointer), and to enable the LICM test case, LICM needs to actually provide its DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously). llvm-svn: 212686	2014-07-10 05:27:53 +00:00
Rafael Espindola	adf21f2a56	Update the MemoryBuffer API to use ErrorOr. llvm-svn: 212405	2014-07-06 17:43:13 +00:00
Arnold Schwaighofer	ed988fb97d	GVN: Preserve invariant.load metadata If both instructions to be replaced are marked invariant the resulting instruction is invariant. rdar://13358910 Fix by Erik Eckstein! llvm-svn: 211801	2014-06-26 19:51:19 +00:00
Eli Bendersky	5d5e18da3e	Rename loop unrolling and loop vectorizer metadata to have a common prefix. [LLVM part] These patches rename the loop unrolling and loop vectorizer metadata such that they have a common 'llvm.loop.' prefix. Metadata name changes: llvm.vectorizer.* => llvm.loop.vectorizer.* llvm.loopunroll.* => llvm.loop.unroll.* This was a suggestion from an earlier review (http://reviews.llvm.org/D4090) which added the loop unrolling metadata. Patch by Mark Heffernan. llvm-svn: 211710	2014-06-25 15:41:00 +00:00
Evgeniy Stepanov	d99cca2c7a	Factor out part of LICM::sink into a helper function. llvm-svn: 211678	2014-06-25 09:17:21 +00:00
Evgeniy Stepanov	10280dac1d	[LICM] Don't create more than one copy of an instruction per loop exit block when sinking. Fixes exponential compilation complexity in PR19835, caused by LICM::sink not handling the following pattern well: f = op g e = op f, g d = op e c = op d, e b = op c a = op b, c When an instruction with N uses is sunk, each of its operands gets N new uses (all of them - phi nodes). In the example above, if a had 1 use, c would have 2, e would have 4, and g would have 8. llvm-svn: 211673	2014-06-25 07:54:58 +00:00
Dinesh Dwivedi	8bb5fb0661	Updated comments as suggested by Rafael. Thanks. llvm-svn: 211268	2014-06-19 14:11:53 +00:00
Dinesh Dwivedi	657105e582	Fixed jump threading going to infinite loop. This patch add code to remove unreachable blocks from function as they may cause jump threading to stuck in infinite loop. Differential Revision: http://reviews.llvm.org/D3991 llvm-svn: 211103	2014-06-17 14:34:19 +00:00
Duncan P. N. Exon Smith	73686d305a	SROA: Only split loads on byte boundaries r199771 accidently broke the logic that makes sure that SROA only splits load on byte boundaries. If such a split happens, some bits get lost when reassembling loads of wider types, causing data corruption. Move the width check up to reject such splits early, avoiding the corruption. Fixes PR19250. Patch by: Björn Steinbrink <bsteinbr@gmail.com> llvm-svn: 211082	2014-06-17 00:19:35 +00:00
Eli Bendersky	ff90324599	Teach LoopUnrollPass to respect loop unrolling hints in metadata. [This is resubmitting r210721, which was reverted due to suspected breakage which turned out to be unrelated]. Some extra review comments were addressed. See D4090 and D4147 for more details. The Clang change that produces this metadata was committed in r210667 Patch by Mark Heffernan. llvm-svn: 211076	2014-06-16 23:53:02 +00:00
Nick Lewycky	b06a796051	Remove extra whitespace in function declaration. No functionality change. llvm-svn: 210965	2014-06-14 03:48:29 +00:00
Jiangning Liu	96e92c1d75	Move GlobalMerge from Transform to CodeGen. This patch is to move GlobalMerge pass from Transform/Scalar to CodeGen, because GlobalMerge depends on TargetMachine. In the mean time, the macro INITIALIZE_TM_PASS is also moved to CodeGen/Passes.h. With this fix we can avoid making libScalarOpts depend on libCodeGen. llvm-svn: 210951	2014-06-13 22:57:59 +00:00
Tim Northover	6bf04e4512	SCCP: update for cmpxchg returning { iN, i1 } now. I accidentally missed this one since its use looked OK locally. llvm-svn: 210909	2014-06-13 14:54:09 +00:00
Tim Northover	420a216817	IR: add "cmpxchg weak" variant to support permitted failure. This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903	2014-06-13 14:24:07 +00:00
Rafael Espindola	db4ed0bdab	Remove 'using std::errro_code' from lib. llvm-svn: 210871	2014-06-13 02:24:39 +00:00
Rafael Espindola	3acea39853	Don't use 'using std::error_code' in include/llvm. This should make sure that most new uses use the std prefix. llvm-svn: 210835	2014-06-12 21:46:39 +00:00
Duncan P. N. Exon Smith	fd5c553f54	GVN: Enable value forwarding for calloc Enable value forwarding for loads from `calloc()` without an intervening store. This change extends GVN to handle the following case: %1 = tail call noalias i8* @calloc(i64 1, i64 4) %2 = bitcast i8* %1 to i32* ; This load is trivially constant zero %3 = load i32* %2, align 4 This is analogous to the handling for `malloc()` in the same places. `malloc()` returns `undef`; `calloc()` returns a zero value. Note that it is correct to return zero even for out of bounds GEPs since the result of such a GEP would be undefined. Patch by Philip Reames! llvm-svn: 210828	2014-06-12 21:16:19 +00:00
Eli Bendersky	dc6de2ce29	Revert r210721 as it causes breakage in internal builds (and possibly GDB). llvm-svn: 210807	2014-06-12 18:05:39 +00:00
Eli Bendersky	899bef099f	Teach LoopUnrollPass to respect loop unrolling hints in metadata. See http://reviews.llvm.org/D4090 for more details. The Clang change that produces this metadata was committed in r210667 Patch by Mark Heffernan. llvm-svn: 210721	2014-06-11 23:15:35 +00:00
Jiangning Liu	d623c528c5	Create macro INITIALIZE_TM_PASS. Pass initialization requires to initialize TargetMachine for back-end specific passes. This commit creates a new macro INITIALIZE_TM_PASS to simplify this kind of initialization. llvm-svn: 210641	2014-06-11 07:04:37 +00:00
Jiangning Liu	b2ae37fb67	Global merge for global symbols. This commit is to improve global merge pass and support global symbol merge. The global symbol merge is not enabled by default. For aarch64, we need some more back-end fix to make it really benifit ADRP CSE. llvm-svn: 210640	2014-06-11 06:44:53 +00:00
Jiangning Liu	3e5b855a51	Rename global-merge to enable-global-merge. llvm-svn: 210639	2014-06-11 06:35:26 +00:00
Eric Christopher	2af33756c7	We already have a reference to the TargetMachine, use that. llvm-svn: 210580	2014-06-10 20:39:39 +00:00
Jingyue Wu	5c7b1aed5d	[SeparateConstOffsetFromGEP] inbounds zext => sext for better splitting For each array index that is in the form of zext(a), convert it to sext(a) if we can prove zext(a) <= max signed value of typeof(a). The conversion helps to split zext(x + y) into sext(x) + sext(y). Reviewed in http://reviews.llvm.org/D4060 llvm-svn: 210444	2014-06-08 23:49:34 +00:00
Jingyue Wu	01ceeb190d	[SeparateConstOffsetFromGEP] Fix an illegitimate optimization on zext zext(a + b) != zext(a) + zext(b) even if a + b >= 0 && b >= 0. e.g., a = i4 0b1111, b = i4 0b0001 zext a + b to i8 = zext 0b0000 to i8 = 0b00000000 (zext a to i8) + (zext b to i8) = 0b00001111 + 0b00000001 = 0b00010000 llvm-svn: 210439	2014-06-08 20:19:38 +00:00
Jingyue Wu	48a5abeec0	Refactor canonicalizing array indices to a helper function No functionality changes. llvm-svn: 210438	2014-06-08 20:15:45 +00:00
Jingyue Wu	84465473e7	Fixed several correctness issues in SeparateConstOffsetFromGEP Most issues are on mishandling s/zext. Fixes: 1. When rebuilding new indices, s/zext should be distributed to sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5 but not sext(a + b) + 5. This also affects the logic of recursively looking for a constant offset, we need to include s/zext into the context of the searching. 2. Function find should return the bitwidth of the constant offset instead of always sign-extending it to i64. 3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP indices to pointer-size before computing the address. Therefore, gep base, zext(a + b) != gep base, a + b Improvements: 1. Add an optimization for splitting sext(a + b): if a + b is proven non-negative (e.g., used as an index of an inbound GEP) and one of a, b is non-negative, sext(a + b) = sext(a) + sext(b) 2. Function Distributable checks whether both sext and zext can be distributed to operands of a binary operator. This helps us split zext(sext(a + b)) to zext(sext(a) + zext(sext(b)) when a + b does not signed or unsigned overflow. Refactoring: Merge some common logic of handling add/sub/or in find. Testing: Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes we made. llvm-svn: 210291	2014-06-05 22:07:33 +00:00
Benjamin Kramer	4968944376	[Reassociate] Similar to "X + -X" -> "0", added code to handle "X + ~X" -> "-1". Handle "X + ~X" -> "-1" in the function Value Reassociate::OptimizeAdd(Instruction I, SmallVectorImpl<ValueEntry> &Ops); This patch implements: TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1". Patch by Rahul Jain! Differential Revision: http://reviews.llvm.org/D3835 llvm-svn: 209973	2014-05-31 15:01:54 +00:00
Michael J. Spencer	289067cc3d	Add LoadCombine pass. This pass is disabled by default. Use -combine-loads to enable in -O[1-3] Differential revision: http://reviews.llvm.org/D3580 llvm-svn: 209791	2014-05-29 01:55:07 +00:00
Jingyue Wu	80a738dc62	Distribute sext/zext to the operands of and/or/xor This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can extract a constant offset from "s/zext and/or/xor A, B". Added a new test @ext_or to verify this enhancement. Refactoring the code, I also extracted some common logic to function Distributable. llvm-svn: 209670	2014-05-27 18:00:00 +00:00
Owen Anderson	115aa160e6	Make the LoopRotate pass's maximum header size configurable both programmatically and via the command line, mirroring similar functionality in LoopUnroll. In situations where clients used custom unrolling thresholds, their intent could previously be foiled by LoopRotate having a hardcoded threshold. llvm-svn: 209617	2014-05-26 08:58:51 +00:00
Jingyue Wu	bbb6e4a885	Add the extracted constant offset using GEP Fixed a TODO in r207783. Add the extracted constant offset using GEP instead of ugly ptrtoint+add+inttoptr. Using GEP simplifies future optimizations and makes IR easier to understand. Updated all affected tests, and added a new test in split-gep.ll to cover a corner case where emitting uglygep is necessary. llvm-svn: 209537	2014-05-23 18:39:40 +00:00
Diego Novillo	7f8af8bf91	Add support for missed and analysis optimization remarks. Summary: This adds two new diagnostics: -pass-remarks-missed and -pass-remarks-analysis. They take the same values as -pass-remarks but are intended to be triggered in different contexts. -pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed, which passes call when they tried to apply a transformation but couldn't. -pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis, which passes call when they want to inform the user about analysis results. The patch also: 1- Adds support in the inliner for the two new remarks and a test case. 2- Moves emitOptimizationRemark* functions to the llvm namespace. 3- Adds an LLVMContext argument instead of making them member functions of LLVMContext. Reviewers: qcolombet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3682 llvm-svn: 209442	2014-05-22 14:19:46 +00:00
Quentin Colombet	c88baa5c10	[LSR] Canonicalize reg1 + ... + regN into reg1 + ... + 1*regN. This commit introduces a canonical representation for the formulae. Basically, as soon as a formula has more that one base register, the scaled register field is used for one of them. The register put into the scaled register is preferably a loop variant. The commit refactors how the formulae are built in order to produce such representation. This yields a more accurate, but still perfectible, cost model. <rdar://problem/16731508> llvm-svn: 209230	2014-05-20 19:25:04 +00:00
Matt Arsenault	04b67ceeeb	Use range for llvm-svn: 209147	2014-05-19 17:52:48 +00:00
Rafael Espindola	5a52b9f139	Revert "Implement global merge optimization for global variables." This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978	2014-05-16 13:02:18 +00:00
Jiangning Liu	932e1c3924	Implement global merge optimization for global variables. This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934	2014-05-15 23:45:42 +00:00
Alp Toker	beaca19c7c	Fix typos llvm-svn: 208839	2014-05-15 01:52:21 +00:00
Jay Foad	a0653a3e6c	Rename ComputeMaskedBits to computeKnownBits. "Masked" has been inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811	2014-05-14 21:14:37 +00:00
Benjamin Kramer	3f085ba6d6	GVN: Fix non-determinism in map iteration. Iterating over a DenseMaop is non-deterministic and results to unpredictable IR output. Based on a patch by Daniel Reynaud! llvm-svn: 208728	2014-05-13 21:06:40 +00:00
Benjamin Kramer	d97f95e2f2	GVN: rangify a couple of loops. No functionality change. llvm-svn: 208727	2014-05-13 21:06:36 +00:00
Nick Lewycky	54a824b758	Improve wording to make it sounds more like a change than an analysis. llvm-svn: 208370	2014-05-08 23:04:46 +00:00
Richard Smith	c45f3f7433	Simplify and fix incorrect comment. No functionality change. llvm-svn: 208272	2014-05-08 01:08:43 +00:00
Nick Lewycky	7185b5d60c	Detabify. llvm-svn: 208019	2014-05-06 00:46:20 +00:00
Nick Lewycky	5ef6bc8815	Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. The number of tail call to loop conversions remains the same (1618 by my count). The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly. llvm-svn: 208017	2014-05-05 23:59:03 +00:00
Benjamin Kramer	9130cb8547	LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to limit unrolling. Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin. llvm-svn: 207940	2014-05-04 19:12:38 +00:00
Akira Hatanaka	f76388dd7e	[GVN] Pass the phi-translated address of a load instead of the untranslated address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant. <rdar://problem/16638765>. llvm-svn: 207853	2014-05-02 17:59:17 +00:00

1 2 3 4 5 ...

6066 Commits