llvm-project

Commit Graph

Author	SHA1	Message	Date
David Majnemer	29130c5e8d	IndVarSimplify: check if loop invariant expansion can trap IndVarSimplify is willing to move divide instructions outside of their loop bodies if they are invariant of the loop. However, it may not be safe to expand them if we do not know if they can trap. Instead, check to see if it is not safe to expand the instruction and skip the expansion. This fixes PR16041. Testcase by Rafael Ávila de Espíndola. llvm-svn: 183239	2013-06-04 17:51:58 +00:00
Rafael Espindola	a5e536ab0e	Second part of PR16069. The problem this time seems to be a thinko. We were assuming that in the CFG with edges A->B, B->C and A->C, speculating the basic block B would cause only the phi value for the B->C edge to be speculated. That is not true: the phis are semantically in the edges, so if the A->B->C path is taken, any code needed for A->C is not executed, and we have to consider it too when deciding to speculate B. llvm-svn: 183226	2013-06-04 14:11:59 +00:00
Hans Wennborg	5cf30be6e4	Typo: s/caes/cases/ in SimplifyCFG llvm-svn: 183219	2013-06-04 11:22:30 +00:00
Nick Lewycky	688d668e5c	Delete dead safety check. llvm-svn: 183167	2013-06-03 23:15:20 +00:00
David Majnemer	c82f27af2a	SimplifyCFG: Do not transform PHI to select if doing so would be unsafe PR16069 is an interesting case where an incoming value to a PHI is a trap value while also being a 'ConstantExpr'. We do not consider this case when performing the 'HoistThenElseCodeToIf' optimization. Instead, make our modifications more conservative if we detect that we cannot transform the PHI to a select. llvm-svn: 183152	2013-06-03 20:43:12 +00:00
David Majnemer	8e7dd2f628	SimplifyCFG: Small cleanup, use ICmpInst::isEquality() llvm-svn: 183151	2013-06-03 20:39:50 +00:00
Kostya Serebryany	9e62b301e6	[asan] ASan Linux MIPS32 support (llvm part), patch by Jyun-Yan Y llvm-svn: 183104	2013-06-03 14:46:56 +00:00
Nick Lewycky	3f715e260a	When determining the new index for an insertelement, we may not assume that an index greater than the size of the vector is invalid. The shuffle may be shrinking the size of the vector. Fixes a crash! Also drop the maximum recursion depth of the safety check for this optimization to five. llvm-svn: 183080	2013-06-01 20:51:31 +00:00
David Majnemer	91142c485e	SimplifyCFG: Fix typo in comment for ComputeSpeculationCost llvm-svn: 183078	2013-06-01 19:43:23 +00:00
Benjamin Kramer	7c275640e7	Move getRealLinkageName to a common place and remove all the duplicates of it. Also simplify code a bit while there. No functionality change. llvm-svn: 183076	2013-06-01 17:51:14 +00:00
Arnold Schwaighofer	7b1b4db35e	LoopVectorize: Change API call to get the backedge taken count Use ScalarEvolution's getBackedgeTakenCount API instead of getExitCount since that is really what we want to know. Using the more specific getExitCount was safe because we made sure that there is only one exiting block. No functionality change. llvm-svn: 183047	2013-05-31 21:48:56 +00:00
Quentin Colombet	bf490d4a32	Loop Strength Reduce: Scaling factor cost. Account for the cost of scaling factor in Loop Strength Reduce when rating the formulae. This uses a target hook. The default implementation of the hook is: if the addressing mode is legal, the scaling factor is free. <rdar://problem/13806271> llvm-svn: 183045	2013-05-31 21:29:03 +00:00
Arnold Schwaighofer	70a9be5297	LoopVectorize: PHIs with only outside users should prevent vectorization We check that instructions in the loop don't have outside users (except if they are reduction values). Unfortunately, we skipped this check for if-convertible PHIs. Fixes PR16184. llvm-svn: 183035	2013-05-31 19:53:50 +00:00
Quentin Colombet	8aa7abe2ae	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows folding more than one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> llvm-svn: 183021	2013-05-31 17:20:29 +00:00
Rafael Espindola	65281bf36e	Simplify multiplications by vectors whose elements are powers of 2. Patch by Andrea Di Biagio. llvm-svn: 183005	2013-05-31 14:27:15 +00:00
Evgeniy Stepanov	888385e40f	[msan] Handle mixed track-origins and keep-going settings (llvm part). Before this change, each module defined a weak_odr global __msan_track_origins with a value of 1 if origin tracking is enabled, 0 if disabled. If there are modules with different values, any of them may win. If 0 wins, and there is at least one module with 1, the program will most likely crash. With this change, __msan_track_origins is only emitted if origin tracking is on. Then runtime library detects if there is at least one module with origin tracking, and enables runtime support for it. llvm-svn: 182997	2013-05-31 12:04:29 +00:00
Nick Lewycky	a2b7720618	Reapply r182909 with a fix to the calculation of the new indices for insertelement instructions. llvm-svn: 182976	2013-05-31 00:59:42 +00:00
Evgeniy Stepanov	2c14269883	Revert r182909. PR/16177 llvm-svn: 182919	2013-05-30 09:40:17 +00:00
Nick Lewycky	d7f27094c0	Swizzle vector inputs if it helps us eliminate shuffles. llvm-svn: 182909	2013-05-30 04:33:38 +00:00
NAKAMURA Takumi	d11b42aaad	LoopVectorize.cpp: Fix abuse of StringRef on Twine. Twine captures the pointer of StringRef. llvm-svn: 182820	2013-05-29 03:13:47 +00:00
NAKAMURA Takumi	d57ea87080	Whitespace. llvm-svn: 182819	2013-05-29 03:13:41 +00:00
Paul Redmond	5fdf836ba4	Add support for llvm.vectorizer metadata - llvm.loop.parallel metadata has been renamed to llvm.loop to be more generic by making the root of additional loop metadata. - Loop::isAnnotatedParallel now looks for llvm.loop and associated llvm.mem.parallel_loop_access - document llvm.loop and update llvm.mem.parallel_loop_access - add support for llvm.vectorizer.width and llvm.vectorizer.unroll - document llvm.vectorizer.* metadata - add utility class LoopVectorizerHints for getting/setting loop metadata - use llvm.vectorizer.width=1 to indicate already vectorized instead of already_vectorized - update existing tests that used llvm.loop.parallel and llvm.vectorizer.already_vectorized Reviewed by: Nadav Rotem llvm-svn: 182802	2013-05-28 20:00:34 +00:00
James Molloy	f6f121e277	Extend RemapInstruction and friends to take an optional new parameter, a ValueMaterializer. Extend LinkModules to pass a ValueMaterializer to RemapInstruction and friends to lazily create Functions for lazily linked globals. This is a big win when linking small modules with large (mostly unused) library modules. llvm-svn: 182776	2013-05-28 15:17:05 +00:00
Evgeniy Stepanov	fca012334b	[msan] Fix argument shadow alignment. llvm-svn: 182771	2013-05-28 13:07:43 +00:00
Michael J. Spencer	df1ecbd734	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Michael Gottesman	e67f40c514	[objc-arc] KnownSafe does not imply that it is safe to perform code motion across CFG edges since even if it is safe to remove RR pairs, we may still be able to move a retain/release into a loop. rdar://13949644 llvm-svn: 182670	2013-05-24 20:44:05 +00:00
Michael Gottesman	5a91bbf33a	[objc-arc] Make sure that multiple owners is propagated correctly through the pass via the usage of a global data structure. rdar://13750319 llvm-svn: 182669	2013-05-24 20:44:02 +00:00
Benjamin Kramer	6ac1e62377	LoopVectorize: LoopSimplify can't canonicalize loops with an indirectbr in it, don't assert on those cases. Fixes PR16139. llvm-svn: 182656	2013-05-24 18:05:35 +00:00
Joey Gouly	b34294d0e4	Run clang-format over the scalarizePHI function. llvm-svn: 182640	2013-05-24 12:33:28 +00:00
Joey Gouly	83699284be	scalarizePHI needs to insert the next ExtractElement in the same block as the BinaryOperator, not in the block where the IRBuilder is currently inserting into. Fixes a bug where scalarizePHI would create instructions that would not dominate all uses. llvm-svn: 182639	2013-05-24 12:29:54 +00:00
Daniel Malea	fddddbeab0	Re-implement DebugIR in a way that does not subclass AssemblyWriter: - move AsmWriter.h from public headers into lib - marked all AssemblyWriter functions as non-virtual; no need to override them - DebugIR now "plugs into" AssemblyWriter with an AssemblyAnnotationWriter helper - exposed flags to control hiding of a) debug metadata b) debug intrinsic calls C/R: Paul Redmond llvm-svn: 182617	2013-05-23 22:34:33 +00:00
Benjamin Kramer	ad5c24f161	More symbols that should be static. llvm-svn: 182590	2013-05-23 16:09:15 +00:00
Michael Gottesman	740db977f6	[objc-arc] Fixed number of prefixing slashes in some comments in a function from 3 to 2 to match the rest of ObjCARCOpts. llvm-svn: 182557	2013-05-23 02:35:21 +00:00
Nadav Rotem	9e00eb38a2	SLPVectorizer: Change the order in which new instructions are added to the function. We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leafs). * Improved the numbering API. * Changed the placement of new instructions to the last root. * Fixed a bug with external tree users with non-zero lane. * Fixed a bug in the placement of in-tree users. llvm-svn: 182508	2013-05-22 19:47:32 +00:00
Jean-Luc Duprat	0dda6f168c	This is an update to a previous commit (r181216). The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499	2013-05-22 18:29:31 +00:00
Arnold Schwaighofer	12b0d1cda0	LoopVectorize: Make Value pointers that could be RAUW'ed a VH The Value pointers we store in the induction variable list can be RAUW'ed by a call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing in some other places where we store pointers that could potentially be RAUW'ed. Fixes PR16073. llvm-svn: 182485	2013-05-22 16:54:56 +00:00
Evgeniy Stepanov	ebd7f8e7ef	[msan] A no-op implementation of VarArg handling. This stuff is used on platforms where MSan does not have a proper VarArg implementation (anything other than x86_64 at the moment). llvm-svn: 182375	2013-05-21 12:27:47 +00:00
Bill Wendling	5f4740390e	Remove unused #include. llvm-svn: 182315	2013-05-20 20:59:12 +00:00
Hal Finkel	a969df84ab	Rename LoopSimplify.h to LoopUtils.h As discussed, LoopUtils.h is a better name. llvm-svn: 182314	2013-05-20 20:46:30 +00:00
Hal Finkel	a12d82b421	Expose InsertPreheaderForLoop from LoopSimplify to other passes Other passes, PPC counter-loop formation for example, also need to add loop preheaders outside of the regular loop simplification pass. This makes InsertPreheaderForLoop a global function so that it can be used by other passes. No functionality change intended. llvm-svn: 182299	2013-05-20 16:47:07 +00:00
Arnold Schwaighofer	693a1ca628	LoopVectorize: Handle single edge PHIs We might encounter single edge PHIs - handle them with an identity select. Fixes PR15990. llvm-svn: 182199	2013-05-18 18:38:34 +00:00
Matt Arsenault	52ddb7bcdd	Add missing -*- C++ -*- to headers llvm-svn: 182164	2013-05-17 21:43:39 +00:00
Benjamin Kramer	d84a63398e	LoopVectorize: Simplify code. No functionality change. llvm-svn: 182100	2013-05-17 14:48:17 +00:00
Evgeniy Stepanov	1e7643243d	[msan] Switch TLS globals to initial-exec model. They are always defined in the main executable. llvm-svn: 181994	2013-05-16 09:14:05 +00:00
Arnold Schwaighofer	88e7fddc8c	LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvert We only want to check this once, not for every conditional block in the loop. No functionality change (except that we don't perform a check redundantly anymore). llvm-svn: 181942	2013-05-15 22:38:14 +00:00
Michael Gottesman	b4e7f4d841	[objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods. llvm-svn: 181901	2013-05-15 17:43:03 +00:00
Arnold Schwaighofer	09cee97270	LoopVectorize: Fix comments No functionality change. llvm-svn: 181862	2013-05-15 02:02:45 +00:00
Arnold Schwaighofer	2d920477a4	LoopVectorize: Hoist conditional loads if possible InstCombine can be uncooperative to vectorization and sink loads into conditional blocks. This prevents vectorization. Undo this optimization if there are unconditional memory accesses to the same addresses in the loop. radar://13815763 llvm-svn: 181860	2013-05-15 01:44:30 +00:00
Sylvestre Ledru	149e281aa8	Fix two typos llvm-svn: 181848	2013-05-14 23:36:24 +00:00
Manman Ren	b3c52fb45b	GlobalOpt: fix an issue where CXAAtExitFn points to a deleted function. CXAAtExitFn was set outside a loop and before optimizations where functions can be deleted. This patch will set CXAAtExitFn inside the loop and after optimizations. Seg fault when running LTO because of accesses to a deleted function. rdar://problem/13838828 llvm-svn: 181838	2013-05-14 21:52:44 +00:00
Michael Gottesman	0c8b562851	Removed trailing whitespace. llvm-svn: 181760	2013-05-14 06:40:10 +00:00
Arnold Schwaighofer	2e7a922a15	LoopVectorize: Handle loops with multiple forward inductions We used to give up if we saw two integer inductions. After this patch, we base further induction variables on the chosen one like we do in the reverse induction and pointer induction case. Fixes PR15720. radar://13851975 llvm-svn: 181746	2013-05-14 00:21:18 +00:00
Michael Gottesman	f3f9e3b10a	[objc-arc-opts] Added debug statements when we set and unset whether a pointer is known positive. llvm-svn: 181745	2013-05-14 00:08:09 +00:00
Michael Gottesman	a76143eeee	[objc-arc-opts] In the presence of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or. In the presence of a block being initialized, the frontend will emit the objc_retain on the original pointer and the release on the pointer loaded from the alloca. The optimizer will through the provenance analysis realize that the two are related (albeit different), but since we only require KnownSafe in one direction, will match the inner retain on the original pointer with the guard release on the original pointer. This is fixed by ensuring that in the presence of allocas we only unconditionally remove pointers if both our retain and our release are KnownSafe (i.e. we are KnownSafe in both directions) since we must deal with the possibility that the frontend will emit what (to the optimizer) appears to be unbalanced retain/releases. An example of the miscompile is: %A = alloca retain(%x) retain(%x) <--- Inner Retain store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) release(%x) <--- Guarding Release getting optimized to: %A = alloca retain(%x) store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) rdar://13750319 llvm-svn: 181743	2013-05-13 23:49:42 +00:00
Matt Beaumont-Gay	e55d9492e3	Move a couple more statistics inside '#ifndef NDEBUG'. Suppresses an unused-variable warning in -Asserts builds. llvm-svn: 181733	2013-05-13 21:10:49 +00:00
Michael Gottesman	993fbf704a	[objc-arc-opts] Add comment to BBState making it clear that get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg. llvm-svn: 181726	2013-05-13 19:40:39 +00:00
Michael Gottesman	9fc50b82a4	[objc-arc] Move the before optimization statistics gathering phase out of OptimizeIndividualCalls. This makes the statistics gathering completely independent of the actual optimization occurring, preventing any sort of bleeding over from occurring. Additionally, it simplifies a switch statement in the non-statistic gathering case. llvm-svn: 181719	2013-05-13 18:29:07 +00:00
Duncan Sands	0480b9b54e	Suppress GCC compiler warnings in release builds about variables that are only read in asserts. llvm-svn: 181689	2013-05-13 07:50:47 +00:00
Nadav Rotem	33dcf0a70f	SLPVectorizer: Swap LHS and RHS. No functionality change. llvm-svn: 181684	2013-05-13 05:13:13 +00:00
Nadav Rotem	ce42cc6d4d	SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users. The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract. llvm-svn: 181674	2013-05-12 22:58:45 +00:00
Nadav Rotem	cbf6d24d50	SLPVectorizer: Clear the map that maps between scalars to vectors after each round of vectorization. Testcase in the next commit. llvm-svn: 181673	2013-05-12 22:55:57 +00:00
David Majnemer	6c30f49af3	InstCombine: Flip the order of two urem transforms There are two transforms in visitUrem that conflict with each other. One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the zero extension. llvm-svn: 181668	2013-05-12 00:07:05 +00:00
Arnold Schwaighofer	f2305e4467	LoopVectorize: Use the widest induction variable type Use the widest induction type encountered for the canonical induction variable. We used to turn the following loop into an empty loop because we used i8 as induction variable type and truncated 1024 to 0 as trip count. int a[1024]; void fail() { int reverse_induction = 1023; unsigned char forward_induction = 0; while ((reverse_induction) >= 0) { forward_induction++; a[reverse_induction] = forward_induction; --reverse_induction; } } radar://13862901 llvm-svn: 181667	2013-05-11 23:04:28 +00:00
Arnold Schwaighofer	a544fefa32	LoopVectorize: Use variable instead of repeated function call No functionality change intended. llvm-svn: 181666	2013-05-11 23:04:26 +00:00
Arnold Schwaighofer	1ba84df437	LoopVectorize: Use IRBuilder interface in more places No functionality change intended. llvm-svn: 181665	2013-05-11 23:04:24 +00:00
David Majnemer	470b077bca	InstCombine: Turn urem to bitwise-and more often Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661	2013-05-11 09:01:28 +00:00
Nadav Rotem	cdfb48d2fe	SLPVectorizer: Add support for trees with external users. For example: bar() { int a = A[i]; int b = A[i+1]; B[i] = a; B[i+1] = b; foo(a); <--- a is used outside the vectorized expression. } llvm-svn: 181648	2013-05-10 22:59:33 +00:00
Nadav Rotem	0686e5cb05	Add a debug print llvm-svn: 181647	2013-05-10 22:56:18 +00:00
Benjamin Kramer	14e915f7b4	InstCombine: Don't claim to be able to evaluate any shl in a zexted type. The shift amount may be larger than the type leading to undefined behavior. Limit the transform to constant shift amounts. While there update the bits to clear in the result which may enable additional optimizations. PR15959. llvm-svn: 181604	2013-05-10 16:26:37 +00:00
Benjamin Kramer	a6645e8b8f	InstCombine: Verify the type before transforming uitofp into select. PR15952. llvm-svn: 181586	2013-05-10 09:16:52 +00:00
Dmitri Gribenko	9bf66a5fd0	Fix a documentation warning: \bried -> \brief llvm-svn: 181551	2013-05-09 21:16:18 +00:00
Shuxin Yang	1d8d7e4d38	[GVN] Split critical edges on the fly, instead of postponing edge-splitting to the next iteration. This is one step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speed up GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532	2013-05-09 18:34:27 +00:00
Rafael Espindola	007521673b	Don't replace an alias in llvm.used with its target. When we replace an internal alias with its target, be careful not to replace the entry in llvm.used (and llvm.compiler_used). llvm-svn: 181524	2013-05-09 17:22:59 +00:00
Benjamin Kramer	21b972ae94	InstCombine: Don't just copy known bits from the first operand of an srem. That's obviously wrong. Conservatively restrict it to the sign bit, which matches the original intention of this analysis. Fixes PR15940. llvm-svn: 181518	2013-05-09 16:32:32 +00:00
Arnold Schwaighofer	2e8c69cf97	LoopVectorizer: Don't assert on the absence of induction variables A computable loop exit count does not imply the presence of an induction variable. Scalar evolution can return a value for an infinite loop. Fixes PR15926. llvm-svn: 181495	2013-05-09 00:32:18 +00:00
Daniel Malea	3c5bed1670	Add DebugIR pass -- emits IR file and replace source lines with IR lines in MD - requires existing debug information to be present - fixes up file name and line number information in metadata - emits a "<orig_filename>-debug.ll" succinct IR file (without !dbg metadata or debug intrinsics) that can be read by a debugger - initialize pass in opt tool to enable the "-debug-ir" flag - lit tests to follow llvm-svn: 181467	2013-05-08 20:44:14 +00:00
Nick Lewycky	5fb1963f2a	Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397	2013-05-08 09:00:10 +00:00
Arnold Schwaighofer	3610139ac5	LoopVectorizer: Improve reduction variable identification The two nested loops were confusing and also conservative in identifying reduction variables. This patch replaces them by a worklist based approach. llvm-svn: 181369	2013-05-07 21:55:37 +00:00
Arnold Schwaighofer	e78b76fbed	LoopVectorize: getConsecutiveVector must respect signed arithmetic We were passing an i32 to ConstantInt::get where an i64 was needed and we must also pass the sign if we pass negative numbers. The start index passed to getConsecutiveVector must also be signed. Should fix PR15882. llvm-svn: 181286	2013-05-07 04:37:05 +00:00
David Majnemer	70f286d95f	InstCombine: (X ^ signbit) + C -> X + (signbit ^ C) llvm-svn: 181249	2013-05-06 21:21:31 +00:00
Andrew Trick	9c72b071fe	Rotate multi-exit loops even if the latch was simplified. Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly bad luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satisfy the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already have a sufficient guard. The artificial SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analyses like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230	2013-05-06 17:58:18 +00:00
Jean-Luc Duprat	3e4fc3ef24	Provide InstCombines for the following 3 cases: A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A(1 - uitofp i1 C) + B(uitofp i1 C) -> select C, A, B llvm-svn: 181216	2013-05-06 16:55:50 +00:00
Nadav Rotem	632b25b743	Update the comment to mention that we use TTI. llvm-svn: 181178	2013-05-06 03:06:36 +00:00
Nadav Rotem	c70ef4e93c	Revert r164763 because it introduces new shuffles. Thanks Nick Lewycky for pointing this out. llvm-svn: 181177	2013-05-06 02:39:09 +00:00
Rafael Espindola	c229a4fff4	Fix const merging when an alias of a const is llvm.used. We used to disable constant merging not only if a constant is llvm.used, but also if an alias of a constant is llvm.used. This change fixes that. llvm-svn: 181175	2013-05-06 01:48:55 +00:00
Benjamin Kramer	3e3f2a4b8d	LoopVectorize: Print values instead of pointers in debug output. llvm-svn: 181157	2013-05-05 14:54:52 +00:00
Arnold Schwaighofer	d96e427eac	LoopVectorize: Add support for floating point min/max reductions Add support for min/max reductions when "no-nans-float-math" is enabled. This allows us to assume we have ordered floating point math and treat ordered and unordered predicates equally. radar://13723044 llvm-svn: 181144	2013-05-05 01:54:48 +00:00
Arnold Schwaighofer	f5183729db	LoopVectorizer: Cleanup of minimum/maximum pattern match code No need for setting the operands. The pointers are going to be bound by the matcher. radar://13723044 llvm-svn: 181142	2013-05-05 01:54:44 +00:00
Arnold Schwaighofer	a670a0a3aa	LoopVectorize: We don't need an identity element for min/max reductions We can just use the initial element that feeds the reduction. max(max(x, y), z) == max(max(x,y), max(x,z)) radar://13723044 llvm-svn: 181141	2013-05-05 01:54:42 +00:00
Dmitri Gribenko	3238fb7595	Add ArrayRef constructor from None, and do the cleanups that this constructor enables Patch by Robert Wilhelm. llvm-svn: 181138	2013-05-05 00:40:33 +00:00
Nick Lewycky	881e9d62e2	Tabs to spaces. No functionality change. llvm-svn: 181082	2013-05-04 01:08:15 +00:00
Shuxin Yang	637b9bebd4	Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No functionality change. This function consists of the following steps: 1. Collect dependent memory accesses. 2. Analyze availability. 3. Perform full redundancy elimination, or 4. Perform PRE, depending on the availability. Steps 2, 3 and 4 are now moved to three helper routines. llvm-svn: 181047	2013-05-03 19:17:26 +00:00
Nadav Rotem	4ce060b3da	LoopVectorizer: Add support for if-conversion of PHINodes with 3+ incoming values. By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements. We can now vectorize this loop: int foo(int A, int B, int n) { for (int i=0; i < n; i++) { int x = 9; if (A[i] > B[i]) { if (A[i] > 19) { x = 3; } else if (B[i] < 4 ) { x = 4; } else { x = 5; } } A[i] = x; } } llvm-svn: 181037	2013-05-03 17:42:55 +00:00
Shuxin Yang	af2c3ddf0d	[GV] Remove dead code which is really difficult to decipher. Actually it took me a couple of hours trying to make sense of it, only to find it is dead code. I guess the original author used "allSingleSucc" to indicate if there are any critical edges emanating from some blocks, and tried to perform code motion (actually speculation) in the presence of these critical edges; but later on he/she changed mind and decided to perform edge-splitting first. llvm-svn: 180951	2013-05-02 21:14:31 +00:00
Filip Pizlo	dec20e43c0	This patch breaks up Wrap.h so that it does not have to include all of the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881	2013-05-01 20:59:00 +00:00
Nadav Rotem	1e211913b5	SROA: Generate selects instead of shuffles when blending values because this is the canonical form. Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often. llvm-svn: 180875	2013-05-01 19:53:30 +00:00
Jim Grosbach	d11584a7f7	Revert "InstCombine: Fold more shuffles of shuffles." This reverts commit r180802 There's ongoing discussion about whether this is the right place to make this transformation. Reverting for now while we figure it out. llvm-svn: 180834	2013-05-01 00:25:27 +00:00
Richard Trieu	624c2ebcbb	Fix a use after free. RI is freed before the call to getDebugLoc(). To prevent this, capture the location before RI is freed. llvm-svn: 180824	2013-04-30 22:45:10 +00:00
Nadav Rotem	9feda6071a	Fix a typo llvm-svn: 180806	2013-04-30 21:04:51 +00:00
Jim Grosbach	0b914fe839	InstCombine: Fold more shuffles of shuffles. Always fold a shuffle-of-shuffle into a single shuffle when there's only one input vector in the first place. Continue to be more conservative when there's multiple inputs. rdar://13402653 PR15866 llvm-svn: 180802	2013-04-30 20:43:52 +00:00
Adrian Prantl	8beccf9e6d	Spelling. Thanks, Eric. llvm-svn: 180794	2013-04-30 17:33:32 +00:00
Adrian Prantl	0941638a1b	Set debug locations for branch instructions created during inlining, even if the inlined function has multiple returns. rdar://problem/12415623 llvm-svn: 180793	2013-04-30 17:08:16 +00:00
David Majnemer	d73f37bb83	Fix a bug in foldSelectICmpAndOr. Differences in bitwidth between X and Y could exist even if C1 and C2 have the same Log2 representation. llvm-svn: 180779	2013-04-30 10:36:33 +00:00
David Majnemer	8d048d0482	Fix "Combine bit test + conditional or into simple math" This fixes the optimization introduced in r179748 and reverted in r179750. While the optimization was sound, it did not properly respect differences in bit-width. llvm-svn: 180777	2013-04-30 08:57:58 +00:00
Arnold Schwaighofer	474df6d3ed	SimplifyCFG: If convert single conditional stores This resurrects r179957, but adds code that makes sure we don't touch atomic/volatile stores: This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. llvm-svn: 180731	2013-04-29 21:28:24 +00:00
Michael Gottesman	03cf3c8966	Add in some conditional compilation in order to silence an unused variable warning. llvm-svn: 180700	2013-04-29 07:29:08 +00:00
Michael Gottesman	214ca90f8e	[objc-arc] Apply the RV optimization to retains next to calls in ObjCARCContract instead of ObjCARCOpts. Turning retains into retainRV calls disrupts the data flow analysis in ObjCARCOpts. Thus we move it as late as we can by moving it into ObjCARCContract. We leave in the conversion from retainRV -> retain in ObjCARCOpt since it enables the dataflow analysis. rdar://10813093 llvm-svn: 180698	2013-04-29 06:53:53 +00:00
Michael Gottesman	9c11815978	Added statistics to count the number of retains/releases before/after optimization. llvm-svn: 180697	2013-04-29 06:16:57 +00:00
Michael Gottesman	8005ad3f3e	Removed trailing whitespace. llvm-svn: 180696	2013-04-29 06:16:55 +00:00
Michael Gottesman	3e3977c49f	Fix for r180693. = /. llvm-svn: 180694	2013-04-29 05:25:39 +00:00
Michael Gottesman	a87bb8f50b	[objc-arc-annotations] Moved the disabling of call movement to ConnectTDBUTraversals so that I can prevent Changed = true from being set. This prevents an infinite loop. llvm-svn: 180693	2013-04-29 05:13:13 +00:00
Shuxin Yang	04a4fd43aa	Fix an XOR reassociation bug. When Reassociator optimizes "(x | C1)" ^ "(X & C2)", it may swap the two subexpressions; however, it forgot to swap the cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676	2013-04-27 18:02:12 +00:00
Adrian Prantl	d00333a4b2	Fix a typo that, due to cut&paste, quadrupled itself. rdar://problem/13056109 llvm-svn: 180618	2013-04-26 18:10:50 +00:00
Adrian Prantl	29b9de7bf1	Bugfix for the debug intrinsic handling in InstCombiner: Since we can't guarantee that the original dbg.declare intrinsic is removed by LowerDbgDeclare(), we need to make sure that we are not inserting the same dbg.value intrinsic over and over. This removes tons of redundant DIEs when compiling optimized code. rdar://problem/13056109 llvm-svn: 180615	2013-04-26 17:48:33 +00:00
Nadav Rotem	13306816fc	LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes. llvm-svn: 180593	2013-04-26 05:08:59 +00:00
Michael Gottesman	47cf8a4c12	Revert "[objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that were once autoreleaseRV instructions." This reverts commit r180222. I think this might tie in with a different problem which will potentially require a different approach. I am reverting this in case I need to go down that second path. My apologies for the noise. = /. llvm-svn: 180590	2013-04-26 01:12:18 +00:00
Nadav Rotem	f43cbeee15	LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers. llvm-svn: 180570	2013-04-25 19:55:03 +00:00
Michael Gottesman	fdb497a9b2	[objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that were once autoreleaseRV instructions. Due to the semantics of ARC, we must be extremely conservative with autorelease calls inserted by the frontend since ARC guarantees that said object will be in the autorelease pool after that point, an optimization invariant that the optimizer must respect. On the other hand, we are allowed significantly more flexibility with autoreleaseRV instructions. Often, though, this flexibility is disrupted by early transformations which transform objc_autoreleaseRV => objc_autorelease if said instruction is no longer being used as part of an RV pair (generally due to inlining). Since we cannot tell the difference between an autorelease put into place by the frontend and one created through said "strength reduction", we cannot perform these optimizations. The addition of this set gets around said issues by allowing us to differentiate between said two cases. rdar://problem/13697741. llvm-svn: 180222	2013-04-24 22:18:18 +00:00
Michael Gottesman	cd5b02701c	Fixed comment typo. llvm-svn: 180221	2013-04-24 22:18:15 +00:00
Arnold Schwaighofer	3fa801fbc2	LoopVectorizer: Change variable name Stride to ConsecutiveStride This makes it easier to read the code. No functionality change. llvm-svn: 180197	2013-04-24 16:16:03 +00:00
Arnold Schwaighofer	a6578f7056	LoopVectorize: Scalarize padded types This patch disables memory-instruction vectorization for types that need padding bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on x86_64. Because the load/store vectorization is performed by bit casting to a packed vector, which has an incompatible memory layout due to the lack of padding bytes, the present vectorizer produces inconsistent results for memory instructions of those types. This patch checks that the AllocSize of a scalar type equals the allocated size of each vector element, to ensure that there are no padding bytes and the array can be read/written using vector operations. Patch by Daisuke Takahashi! Fixes PR15758. llvm-svn: 180196	2013-04-24 16:16:01 +00:00
Arnold Schwaighofer	23a0589bce	LoopVectorizer: Bail out if we don't have datalayout; we need it. llvm-svn: 180195	2013-04-24 16:15:58 +00:00
Adrian Prantl	15db52bf6d	Make sure the instruction right after an inlined function has a debug location. This solves a problem where range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140	2013-04-23 19:56:03 +00:00
Nadav Rotem	71c9d6d333	LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order. This fixes a miscompilation in FreeBSD's regex library. llvm-svn: 180121	2013-04-23 17:12:42 +00:00
Pekka Jaaskelainen	d3c90e132a	Call the potentially costly isAnnotatedParallel() only once. Made the uniform write test's checks a bit stricter. llvm-svn: 180119	2013-04-23 16:44:43 +00:00
Pekka Jaaskelainen	6f2f66b63f	Refuse to (even try to) vectorize loops which have uniform writes, even if erroneously annotated with the parallel loop metadata. Fixes Bug 15794: "Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata" llvm-svn: 180081	2013-04-23 08:08:51 +00:00
Eric Christopher	04d4e9312c	Move C++ code out of the C headers and into either C++ headers or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063	2013-04-22 22:47:22 +00:00
Anat Shemer	10260a75e3	Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users. llvm-svn: 180045	2013-04-22 20:51:10 +00:00
Rafael Espindola	74f2e46eef	Clarify that llvm.used can contain aliases. Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019	2013-04-22 14:58:02 +00:00
Benjamin Kramer	0212dc27ed	SROA: Don't crash on a select with two identical operands. This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982	2013-04-21 17:48:39 +00:00
Arnold Schwaighofer	6eb32b31bd	Revert "SimplifyCFG: If convert single conditional stores" There is the temptation to make this transform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X; may-alias with a[i] load; if (cond) a[i] = Y; into an unconditional store: a[i] = X; may-alias with a[i] load; tmp = cond ? Y : X; a[i] = tmp. We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up." llvm-svn: 179980	2013-04-21 13:09:04 +00:00
Nadav Rotem	c57af326a4	SLPVectorize: Add support for vectorization of casts. llvm-svn: 179975	2013-04-21 08:05:59 +00:00
Nadav Rotem	98ad5f0f4c	SLPVectorizer: Fix a bug in the code that scans the tree in search of nodes with multiple users. We did not terminate the switch case and we executed the search routine twice. llvm-svn: 179974	2013-04-21 07:37:56 +00:00
Michael Gottesman	3eab2e43d2	When we strength reduce an objc_retainBlock call to objc_retain, increment NumPeeps and make sure that Changed is set to true. llvm-svn: 179968	2013-04-21 00:50:27 +00:00
Michael Gottesman	1e43004295	Fixed comment typo. llvm-svn: 179967	2013-04-21 00:44:46 +00:00
Michael Gottesman	df110ac9ec	[objc-arc] Fixed typo in debug message. llvm-svn: 179966	2013-04-21 00:30:50 +00:00
Michael Gottesman	cdb7c15ce8	[objc-arc] Fixed comment typo. llvm-svn: 179965	2013-04-21 00:25:04 +00:00
Michael Gottesman	fb9ece9a7c	[objc-arc] Refactored OptimizeReturns so that it uses continue instead of a large multi-level nested if statement. llvm-svn: 179964	2013-04-21 00:25:01 +00:00
Michael Gottesman	01338a442a	[objc-arc] Added debug statement saying when we are resetting a sequence's progress. This will make it clearer when we are actually resetting a sequence's progress vs just changing state. This is an important distinction because the former case clears any pointers that we are tracking while the latter does not. llvm-svn: 179963	2013-04-20 23:36:57 +00:00
Nadav Rotem	8aca44a623	Fix PR15800. Do not try to vectorize vectors and structs. llvm-svn: 179960	2013-04-20 22:29:43 +00:00
Arnold Schwaighofer	3546ccf465	SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X; may-alias with a[i] load; if (cond) a[i] = Y; into an unconditional store: a[i] = X; may-alias with a[i] load; tmp = cond ? Y : X; a[i] = tmp. We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up. llvm-svn: 179957	2013-04-20 21:42:09 +00:00
Benjamin Kramer	519b2e3087	VecUtils: Clean up uses of dyn_cast. llvm-svn: 179936	2013-04-20 10:36:17 +00:00
Benjamin Kramer	4600bcc337	SLPVectorizer: Strength reduce SmallVectors to ArrayRefs. Avoids a couple of copies and allows more flexibility in the clients. llvm-svn: 179935	2013-04-20 09:49:10 +00:00
Nadav Rotem	ce2660d639	SLPVectorizer: Reduce the compile time by eliminating the search for some of the more expensive patterns. After this change we will only check basic arithmetic trees that start at cmpinstr. llvm-svn: 179933	2013-04-20 07:29:34 +00:00
Nadav Rotem	998e035cae	refactor tryToVectorizePair to a new method that supports vectorization of lists. llvm-svn: 179932	2013-04-20 07:22:58 +00:00
Nadav Rotem	890387289e	Fix an unused variable warning. llvm-svn: 179931	2013-04-20 06:40:28 +00:00
Nadav Rotem	83c7c41bc2	SLPVectorizer: Improve the cost model for loop invariant broadcast values. llvm-svn: 179930	2013-04-20 06:13:47 +00:00
Nadav Rotem	dfe1c93ca4	Report the number of stores that were found in the debug message. llvm-svn: 179929	2013-04-20 05:23:11 +00:00
Nadav Rotem	dfd8fcbb00	Fix the header comment. llvm-svn: 179928	2013-04-20 05:18:51 +00:00
Nadav Rotem	5ed99674e9	Use 64bit arithmetic for calculating distance between pointers. llvm-svn: 179927	2013-04-20 05:17:47 +00:00

10442 Commits