llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	8aa7abe2ae	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows to fold more that one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> llvm-svn: 183021	2013-05-31 17:20:29 +00:00
Michael J. Spencer	df1ecbd734	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Shuxin Yang	1d8d7e4d38	[GVN] Split critical-edge on the fly, instead of postpone edge-splitting to next iteration. This on step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532	2013-05-09 18:34:27 +00:00
Nick Lewycky	5fb1963f2a	Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397	2013-05-08 09:00:10 +00:00
Andrew Trick	9c72b071fe	Rotate multi-exit loops even if the latch was simplified. Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly back luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satify the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already has a sufficient guard. The artifical SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analysis like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230	2013-05-06 17:58:18 +00:00
Shuxin Yang	637b9bebd4	Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No function change. This function consists of following steps: 1. Collect dependent memory accesses. 2. Analyze availability. 3. Perform fully redundancy elimination, or 4. Perform PRE, depending on the availability Step 2, 3 and 4 are now moved to three helper routines. llvm-svn: 181047	2013-05-03 19:17:26 +00:00
Shuxin Yang	af2c3ddf0d	[GV] Remove dead code which is really difficult to decipher. Actually it took me couple of hours trying to make sense of them and only to find they are dead code. I guess the original author used "allSingleSucc" to indicate if there are any critial edge emanating from some blocks, and tried to perform code motion (actually speculation) in the presence of these critical edges; but later on he/she changed mind and decided to perform edge-splitting first. llvm-svn: 180951	2013-05-02 21:14:31 +00:00
Filip Pizlo	dec20e43c0	This patch breaks up Wrap.h so that it does not have to include all of the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881	2013-05-01 20:59:00 +00:00
Nadav Rotem	1e211913b5	SROA: Generate selects instead of shuffles when blending values because this is the cannonical form. Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often. llvm-svn: 180875	2013-05-01 19:53:30 +00:00
Shuxin Yang	04a4fd43aa	Fix a XOR reassociation bug. When Reassociator optimize "(x \| C1)" ^ "(X & C2)", it may swap the two subexpressions, however, it forgot to swap cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676	2013-04-27 18:02:12 +00:00
Eric Christopher	04d4e9312c	Move C++ code out of the C headers and into either C++ headers or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063	2013-04-22 22:47:22 +00:00
Rafael Espindola	74f2e46eef	Clarify that llvm.used can contain aliases. Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019	2013-04-22 14:58:02 +00:00
Benjamin Kramer	0212dc27ed	SROA: Don't crash on a select with two identical operands. This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982	2013-04-21 17:48:39 +00:00
Chris Lattner	8cf09416ea	Fix a comment, PR15777. llvm-svn: 179775	2013-04-18 17:42:14 +00:00
Jim Grosbach	0f38c1e3a7	Fix a typo in comment. llvm-svn: 179542	2013-04-15 17:40:48 +00:00
Shuxin Yang	331f01dcb4	Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. I brazenly think this change is slightly simpler than r178793 because: - no "state" in functor - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" While I can reproduce the probelm in Valgrind, it is rather difficult to come up a standalone testing case. The reason is that when an iterator is invalidated, the stale invalidated elements are not yet clobbered by nonsense data, so the optimizer can still proceed successfully. Thank Benjamin for fixing this bug and generously providing the test case. llvm-svn: 179062	2013-04-08 22:00:43 +00:00
Chandler Carruth	0e8a52d18f	Fix PR15674 (and PR15603): a SROA think-o. The fix for PR14972 in r177055 introduced a real think-o in the store side, likely because I was much more focused on the load side. While we can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily widen a value to be stored, as that changes the width of memory access! Lock down the code path in the store rewriting which would do this to only handle the intended circumstance. All of the existing tests continue to pass, and I've added a test from the PR. llvm-svn: 178974	2013-04-07 11:47:54 +00:00
Shuxin Yang	95adf5258f	Disable the optimization about promoting vector-element-access with symbolic index. This optimization is unstable at this moment; it 1) block us on a very important application 2) PR15200 3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK command compare the output against wrong result) I personally believe this optimization should not have any impact on the autovectorized code, as auto-vectorizer is supposed to put gather/scatter in a "right" way. Although in theory downstream optimizaters might reveal some gather/scatter optimization opportunities, the chance is quite slim. For the hand-crafted vectorizing code, in term of redundancy elimination, load-CSE, copy-propagation and DSE can collectively achieve the same result, but in much simpler way. On the other hand, these optimizers are able to improve the code in a incremental way; in contrast, SROA is sort of all-or-none approach. However, SROA might slighly win in stack size, as it tries to figure out a stretch of memory tightenly cover the area accessed by the dynamic index. rdar://13174884 PR15200 llvm-svn: 178912	2013-04-05 21:07:08 +00:00
Benjamin Kramer	dd67654af6	Reassociate: Avoid iterator invalidation. OpndPtrs stored pointers into the Opnd vector that became invalid when the vector grows. Store indices instead. Sadly I only have a large testcase that only triggers under valgrind, so I didn't include it. llvm-svn: 178793	2013-04-04 21:15:42 +00:00
Shuxin Yang	6662fd0f15	Correct assertion condition llvm-svn: 178484	2013-04-01 18:13:05 +00:00
Shuxin Yang	7b0c94e207	Implement XOR reassociation. It is based on following rules: rule 1: (x \| c1) ^ c2 => (x & ~c1) ^ (c1^c2), only useful when c1=c2 rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2)) rule 3: (x \| c1) ^ (x \| c2) = (x & c3) ^ c3 where c3 = c1 ^ c2 rule 4: (x \| c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2 It reduces an application's size (in terms of # of instructions) by 8.9%. Reviwed by Pete Cooper. Thanks a lot! rdar://13212115 llvm-svn: 178409	2013-03-30 02:15:01 +00:00
Jakub Staszak	4f9d1e85d0	Minor cleanups. No functionality change. llvm-svn: 177837	2013-03-24 09:56:28 +00:00
Jakub Staszak	f6df1e3def	Use dyn_cast instead of isa && cast. No functionality change. llvm-svn: 177836	2013-03-24 09:25:47 +00:00
Chandler Carruth	34f0c7fcaf	[SROA] Prefix names using a custom IRBuilder inserter. The key part of this is ensuring that name prefixes remain in a Twine form until we get to a point where we can nuke them under NDEBUG. This is tricky using the old APIs as they played fast and loose with Twine, which is prone to serious error. The inserter is much cleaner as it is actually in the call stack leading to the setName call, and so has a good opportunity to prepend the prefix. This matters more than you might imagine because most runs over an alloca find a single partition, and rewrite 3 or 4 instructions referring to it. As a consequence doing this lazily and exclusively with Twine allows the optimizer to delete more of it and shaves another 2% to 3% off of the release build's SROA run time for PR15412. I also think the APIs are cleaner, and the use of Twine is more reliable, so I consider it a win-win despite the churn required to reach this state. llvm-svn: 177631	2013-03-21 09:52:18 +00:00
Meador Inge	cf691565ed	simplify-libcalls: Removed unused variable The 'Modified' variable should have been removed from SimplifyLibCalls in r177619, but was missed. This commit removes it. llvm-svn: 177622	2013-03-21 02:44:07 +00:00
Meador Inge	6b6a161ccf	Move library call prototype attribute inference to functionattrs The simplify-libcalls pass implemented a doInitialization hook to infer function prototype attributes for well-known functions. Given that the simplify-libcalls pass is going away and that the functionattrs pass is already in place to deduce function attributes, I am moving this logic to the functionattrs pass. This approach was discussed during patch review: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html. llvm-svn: 177619	2013-03-21 00:55:59 +00:00
Chandler Carruth	0fad17527b	Fix a silly search-and-replace goof with r177495 that only broke non-release builds. llvm-svn: 177498	2013-03-20 07:40:56 +00:00
Chandler Carruth	d177f86124	[SROA] Don't preserve the IR names in release builds. This is espcially important because the new SROA pass goes to great lengths to provide helpful names for debugging, and as a consequence they can become very slow to render. Good for between 5% and 15% of the SROA runtime on some slow test cases such as the one in PR15412. llvm-svn: 177495	2013-03-20 07:30:36 +00:00
Chandler Carruth	0941b66283	Move the endif to the correct line so we don't have warnings about unused statistics variables. llvm-svn: 177494	2013-03-20 06:47:00 +00:00
Chandler Carruth	5f5b616344	Introduce some new statistics to help track the exact behavior of the new SROA pass. llvm-svn: 177493	2013-03-20 06:30:46 +00:00
Quentin Colombet	2393cb92b8	Update global merge pass according to Duncan's advices: - Remove useless includes - Change misleading comments - Move code into doFinalization llvm-svn: 177445	2013-03-19 21:46:49 +00:00
Arnaud A. de Grandmaison	87c473f0d1	IndVarSimplify: do not recompute an IV value outside of the loop if : - it is trivially known to be used inside the loop in a way that can not be optimized away - there is no use outside of the loop which can take advantage of the computation hoisting llvm-svn: 177432	2013-03-19 20:00:22 +00:00
Andrew Trick	f3a2544dba	Revert "Cleanup some SCEV logic a bit." This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32. Just add a comment instead! llvm-svn: 177377	2013-03-19 05:10:27 +00:00
Andrew Trick	de78866594	Cleanup some SCEV logic a bit. Make the code more obvious to scan-build and humans. llvm-svn: 177375	2013-03-19 04:14:59 +00:00
Andrew Trick	a1c01ba8c7	Tighten up an internal LSR API that should check for NULL. No test case, but should fix a scan_build warning. llvm-svn: 177374	2013-03-19 04:14:57 +00:00
Jakub Staszak	bc421efddf	Make method private. Keep coding standard. llvm-svn: 177348	2013-03-18 23:31:30 +00:00
Quentin Colombet	8fc340976d	Extend global merge pass to optionally consider global constant variables. Also add some checks to not merge globals used within landing pad instructions or marked as "used". llvm-svn: 177331	2013-03-18 22:30:07 +00:00
Chandler Carruth	f74654d274	Mark internal classes as POD-like to get better behavior out of SmallVector and DenseMap. This speeds up SROA by 25% on PR15412. llvm-svn: 177259	2013-03-18 08:36:46 +00:00
Chandler Carruth	a1c54bbe34	PR14972: SROA vs. GVN exposed a really bad bug in SROA. The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire. This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken. Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;] There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a use of a partition is split across multiple partitions. Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities. My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing many common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here. So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously. All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work. llvm-svn: 177055	2013-03-14 11:32:24 +00:00
Dan Gohman	00253592c7	Change the order of the operands in patchAndReplaceAllUsesWith so that they're more consistent with Value::replaceAllUsesWith. llvm-svn: 176872	2013-03-12 16:22:56 +00:00
Jakub Staszak	fd56611b49	Keep coding stanard. llvm-svn: 176661	2013-03-07 22:20:06 +00:00
Jakub Staszak	db4579d796	Don't create IRBuilder if we can return from the method earlier. llvm-svn: 176660	2013-03-07 22:10:33 +00:00
Preston Gurd	485296d1e8	Bypass Slow Divides * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442	2013-03-04 18:13:57 +00:00
Benjamin Kramer	ee40b9a2d4	CVP: If we have a PHI with an incoming select, try to skip the select. This is a common pattern with dyn_cast and similar constructs, when the PHI no longer depends on the select it can often be turned into a simpler construct or even get hoisted out of the loop. PR15340. llvm-svn: 175995	2013-02-24 15:34:43 +00:00
Bill Wendling	09bd1f71ee	Implement the NoBuiltin attribute. The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835	2013-02-22 00:12:35 +00:00
Chad Rosier	9b7f9c3e9e	Remove dead code and whitespace. llvm-svn: 175804	2013-02-21 21:40:51 +00:00
Chad Rosier	4d87d45a05	Update a comment that looks to have been accidentally deleted many moons ago. llvm-svn: 175658	2013-02-20 20:15:55 +00:00
Jakub Staszak	ae2fd9c97d	Remove unused variable. llvm-svn: 175568	2013-02-19 22:17:58 +00:00
Jakub Staszak	3c6583a1b1	Minor cleanups. No functionality change. llvm-svn: 175567	2013-02-19 22:14:45 +00:00
Jakub Staszak	90fbe91c58	Remove unneeded #includes. llvm-svn: 175565	2013-02-19 22:06:38 +00:00

1 2 3 4 5 ...

5740 Commits