llvm-project

Commit Graph

Author	SHA1	Message	Date
Yuchen Wu	5947c8fa99	BlockFrequencyInfo: Readded getEntryFreq. llvm-svn: 197839	2013-12-20 22:11:11 +00:00
Michael Gottesman	fb9164f0d2	[block-freq] Teach branch probability how to return the edge weight in between a BasicBlock and one of its successors. IMHO At some point BasicBlock should be refactored along the lines of MachineBasicBlock so that successors/weights are actually embedded within the block. Now is not that time though. llvm-svn: 197303	2013-12-14 02:24:25 +00:00
Matt Arsenault	d3ee7af2f4	Teach MemoryBuiltins about address spaces llvm-svn: 197292	2013-12-14 00:27:48 +00:00
Michael Gottesman	b0c1ed8f4c	[block-freq] Update BlockFrequencyInfo/MachineBlockFrequencyInfo to use the new print methods. llvm-svn: 197289	2013-12-14 00:25:42 +00:00
Michael Gottesman	fd5c4b2c09	[block-freq] Add the equivalent methods to MachineBlockFrequencyInfo and BlockFrequencyInfo that were added to BlockFrequencyImpl in r197285 and r197284. llvm-svn: 197287	2013-12-14 00:06:03 +00:00
Chandler Carruth	37d25de459	[inliner] Fix PR18206 by preventing inlining functions that call setjmp through an invoke instruction. The original patch for this was written by Mark Seaborn, but I've reworked his test case into the existing returns_twice test case and implemented the fix by the prior refactoring to actually run the cost analysis over invoke instructions, and then here fixing our detection of the returns_twice attribute to work for both calls and invokes. We never noticed because we never saw an invoke. =[ llvm-svn: 197216	2013-12-13 08:00:01 +00:00
Chandler Carruth	0814d2adf0	[inliner] Completely change (and fix) how the inline cost analysis handles terminator instructions. The inline cost analysis inheritted some pretty rough handling of terminator insts from the original cost analysis, and then made it much, much worse by factoring all of the important analyses into a separate instruction visitor. That instruction visitor never visited the terminator. This works fine for things like conditional branches, but for many other things we simply computed The Wrong Value. First example are unconditional branches, which should be free but were counted as full cost. This is most significant for conditional branches where the condition simplifies and folds during inlining. We paid a 1 instruction tax on every branch in a straight line specialized path. =[ Oh, we also claimed that the unreachable instruction had cost. But it gets worse. Let's consider invoke. We never applied the call penalty. We never accounted for the cost of the arguments. Nope. Worse still, we didn't handle the correctness constraints of not inlining recursive invokes, or exception throwing returns_twice functions. Oops. See PR18206. Sadly, PR18206 requires yet another fix, but this refactoring is at least a huge step in that direction. llvm-svn: 197215	2013-12-13 07:59:56 +00:00
Chandler Carruth	cb5beb347a	[cleanup] Remove trailing whitespace before I start changing this file. llvm-svn: 197149	2013-12-12 11:59:26 +00:00
Jakub Staszak	3ab283c157	Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces overall time of LLVM compilation by ~1%. llvm-svn: 196667	2013-12-07 21:20:17 +00:00
Alp Toker	f907b891da	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Eric Christopher	67c0bfeae8	Fix typo. llvm-svn: 196434	2013-12-04 23:55:09 +00:00
Chandler Carruth	6378cf539f	[PM] Split the CallGraph out from the ModulePass which creates the CallGraph. This makes the CallGraph a totally generic analysis object that is the container for the graph data structure and the primary interface for querying and manipulating it. The pass logic is separated into its own class. For compatibility reasons, the pass provides wrapper methods for most of the methods on CallGraph -- they all just forward. This will allow the new pass manager infrastructure to provide its own analysis pass that constructs the same CallGraph object and makes it available. The idea is that in the new pass manager, the analysis pass's 'run' method returns a concrete analysis 'result'. Here, that result is a 'CallGraph'. The 'run' method will typically do only minimal work, deferring much of the work into the implementation of the result object in order to be lazy about computing things, but when (like DomTree) there is some up-front computation, the analysis does it prior to handing the result back to the querying pass. I know some of this is fairly ugly. I'm happy to change it around if folks can suggest a cleaner interim state, but there is going to be some amount of unavoidable ugliness during the transition period. The good thing is that this is very limited and will naturally go away when the old pass infrastructure goes away. It won't hang around to bother us later. Next up is the initial new-PM-style call graph analysis. =] llvm-svn: 195722	2013-11-26 04:19:30 +00:00
Chandler Carruth	878b55372a	[PM] Reformat some code with clang-format as I'm going to be editting as part of generalizing the call graph infrastructure for the new pass manager. llvm-svn: 195718	2013-11-26 03:45:26 +00:00
Chandler Carruth	9a398f453d	[PM] Rename the 'Mod' member to the more idiomatic 'M'. No functionality changed. llvm-svn: 195701	2013-11-26 00:37:23 +00:00
Kostya Serebryany	0b458286e1	Don't speculate loads under ThreadSanitizer Summary: Don't speculate loads under ThreadSanitizer. This fixes https://code.google.com/p/thread-sanitizer/issues/detail?id=40 Also discussed here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-November/067929.html Reviewers: chandlerc Reviewed By: chandlerc CC: llvm-commits, dvyukov Differential Revision: http://llvm-reviews.chandlerc.com/D2227 llvm-svn: 195324	2013-11-21 07:29:28 +00:00
Paul Robinson	dcbe35bad5	The 'optnone' attribute means don't inline anything into this function (except functions marked always_inline). Functions with 'optnone' must also have 'noinline' so they don't get inlined into any other function. Based on work by Andrea Di Biagio. llvm-svn: 195046	2013-11-18 21:44:03 +00:00
Benjamin Kramer	5f2768c377	Annotate APInt methods where it's not clear whether they are in place with warn_unused_result. Fix ScalarEvolution bugs uncovered by this. llvm-svn: 194928	2013-11-16 16:25:41 +00:00
Matt Arsenault	a8fe22baba	Use correct size for address space in BasicAA. The tests just hit this with a different sized address space since I haven't figured out how to use this to break it. I thought I committed this a long time ago, and I'm not sure why missing this hasn't caused any problems. llvm-svn: 194903	2013-11-16 00:36:43 +00:00
Matt Arsenault	b03bd4d96b	Add addrspacecast instruction. Patch by Michele Scandale! llvm-svn: 194760	2013-11-15 01:34:59 +00:00
Michael Gottesman	fd8aee76eb	Added BlockFrequencyInfo::view for displaying the block frequency propagation graph via graphviz. This is useful for debugging issues in the BlockFrequency implementation since one can easily visualize where probability mass and other errors occur in the propagation. llvm-svn: 194654	2013-11-14 02:27:46 +00:00
Yunzhong Gao	5cbcf56a7e	Fixing a heisenbug where the memory dependence analysis behaves differently with and without -g. Adding a test case to make sure that the threshold used in the memory dependence analysis is respected. The test case also checks that debug intrinsics are not counted towards this threshold. Differential Revision: http://llvm-reviews.chandlerc.com/D2141 llvm-svn: 194646	2013-11-14 01:10:52 +00:00
Michael Gottesman	857d6e3dd1	Fixed 80+ violations. llvm-svn: 194634	2013-11-14 00:05:07 +00:00
Sebastian Pop	7ee147246f	add more comments around the delinearization of arrays llvm-svn: 194612	2013-11-13 22:37:58 +00:00
Jakub Staszak	9dca4b3eeb	Simplify code. No functionality change. llvm-svn: 194602	2013-11-13 20:18:38 +00:00
Benjamin Kramer	9e501ec239	Move Delinearization pass into an anonymous namespace. llvm-svn: 194582	2013-11-13 15:35:17 +00:00
Sebastian Pop	c62c679c1b	delinearization of arrays llvm-svn: 194527	2013-11-12 22:47:20 +00:00
Wan Xiaofei	b2c8cdc766	Change data structure to memorize computed result in ScalarEvolution Replace std::map with SmallVector to memorize the cached result since SCEV usually belongs to little Loop/BB Linear scan on SmallVector is faster than std::map. Code reviewer : Andrew Trick. Test result : Pass Unit Test & LLVM Test Suite 401.bzip2 0.425721 0.419981 101.37% 403.gcc 24.53855 24.2667 101.12% 429.mcf 0.060847 0.059944 101.51% 433.milc 0.646009 0.636119 101.55% 444.namd 1.383928 1.370614 100.97% 445.gobmk 5.836575 5.800225 100.63% 450.soplex 1.911257 1.895963 100.81% 456.hmmer 1.039565 1.032534 100.68% 458.sjeng 0.897401 0.885567 101.34% 464.h264ref 3.645908 3.577991 101.90% 470.lbm 0.049456 0.048398 102.19% 471.omnetpp 5.638575 5.60435 100.61% bitmnp01 0.045738 0.045291 100.99% cjpegv2data 0.304359 0.302833 100.50% idctrn01 0.046433 0.045763 101.46% quake2 4.534416 4.4952 100.87% quake 2.688566 2.659208 101.10% xcsoar 12.42545 12.30385 100.99% linpack 0.038739 0.03803 101.86% matrix01 0.053564 0.0528 101.45% nbench 0.402867 0.395803 101.78% tblook01 0.021265 0.021015 101.19% ttsprk01 0.066384 0.065566 101.25% llvm-svn: 194459	2013-11-12 09:40:41 +00:00
Matt Arsenault	b12f2f3b60	Use size function instead of manually calculating it. llvm-svn: 194345	2013-11-10 03:18:50 +00:00
Chandler Carruth	7caea41545	Move the old pass manager infrastructure into a legacy namespace and give the files a legacy prefix in the right directory. Use forwarding headers in the old locations to paper over the name change for most clients during the transitional period. No functionality changed here! This is just clearing some space to reduce renaming churn later on with a new system. Even when the new stuff starts to go in, it is going to be hidden behind a flag and off-by-default as it is still WIP and under development. This patch is specifically designed so that very little out-of-tree code has to change. I'm going to work as hard as I can to keep that the case. Only direct forward declarations of the PassManager class are impacted by this change. llvm-svn: 194324	2013-11-09 12:26:54 +00:00
Andrew Trick	34e2f0c4ea	Rewrite SCEV's backedge taken count computation. Patch by Michele Scandale! Rewrite of the functions used to compute the backedge taken count of a loop on LT and GT comparisons. I decided to split the handling of LT and GT cases becasue the trick "a > b == -a < -b" in some cases prevents the trip count computation due to the multiplication by -1 on the two operands of the comparison. This issue comes from the conservative computation of value range of SCEVs: taking the negative SCEV of an expression that have a small positive range (e.g. [0,31]), we would have a SCEV with a fullset as value range. Indeed, in the new rewritten function I tried to better handle the maximum backedge taken count computation when MAX/MIN expression are used to handle the cases where no entry guard is found. Some test have been modified in order to check the new value correctly (I manually check them and reasoning on possible overflow the new values seem correct). I finally added a new test case related to the multiplication by -1 issue on GT comparisons. llvm-svn: 194116	2013-11-06 02:08:26 +00:00
Matt Arsenault	a8e894405c	Fix another constant folding address space place I missed. This fixes an assertion failure with a different sized address space. llvm-svn: 194014	2013-11-04 20:46:52 +00:00
Hal Finkel	4d94930bcb	Consider (x == -1) unlikely in BranchProbabilityInfo This adds another heuristic to BPI, similar to the existing heuristic that considers (x == 0) unlikely to be true. As suggested in the PACT'98 paper by Deitrich, Cheng, and Hwu, -1 is often used to indicate an invalid index, and equality comparisons with -1 are also unlikely to succeed. Local experimentation supports this hypothesis: This yields a 1-2% speedup in the test-suite sqlite benchmark on the PPC A2 core, with no significant regressions. llvm-svn: 193855	2013-11-01 10:58:22 +00:00
Rafael Espindola	6554e5a94d	Merge CallGraph and BasicCallGraph. llvm-svn: 193734	2013-10-31 03:03:55 +00:00
Benjamin Kramer	6094f30da2	SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive. We can't do this for the general case as saying a GEP with a negative index doesn't have unsigned wrap isn't valid for negative indices. %gep = getelementptr inbounds i32* %p, i64 -1 But an inbounds GEP cannot run past the end of address space. So we check for the very common case of a positive index and make GEPs derived from that NUW. Together with Andy's recent non-unit stride work this lets us analyze loops like void foo3(int a, int b) { for (; a < b; a++) {} } PR12375, PR12376. Differential Revision: http://llvm-reviews.chandlerc.com/D2033 llvm-svn: 193514	2013-10-28 07:30:06 +00:00
Shuxin Yang	2e1890e18b	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops. llvm-svn: 193489	2013-10-27 03:08:44 +00:00
Wan Xiaofei	be640b28c0	Quick look-up for block in loop. This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng). Below are the compilation time for our benchmark in llc before & after the patch. Benchmark llc - trunk llc - patched 401.bzip2 0.339081 100.00% 0.329657 102.86% 403.gcc 19.853966 100.00% 19.605466 101.27% 429.mcf 0.049823 100.00% 0.048451 102.83% 433.milc 0.514898 100.00% 0.510217 100.92% 444.namd 1.109328 100.00% 1.103481 100.53% 445.gobmk 4.988028 100.00% 4.929114 101.20% 456.hmmer 0.843871 100.00% 0.825865 102.18% 458.sjeng 0.754238 100.00% 0.714095 105.62% 464.h264ref 2.9668 100.00% 2.90612 102.09% 471.omnetpp 4.556533 100.00% 4.511886 100.99% bitmnp01 0.038168 100.00% 0.0357 106.91% idctrn01 0.037745 100.00% 0.037332 101.11% libquake2 3.78689 100.00% 3.76209 100.66% libquake_ 2.251525 100.00% 2.234104 100.78% linpack 0.033159 100.00% 0.032788 101.13% matrix01 0.045319 100.00% 0.043497 104.19% nbench 0.333161 100.00% 0.329799 101.02% tblook01 0.017863 100.00% 0.017666 101.12% ttsprk01 0.054337 100.00% 0.053057 102.41% Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite llvm-svn: 193460	2013-10-26 03:08:02 +00:00
Andrew Trick	57243da70f	Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop. Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would hve to guarantee that the chained AddRec's are in a perfectly reduced form. llvm-svn: 193438	2013-10-25 21:35:56 +00:00
Andrew Trick	29abce3189	Fix LSR: don't normalize quadratic recurrences. Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) ScalarEvolutionNormalization was attempting to normalize by adding and subtracting strides. Chained recurrences don't work that way. llvm-svn: 193437	2013-10-25 21:35:52 +00:00
Rafael Espindola	64cc1b0043	Call destroy from ~BasicCallGraph. This fix a memory leak found by valgrind. Calling it from the base class destructor would not destroy the BasicCallGraph bits. FIXME: BasicCallGraph is the only thing that inherits from CallGraph. Can we merge the two? llvm-svn: 193412	2013-10-25 15:01:34 +00:00
Nuno Lopes	340b0463e6	fix PR17635: false positive with packed structures LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object llvm-svn: 193317	2013-10-24 09:17:24 +00:00
Shuxin Yang	e4fb375995	Use address-taken to disambiguate global variable and indirect memops. Major steps include: 1). introduces a not-addr-taken bit-field in GlobalVariable 2). GlobalOpt pass sets "not-address-taken" if it proves a global varirable dosen't have its address taken. 3). AA use this info for disambiguation. llvm-svn: 193251	2013-10-23 17:28:19 +00:00
Andrew Trick	d5990ad9b6	Clarify SCEV comments. We handle for(i=n; i>0; i -= s) by canonicalizing within SCEV to for(i=-n; i<0; i += s). llvm-svn: 193147	2013-10-22 05:09:40 +00:00
Manman Ren	f3850c0807	TBAA: fix PR17620. We can have a struct type with a single field and the field does not start with 0. In that case, we should correctly update the offset. llvm-svn: 193137	2013-10-22 01:40:25 +00:00
Matt Arsenault	404c60a7c3	Use more type helper functions llvm-svn: 193109	2013-10-21 19:43:56 +00:00
Matt Arsenault	be18b8a3ca	Fix creating bitcasts between address spaces in SCEV. The test before wasn't successfully testing this since it was missing the datalayout piece to change the size of the second address space. llvm-svn: 193102	2013-10-21 18:41:10 +00:00
Matt Arsenault	4ed49b5301	Remove unused SCEV functions llvm-svn: 193097	2013-10-21 18:08:09 +00:00
Andrew Trick	768b917dc8	SCEV should use NSW to get trip count for positive nonunit stride loops. SCEV currently fails to compute loop counts for nonunit stride loops. This comes up frequently. It prevents loop optimization and forces vectorization to insert extra loop checks. For example: void foo(int n, int *x) { for (int i = 0; i < n; i += 3) { x[i] = i; x[i+1] = i+1; x[i+2] = i+2; } } We need to properly handle the case in which limit > INT_MAX-stride. In the above case: n > INT_MAX-3. In this case the loop counter will step beyond the limit and overflow at the same time. However, knowing that signed integer overlow in undefined, we can assume the loop test behavior is arbitrary after overflow. This obeys both C undefined behavior rules, and the more strict LLVM poison value rules. I'm finally fixing this in response to Hal Finkel's persistence. The most probable reason that we never optimized this before is that we were being careful to handle case where the developer expected a side-effect free infinite loop relying on overflow: for (int i = 0; i < n; i += s) { ++j; } return j; If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might expect an infinite loop. However there are plenty of ways to achieve this effect without relying on undefined behavior of signed overflow. llvm-svn: 193015	2013-10-18 23:43:53 +00:00
Craig Topper	ef9e993eaa	Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. llvm-svn: 192672	2013-10-15 05:20:47 +00:00
Matt Arsenault	40dddd7147	Rename DataLayout variables TD -> DL llvm-svn: 191927	2013-10-03 19:50:01 +00:00
Benjamin Kramer	d2757ba1be	CaptureTracking: Plug a loophole in the "too many uses" heuristic. The heuristic was added to avoid spending too much compile time A specially crafted test case (PR17461, PR16474) with many uses on a select or bitcast instruction can still trigger the slow case. Add a check for that case. This only affects compile time, don't have a good way to test it. llvm-svn: 191896	2013-10-03 13:24:02 +00:00

1 2 3 4 5 ...

4677 Commits