llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Wendling	93f70b78fd	Use the enum value of the attributes when adding them to the attributes builder. llvm-svn: 165494	2012-10-09 09:11:20 +00:00
Alexey Samsonov	3b861ec989	Fix PR14016. DeadArgumentElimination pass can replace one LLVM function with another, invalidating a pointer stored in debug info metadata entry for this function. To fix this, we collect debug info descriptors for functions before running a DeadArgumentElimination pass and "patch" pointers in metadata nodes if we replace a function. llvm-svn: 165490	2012-10-09 08:13:15 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Bill Wendling	c1e8e74cbd	Convert to using the Attributes::Builder class to create attributes. llvm-svn: 165468	2012-10-09 00:47:36 +00:00
Nick Lewycky	7c3b5d9444	Give CaptureTracker::shouldExplore a base implementation. Most users want to do the same thing. No functionality change. llvm-svn: 165435	2012-10-08 22:12:48 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Bill Wendling	e8619aa1c1	Use method to query for attributes. llvm-svn: 165209	2012-10-04 06:58:52 +00:00
Bill Wendling	d777398ee4	Add method to query for 'NoAlias' attribute on call/invoke instructions. llvm-svn: 165208	2012-10-04 06:52:09 +00:00
Bill Wendling	d0935f7069	Use method to query for attributes. llvm-svn: 165207	2012-10-04 06:49:41 +00:00
Bill Wendling	9cae65918c	Query for attributes via the correct method call. llvm-svn: 165206	2012-10-04 06:48:57 +00:00
Chandler Carruth	4e4359935b	Turn the new SROA pass back on. Let's see if it sticks this time. =] Again, let me know if anything breaks due to this! llvm-svn: 164986	2012-10-02 04:24:01 +00:00
Benjamin Kramer	8403625123	ArgumentPromotion: Remove ancient workaround for a bug in the C backend. Fun fact: The CBE learned how to deal with this situation before it was removed. llvm-svn: 164918	2012-09-30 17:31:56 +00:00
Evan Cheng	8c6b06d4a0	GlobalDCE should be run at -O2 / -Os to eliminate unused dtor, etc. rdar://9142819 llvm-svn: 164850	2012-09-28 21:23:26 +00:00
Benjamin Kramer	ed84360a45	GlobalOpt: non-constexpr bitcasts or GEPs can occur even if the global value is only stored once. Fixes PR13968. llvm-svn: 164815	2012-09-28 10:01:27 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Nick Lewycky	2e646236fb	Disable the new SROA pass to get the tree back in working order. We don't yet have testcases for the current problems. llvm-svn: 164731	2012-09-26 22:43:04 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
Bill Wendling	eb33723ace	Move Attribute::typeIncompatible inside of the Attributes class. llvm-svn: 164629	2012-09-25 20:38:59 +00:00
Chandler Carruth	8232bf53c6	Enable the new SROA pass by default. Queue the fallout. ;] llvm-svn: 164480	2012-09-24 01:10:25 +00:00
Benjamin Kramer	9bc3efc81c	LNT builders have picked up new SROA, disable it to get the remaining builders green again. llvm-svn: 164124	2012-09-18 13:43:00 +00:00
Chandler Carruth	42cb9cb14f	Add a major missing piece to the new SROA pass: aggressive splitting of FCAs. This is essential in order to promote allocas that are used in struct returns by frontends like Clang. The FCA load would block the rest of the pass from firing, resulting in significant regressions with the bullet benchmark in the nightly test suite. Thanks to Duncan for repeated discussions about how best to do this, and to both him and Benjamin for review. This appears to have blocked many places where the pass tries to fire, and so I expect somewhat different results with this fix added. As with the last big patch, I'm including a change to enable the SROA by default temporarily. Ben is going to remove this as soon as the LNT bots pick up the patch. I'm just trying to get a round of LNT numbers from the stable machines in the lab. NOTE: Four clang tests are expected to fail in the brief window where this is enabled. Sorry for the noise! llvm-svn: 164119	2012-09-18 12:57:43 +00:00
Benjamin Kramer	ed11e35e57	Disable new sroa now that all buildbots have tested it. What we have so far: - Some clang test failures (these were known already) - Perf results are mixed, some big regressions http://llvm.org/perf/db_default/v4/nts/3844 http://llvm.org/perf/db_default/v4/nts/3845 bullet suffers a lot. matmul is interesting: slower scalar code, faster with -vectorize. - Some dragonegg selfhost bots crash in SROA during selfhost now http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.6-self-host-checks/builds/1632 http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.5-self-host/builds/1891 llvm-svn: 163968	2012-09-15 15:11:10 +00:00
Chandler Carruth	70b44c5ccf	Port the SSAUpdater-based promotion logic from the old SROA pass to the new one, and add support for running the new pass in that mode and in that slot of the pass manager. With this the new pass can completely replace the old one within the pipeline. The strategy for enabling or disabling the SSAUpdater logic is to do it by making the requirement of the domtree analysis optional. By default, it is required and we get the standard mem2reg approach. This is usually the desired strategy when run in stand-alone situations. Within the CGSCC pass manager, we disable requiring of the domtree analysis and consequentially trigger fallback to the SSAUpdater promotion. In theory this would allow the pass to re-use a domtree if one happened to be available even when run in a mode that doesn't require it. In practice, it lets us have a single pass rather than two which was simpler for me to wrap my head around. There is a hidden flag to force the use of the SSAUpdater code path for the purpose of testing. The primary testing strategy is just to run the existing tests through that path. One notable difference is that it has custom code to handle lifetime markers, and one of the tests has been enhanced to exercise that code. This has survived a bootstrap and the test suite without serious correctness issues, however my run of the test suite produced very alarming performance numbers. I don't entirely understand or trust them though, so more investigation is on-going. To aid my understanding of the performance impact of the new SROA now that it runs throughout the optimization pipeline, I'm enabling it by default in this commit, and will disable it again once the LNT bots have picked up one iteration with it. I want to get those bots (which are much more stable) to evaluate the impact of the change before I jump to any conclusions. NOTE: Several Clang tests will fail because they run -O3 and check the result's order of output. 
They'll go back to passing once I disable it again. llvm-svn: 163965	2012-09-15 11:43:14 +00:00
Chandler Carruth	6ba9824c2b	Actually keep the flag default-off for now. =/ That's what I get for being busy testing this... llvm-svn: 163890	2012-09-14 10:18:54 +00:00
Chandler Carruth	1b398ae0ae	Introduce a new SROA implementation. This is essentially a ground up re-think of the SROA pass in LLVM. It was initially inspired by a few problems with the existing pass: - It is subject to the bane of my existence in optimizations: arbitrary thresholds. - It is overly conservative about which constructs can be split and promoted. - The vector value replacement aspect is separated from the splitting logic, missing many opportunities where splitting and vector value formation can work together. - The splitting is entirely based around the underlying type of the alloca, despite this type often having little to do with the reality of how that memory is used. This is especially prevalent with unions and base classes where we tail-pack derived members. - When splitting fails (often due to the thresholds), the vector value replacement (again because it is separate) can kick in for preposterous cases where we simply should have split the value. This results in forming i1024 and i2048 integer "bit vectors" that tremendously slow down subsequent IR optimizations (due to large APInts) and impede the backend's lowering. The new design takes an approach that fundamentally is not susceptible to many of these problems. It is the result of a discussion between myself and Duncan Sands over IRC about how to preemptively avoid these types of problems and how to do SROA in a more principled way. Since then, it has evolved and grown, but this remains an important aspect: it fixes real world problems with the SROA process today. First, the transform of SROA actually has little to do with replacement. It has more to do with splitting. The goal is to take an aggregate alloca and form a composition of scalar allocas which can replace it and will be most suitable to the eventual replacement by scalar SSA values. The actual replacement is performed by mem2reg (and in the future SSAUpdater). The splitting is divided into four phases. 
The first phase is an analysis of the uses of the alloca. This phase recursively walks uses, building up a dense datastructure representing the ranges of the alloca's memory actually used and checking for uses which inhibit any aspects of the transform such as the escape of a pointer. Once we have a mapping of the ranges of the alloca used by individual operations, we compute a partitioning of the used ranges. Some uses are inherently splittable (such as memcpy and memset), while scalar uses are not splittable. The goal is to build a partitioning that has the minimum number of splits while placing each unsplittable use in its own partition. Overlapping unsplittable uses belong to the same partition. This is the target split of the aggregate alloca, and it maximizes the number of scalar accesses which become accesses to their own alloca and candidates for promotion. Third, we re-walk the uses of the alloca and assign each specific memory access to all the partitions touched so that we have dense use-lists for each partition. Finally, we build a new, smaller alloca for each partition and rewrite each use of that partition to use the new alloca. During this phase the pass will also work very hard to transform uses of an alloca into a form suitable for promotion, including forming vector operations, speculating loads through PHI nodes and selects, etc. After splitting is complete, each newly refined alloca that is a candidate for promotion to a scalar SSA value is run through mem2reg. There are lots of reasonably detailed comments in the source code about the design and algorithms, and I'm going to be trying to improve them in subsequent commits to ensure this is well documented, as the new pass is in many ways more complex than the old one. Some of this is still a WIP, but the current state is reasonably stable. It has passed bootstrap, the nightly test suite, and Duncan has run it successfully through the ACATS and DragonEgg test suites. 
That said, it remains behind a default-off flag until the last few pieces are in place, and full testing can be done. Specific areas I'm looking at next: - Improved comments and some code cleanup from reviews. - SSAUpdater and enabling this pass inside the CGSCC pass manager. - Some datastructure tuning and compile-time measurements. - More aggressive FCA splitting and vector formation. Many thanks to Duncan Sands for the thorough final review, as well as Benjamin Kramer for lots of review during the process of writing this pass, and Daniel Berlin for reviewing the data structures and algorithms and general theory of the pass. Also, several other people on IRC, over lunch tables, etc for lots of feedback and advice. llvm-svn: 163883	2012-09-14 09:22:59 +00:00
Nadav Rotem	97d44349c9	Fix an 80 char line limit. llvm-svn: 163808	2012-09-13 16:27:32 +00:00
Benjamin Kramer	8bcc971174	Make MemoryBuiltins aware of TargetLibraryInfo. This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements. Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing. Fixes PR13694 and probably others. llvm-svn: 162841	2012-08-29 15:32:21 +00:00
Bill Wendling	8555a37c04	Move the "findUsedStructTypes" functionality outside of the Module class. The "findUsedStructTypes" method is very expensive to run. It needs to be optimized so that LTO can run faster. Splitting this method out of the Module class will help this occur. For instance, it can keep a list of seen objects so that it doesn't process them over and over again. llvm-svn: 161228	2012-08-03 00:30:35 +00:00
Nick Lewycky	7d0f110cb3	It's not safe to blindly remove invoke instructions. This happens when we encounter an invoke of an allocation function. This should fix the dragonegg bootstrap. Testcase to follow, later. llvm-svn: 160757	2012-07-25 21:19:40 +00:00
Nick Lewycky	38be931223	Don't delete one more instruction than we're allowed to. This should fix the Darwin bootstrap. Testcase exists but isn't fully reduced, I expect to commit the testcase this evening. llvm-svn: 160693	2012-07-24 21:33:00 +00:00
Nick Lewycky	faa9c3b035	Teach globalopt to not nuke all stores to globals. Keep them around if they might be deliberate "one time" leaks, so that leak checkers can find them. This is a reapply of r160602 with the fix that this time I'm committing the code I thought I was committing last time; the I->eraseFromParent() goes after the break out of the loop. llvm-svn: 160664	2012-07-24 07:21:08 +00:00
Nick Lewycky	9669c198ba	Revert r160602. llvm-svn: 160603	2012-07-21 09:03:15 +00:00
Nick Lewycky	72b83e5eaa	Teach globalopt to play nice with leak checkers. This is a reapplication of r160529 that was subsequently reverted. The fix was to not call GV->eraseFromParent() right before the caller does the same. The existing testcases already caught this bug if run under valgrind. llvm-svn: 160602	2012-07-21 08:29:45 +00:00
Nick Lewycky	7707e23429	Revert r160529 due to crashes. llvm-svn: 160532	2012-07-19 23:59:21 +00:00
Nick Lewycky	0fa6a28141	Don't wipe out global variables that are probably storing pointers to heap memory. This makes clang play nice with leak checkers. llvm-svn: 160529	2012-07-19 22:35:28 +00:00
Benjamin Kramer	f364a63c3e	Replace some explicit compare loops with std::equal. No functionality change. llvm-svn: 160501	2012-07-19 10:46:05 +00:00
Bill Wendling	ea6397f67b	Remove tabs. llvm-svn: 160477	2012-07-19 00:11:40 +00:00
Duncan Sands	e8ce94fcd7	GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection. llvm-svn: 159546	2012-07-02 18:55:39 +00:00
Chandler Carruth	aafe0918bc	Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h This was always part of the VMCore library out of necessity -- it deals entirely in the IR. The .cpp file in fact was already part of the VMCore library. This is just a mechanical move. I've tried to go through and re-apply the coding standard's preferred header sort, but at 40-ish files, I may have gotten some wrong. Please let me know if so. I'll be committing the corresponding updates to Clang and Polly, and Duncan has DragonEgg. Thanks to Bill and Eric for giving the green light for this bit of cleanup. llvm-svn: 159421	2012-06-29 12:38:19 +00:00
Bill Wendling	e38859dc8e	Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h. The reasoning is because the DebugInfo module is simply an interface to the debug info MDNodes and has nothing to do with analysis. llvm-svn: 159312	2012-06-28 00:05:13 +00:00
Matt Beaumont-Gay	a58862310c	Revert r159136 due to PR13124. Original commit message: If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 159272	2012-06-27 17:10:33 +00:00
Rafael Espindola	540c3d23df	If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 159136	2012-06-25 14:30:31 +00:00
NAKAMURA Takumi	704de074b8	llvm/lib: [CMake] Add explicit dependency to intrinsics_gen. llvm-svn: 159112	2012-06-24 13:32:01 +00:00
Nick Lewycky	b74ae9c5b2	Tab to spaces. No functionality change. llvm-svn: 159104	2012-06-24 04:07:14 +00:00
Hans Wennborg	cbe34b4cc9	Extend the IL for selecting TLS models (PR9788) This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. llvm-svn: 159077	2012-06-23 11:37:03 +00:00
Nuno Lopes	0b60ebbf79	fix whitespace in my last commit. sorry for the churn :S enough for today; going to sleep. llvm-svn: 158953	2012-06-22 00:29:58 +00:00
Nuno Lopes	9792d68381	remove extractMallocCallFromBitCast, since it was tailor-made for its sole user. Update GlobalOpt accordingly. llvm-svn: 158952	2012-06-22 00:25:01 +00:00
Rafael Espindola	1821c6c3b0	Some optimizations done by globalopt are safe only for internal linkage, not linkonce linkage. For example, it is not valid to add unnamed_addr. This also fixes a crash in g++.dg/opt/static5.C. llvm-svn: 158528	2012-06-15 18:00:24 +00:00
Rafael Espindola	def1b09be2	Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and globaldce. Globaldce was already removing linkonce globals, but globalopt was not. llvm-svn: 158476	2012-06-14 22:48:13 +00:00
Benjamin Kramer	bde9176663	Fix typos found by http://github.com/lyda/misspell-check llvm-svn: 157885	2012-06-02 10:20:22 +00:00
Chris Lattner	3cb6f83ebb	switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients. llvm-svn: 157556	2012-05-28 01:47:44 +00:00
Patrik Hägglund	8a1e316c15	Fix the inliner so that the optsize function attribute doesn't alter the inline threshold if the global inline threshold is lower (as for -Oz). Reviewed by Chandler Carruth and Bill Wendling. llvm-svn: 157323	2012-05-23 13:42:57 +00:00
Jay Foad	ca0c499609	Teach Function::hasAddressTaken that BlockAddress doesn't really take the address of a function. llvm-svn: 156703	2012-05-12 08:30:16 +00:00
Chandler Carruth	0fde00150d	Move the CodeExtractor utility to a dedicated header file / source file, and expose it as a utility class rather than as free function wrappers. The simple free-function interface works well for the bugpoint-specific pass's uses of code extraction, but in an upcoming patch for more advanced code extraction, they simply don't expose a rich enough interface. I need to expose various stages of the process of doing the code extraction and query information to decide whether or not to actually complete the extraction or give up. Rather than build up a new predicate model and pass that into these functions, just take the class that was actually implementing the functions and lift it up into a proper interface that can be used to perform code extraction. The interface is cleaned up and re-documented to work better in a header. It also is now setup to accept the blocks to be extracted in the constructor rather than in a method. In passing this essentially reverts my previous commit here exposing a block-level query for eligibility of extraction. That is no longer necessary with the more rich interface as clients can query the extraction object for eligibility directly. This will reduce the number of walks of the input basic block sequence by quite a bit which is useful if this enters the normal optimization pipeline. llvm-svn: 156163	2012-05-04 10:18:49 +00:00
Bill Wendling	82b90a3804	Add a Fixme. llvm-svn: 154793	2012-04-16 04:23:52 +00:00
Hal Finkel	204bf5352a	By default, use Early-CSE instead of GVN for vectorization cleanup. As has been suggested by Duncan and others, Early-CSE and GVN should do similar redundancy elimination, but Early-CSE is much less expensive. Most of my autovectorization benchmarks show a performance regression, but all of these are < 0.1%, and so I think that it is still worth using the less expensive pass. llvm-svn: 154673	2012-04-13 17:15:33 +00:00
Bill Wendling	585583c8dd	Code-gen may inject code into the IR before it emits the ASM. The linker obviously cannot know that this code is present, let alone used. So prevent the internalize pass from internalizing those global values which code-gen may insert. llvm-svn: 154645	2012-04-13 01:06:27 +00:00
Chandler Carruth	7ae90d4d2d	Add two statistics to help track how we are computing the inline cost. Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome. llvm-svn: 154492	2012-04-11 10:15:10 +00:00
Bill Wendling	932b992888	Add an option to turn off the expensive GVN load PRE part of GVN. llvm-svn: 153902	2012-04-02 22:16:50 +00:00
Chandler Carruth	45ae88f5fc	Belatedly address some code review from Chris. As a side note, I really dislike array_pod_sort... Do we really still care about any STL implementations that get this so wrong? Does libc++? llvm-svn: 153834	2012-04-01 10:41:24 +00:00
Chandler Carruth	c5bfb3c0f5	Fix a pretty scary bug I introduced into the always inliner with a single missing character. Somehow, this had gone untested. I've added tests for returns-twice logic specifically with the always-inliner that would have caught this, and fixed the bug. Thanks to Matt for the careful review and spotting this!!! =D llvm-svn: 153832	2012-04-01 10:21:05 +00:00
Chandler Carruth	a88a0faaa3	Give the always-inliner its own custom filter. It shouldn't have to pay the very high overhead of the complex inline cost analysis when all it wants to do is detect three patterns which must not be inlined. Comment the code, clean it up, and leave some hints about possible performance improvements if this ever shows up on a profile. Moving this off of the (now more expensive) inline cost analysis is particularly important because we have to run this inliner even at -O0. llvm-svn: 153814	2012-03-31 13:17:18 +00:00
Chandler Carruth	edd2826f3e	Remove a bunch of empty, dead, and no-op methods from all of these interfaces. These methods were used in the old inline cost system where there was a persistent cache that had to be updated, invalidated, and cleared. We're now doing more direct computations that don't require this intricate dance. Even if we resume some level of caching, it would almost certainly have a simpler and more narrow interface than this. llvm-svn: 153813	2012-03-31 12:48:08 +00:00
Chandler Carruth	0539c071ea	Initial commit for the rewrite of the inline cost analysis to operate on a per-callsite walk of the called function's instructions, in breadth-first order over the potentially reachable set of basic blocks. This is a major shift in how inline cost analysis works to improve the accuracy and rationality of inlining decisions. A brief outline of the algorithm this moves to: - Build a simplification mapping based on the callsite arguments to the function arguments. - Push the entry block onto a worklist of potentially-live basic blocks. - Pop the first block off of the front of the worklist (for breadth-first ordering) and walk its instructions using a custom InstVisitor. - For each instruction's operands, re-map them based on the simplification mappings available for the given callsite. - Compute any simplification possible of the instruction after re-mapping, and store that back into the simplification mapping. - Compute any bonuses, costs, or other impacts of the instruction on the cost metric. - When the terminator is reached, replace any conditional value in the terminator with any simplifications from the mapping we have, and add any successors which are not proven to be dead from these simplifications to the worklist. - Pop the next block off of the front of the worklist, and repeat. - As soon as the cost of inlining exceeds the threshold for the callsite, stop analyzing the function in order to bound cost. The primary goal of this algorithm is to perfectly handle dead code paths. We do not want any code in trivially dead code paths to impact inlining decisions. The previous metric was extremely flawed here, and would always subtract the average cost of two successors of a conditional branch when it was proven to become an unconditional branch at the callsite. 
There was no handling of wildly different costs between the two successors, which would cause inlining when the path actually taken was too large, and no inlining when the path actually taken was trivially simple. There was also no handling of the code path, only the immediate successors. These problems vanish completely now. See the added regression tests for the shiny new features -- we skip recursive function calls, SROA-killing instructions, and high cost complex CFG structures when dead at the callsite being analyzed. Switching to this algorithm required refactoring the inline cost interface to accept the actual threshold rather than simply returning a single cost. The resulting interface is pretty bad, and I'm planning to do lots of interface cleanup after this patch. Several other refactorings fell out of this, but I've tried to minimize them for this patch. =/ There is still more cleanup that can be done here. Please point out anything that you see in review. I've worked really hard to try to mirror at least the spirit of all of the previous heuristics in the new model. It's not clear that they are all correct any more, but I wanted to minimize the change in this single patch, it's already a bit ridiculous. One heuristic that is not yet mirrored is to allow inlining of functions with a dynamic alloca if the caller has a dynamic alloca. I will add this back, but I think the most reasonable way requires changes to the inliner itself rather than just the cost metric, and so I've deferred this for a subsequent patch. The test case is XFAIL-ed until then. As mentioned in the review mail, this seems to make Clang run about 1% to 2% faster in -O0, but makes its binary size grow by just under 4%. I've looked into the 4% growth, and it can be fixed, but requires changes to other parts of the inliner. llvm-svn: 153812	2012-03-31 12:42:41 +00:00
Benjamin Kramer	53dc873342	Internalize: Remove reference of @llvm.noinline, it was replaced with the noinline attribute a long time ago. llvm-svn: 153806	2012-03-31 11:03:47 +00:00
Benjamin Kramer	aa9e4a5e59	GlobalOpt: If we have an inbounds GEP from a ConstantAggregateZero global that we just determined to be constant, replace all loads from it with a zero value. llvm-svn: 153576	2012-03-28 14:50:09 +00:00
Chandler Carruth	b9e35fbc1e	Make a seemingly tiny change to the inliner and fix the generated code size bloat. Unfortunately, I expect this to disable the majority of the benefit from r152737. I'm hopeful at least that it will fix PR12345. To explain this requires... quite a bit of backstory I'm afraid. TL;DR: The change in r152737 actually did The Wrong Thing for linkonce-odr functions. This change makes it do the right thing. The benefits we saw were simple luck, not any actual strategy. Benchmark numbers after a mini-blog-post so that I've written down my thoughts on why all of this works and doesn't work... To understand what's going on here, you have to understand how the "bottom-up" inliner actually works. There are two fundamental modes to the inliner: 1) Standard fixed-cost bottom-up inlining. This is the mode we usually think about. It walks from the bottom of the CFG up to the top, looking at callsites, taking information about the callsite and the called function and computing the expected cost of inlining into that callsite. If the cost is under a fixed threshold, it inlines. It's a touch more complicated than that due to all the bonuses, weights, etc. Inlining the last callsite to an internal function gets higher weight, etc. But essentially, this is the mode of operation. 2) Deferred bottom-up inlining (a term I just made up). This is the interesting mode for this patch and r152737. Initially, this works just like mode #1, but once we have the cost of inlining into the callsite, we don't just compare it with a fixed threshold. First, we check something else. Let's give some names to the entities at this point, or we'll end up hopelessly confused. We're considering inlining a function 'A' into its callsite within a function 'B'. We want to check whether 'B' has any callers, and whether it might be inlined into those callers. If so, we also check whether inlining 'A' into 'B' would block any of the opportunities for inlining 'B' into its callers. 
We take the sum of the costs of inlining 'B' into its callers where that inlining would be blocked by inlining 'A' into 'B', and if that cost is less than the cost of inlining 'A' into 'B', then we skip inlining 'A' into 'B'. Now, in order for #2 to make sense, we have to have some confidence that we will actually have the opportunity to inline 'B' into its callers when cheaper, and that we'll be able to revisit the decision and inline 'A' into 'B' if that ever becomes the correct tradeoff. This often isn't true for external functions -- we can see very few of their callers, and we won't be able to re-consider inlining 'A' into 'B' if 'B' is external when we finally see more callers of 'B'. There are two cases where we believe this to be true for C/C++ code: functions local to a translation unit, and functions with an inline definition in every translation unit which uses them. These are represented as internal linkage and linkonce-odr (resp.) in LLVM. I enabled this logic for linkonce-odr in r152737. Unfortunately, when I did that, I also introduced a subtle bug. There was an implicit assumption that the last caller of the function within the TU was the last caller of the function in the program. We want to bonus the last caller of the function in the program by a huge amount for inlining because inlining that callsite has very little cost. Unfortunately, the last caller in the TU of a linkonce-odr function is not the last caller in the program, and so we don't want to apply this bonus. If we do, we can apply it to one callsite per-TU. Because of the way deferred inlining works, when it sees this bonus applied to one callsite in the TU for 'B', it decides that inlining 'B' is of the utmost importance just so we can get that final bonus. It then proceeds to essentially force deferred inlining regardless of the actual cost tradeoff. The result? PR12345: code bloat, code bloat, code bloat. 
Another result is getting damn lucky on a few benchmarks, and the over-inlining exposing critically important optimizations. I would very much like a list of benchmarks that regress after this change goes in, with bitcode before and after. This will help me greatly understand what opportunities the current cost analysis is missing. Initial benchmark numbers look very good. WebKit files that exhibited the worst of PR12345 went from growing to shrinking compared to Clang with r152737 reverted. - Bootstrapped Clang is 3% smaller with this change. - Bootstrapped Clang -O0 over a single-source-file of lib/Lex is 4% faster with this change. Please let me know about any other performance impact you see. Thanks to Nico for reporting and urging me to actually fix, Richard Smith, Duncan Sands, Manuel Klimek, and Benjamin Kramer for talking through the issues today. llvm-svn: 153506	2012-03-27 10:48:28 +00:00
Chandler Carruth	2121199241	Move the instruction simplification of callsite arguments in the inliner to instead rely on much more generic and powerful instruction simplification in the function cloner (and thus inliner). This teaches the pruning function cloner to use instsimplify rather than just the constant folder to fold values during cloning. This can simplify a large number of things that constant folding alone cannot begin to touch. For example, it will realize that 'or' and 'and' instructions with certain constant operands actually become constants regardless of what their other operand is. It also can thread back through the caller to perform simplifications that are only possible by looking up a few levels. In particular, GEPs and pointer testing tend to fold much more heavily with this change. This should (in some cases) have a positive impact on compile times with optimizations on because the inliner itself will simply avoid cloning a great deal of code. It already attempted to prune proven-dead code, but now it will use the stronger simplifications to prove more code dead. llvm-svn: 153403	2012-03-25 04:03:40 +00:00
Kostya Serebryany	e505a5abe9	add EP_OptimizerLast extension point llvm-svn: 153353	2012-03-23 23:22:59 +00:00
Chandler Carruth	b37fc13a36	Rip out support for 'llvm.noinline'. This thing has a strange history... It was added in 2007 as the first cut at supporting no-inline attributes, but we didn't have function attributes of any form at the time. However, it was added without any mention in the LangRef or other documentation. Later on, in 2008, Devang added function notes for 'inline=never' and then turned them into proper function attributes. From that point onward, as far as I can tell, the world moved on, and no one has touched 'llvm.noinline' in any meaningful way since. Its time has now come. We have had better mechanisms for doing this for a long time, all the frontends I'm aware of use them, and this is just holding back progress. Given that it was never a documented feature of the IR, I've provided no auto-upgrade support. If people know of real, in-the-wild bitcode that relies on this, yell at me and I'll add it, but I seriously doubt anyone cares. llvm-svn: 152904	2012-03-16 06:10:15 +00:00
Chandler Carruth	d7a5f2adb0	Start removing the use of an ad-hoc 'never inline' set and instead directly query the function information which this set was representing. This simplifies the interface of the inline cost analysis, and makes the always-inline pass significantly more efficient. Previously, always-inline would first make a single set of every function in the module except those marked with the always-inline attribute. It would then query this set at every call site to see if the function was a member of the set, and if so, refuse to inline it. This is quite wasteful. Instead, simply check the function attribute directly when looking at the callsite. The normal inliner also had similar redundancy. It added every function in the module with the noinline attribute to its set to ignore, even though inside the cost analysis function we already tested the noinline attribute and produced the same result. The only tricky part of removing this is that we have to be able to correctly remove only the functions inlined by the always-inline pass when finalizing, which requires a bit of a hack. Still, much less of a hack than the set of all non-always-inline functions was. While I was touching this function, I switched a heavy-weight set to a vector with sort+unique. The algorithm already had a two-phase insert and removal pattern, we were just needlessly paying the uniquing cost on every insert. This probably speeds up some compiles by a small amount (-O0 compiles with lots of always-inline, so potentially heavy libc++ users), but I've not tried to measure it. I believe there is no functional change here, but yell if you spot one. None are intended. Finally, the direction this is going in is to greatly simplify the inline cost query interface so that we can replace its implementation with a much more clever one. Along the way, all the APIs get simplified, so it seems incrementally good. llvm-svn: 152903	2012-03-16 06:10:13 +00:00
Chandler Carruth	30b8416d2c	Change where we enable the heuristic that delays inlining into functions which are small enough to themselves be inlined. Delaying in this manner can be harmful if the function is ineligible for inlining in some (or many) contexts as it pessimizes the code of the function itself in the event that inlining does not eventually happen. Previously the check was written to only do this delaying of inlining for static functions in the hope that they could be entirely deleted and in the knowledge that all callers of static functions will have the opportunity to inline if it is in fact profitable. However, with C++ we get two other important sources of functions where the definition is always available for inlining: inline functions and templated functions. This patch generalizes the inliner to allow linkonce-ODR (the linkage such C++ routines receive) to also qualify for this delay-based inlining. Benchmarking across a range of large real-world applications shows roughly 2% size increase across the board, but an average speedup of about 0.5%. Some benchmarks improved over 2%, and the 'clang' binary itself (when bootstrapped with this feature) shows a 1% -O0 performance improvement when run over all Sema, Lex, and Parse source code smashed into a single file. A clean re-build of Clang+LLVM with a bootstrapped Clang shows approximately 2% improvement, but that measurement is often noisy. llvm-svn: 152737	2012-03-14 20:16:41 +00:00
Dan Gohman	eab06fa3c9	Teach globalopt how to evaluate an invoke with a non-void return type. llvm-svn: 152634	2012-03-13 18:01:37 +00:00
Chandler Carruth	595fda8466	When inlining a function and adding its inner call sites to the candidate set for subsequent inlining, try to simplify the arguments to the inner call site now that inlining has been performed. The goal here is to propagate and fold constants through deeply nested call chains. Without doing this, we lose the inliner bonus that should be applied because the arguments don't match the exact pattern the cost estimator uses. Reviewed on IRC by Benjamin Kramer. llvm-svn: 152556	2012-03-12 11:19:33 +00:00
Stepan Dyatkovskiy	5b648afb4d	Take into account Duncan's comments on r149481 from 2 Feb 2012: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html Implemented CaseIterator, which solves almost all of the described issues: we don't need to mix operand/case/successor indexing anymore. The base iterator class is implemented as a template since it may be initialized either from "const SwitchInst" or from "SwitchInst". ConstCaseIt is just a read-only iterator. CaseIt is a read-write iterator; it allows changing the case successor and case value. Using the iterator allows us to remove the resolveXXXX methods entirely. All indexing conversions are done automatically inside the iterator's getters. The main way of using the iterator looks like this: SwitchInst *SI = ... // initialize it somehow for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) { BasicBlock *BB = i.getCaseSuccessor(); ConstantInt *V = i.getCaseValue(); // Do something. } If you want to convert a case number to a TerminatorInst successor index, just use the iterator's getSuccessorIndex method. If you want to initialize an iterator from a TerminatorInst successor index, use the CaseIt::fromSuccessorIndex(...) method. There are also related changes in llvm-clients: klee and clang. llvm-svn: 152297	2012-03-08 07:06:20 +00:00
Benjamin Kramer	93887631d9	Plug a memleak in GlobalOpt. Found by valgrind. llvm-svn: 151525	2012-02-27 12:48:24 +00:00
Chad Rosier	50e0b81ea9	Add comment. llvm-svn: 151431	2012-02-25 03:07:57 +00:00
Chad Rosier	07d37bc1ed	Add support for disabling llvm.lifetime intrinsics in the AlwaysInliner. These are optimization hints, but at -O0 we're not optimizing. This becomes a problem when the alwaysinline attribute is abused. rdar://10921594 llvm-svn: 151429	2012-02-25 02:56:01 +00:00
Chad Rosier	e48e5d2945	Fix indentation. llvm-svn: 151420	2012-02-25 01:10:59 +00:00
Duncan Sands	4730cb9c7c	GCC fails to understand that NextBB is always initialized if EvaluateBlock returns 'true' and emits a warning. Help it out. llvm-svn: 151242	2012-02-23 08:23:06 +00:00
Nick Lewycky	9d0da18597	Use the target-aware constant folder on expressions to improve the chance they'll be simple enough to simulate, and to reduce the chance we'll encounter equal but different simple pointer constants. This removes the symptoms from PR11352 but is not a full fix. A proper fix would either require a guarantee that two constant objects we simulate are folded when equal, or a different way of handling equal pointers (ie., trying a constantexpr icmp on them to see whether we know they're equal or non-equal or unsure). llvm-svn: 151093	2012-02-21 22:08:06 +00:00
Nick Lewycky	519561f418	Check for the correct size in the invariant marker. llvm-svn: 151003	2012-02-20 23:32:26 +00:00
Nick Lewycky	60829a587a	Rename class Evaluate to Evaluator and put it in an anonymous namespace. llvm-svn: 150947	2012-02-20 03:25:59 +00:00
Nick Lewycky	73be5e31a6	Move EvaluateFunction and EvaluateBlock into a class, and make the class store the information that they pass around between them. No functionality change! llvm-svn: 150939	2012-02-19 23:26:27 +00:00
Nick Lewycky	68f9f9d9c8	Add support for invariant.start inside the static constructor evaluator. This is useful to represent a variable that is const in the source but can't be constant in the IR because of a non-trivial constructor. If globalopt evaluates the constructor, and there was an invariant.start with no matching invariant.end possible, it will mark the global constant afterwards. llvm-svn: 150794	2012-02-17 06:59:21 +00:00
Nick Lewycky	c1572e4c90	Handle InvokeInst in EvaluateBlock. Don't try to support exceptions, it's just that no optz'ns have run yet to convert invokes to calls. llvm-svn: 150326	2012-02-12 05:09:35 +00:00
Nick Lewycky	f285256f72	false is totally null! llvm-svn: 150324	2012-02-12 02:17:18 +00:00
Nick Lewycky	4b273cb7ea	Remove redundant getAnalysis<> calls in GlobalOpt. Add a few Itanium ABI calls to TargetLibraryInfo and use one of them in GlobalOpt. llvm-svn: 150323	2012-02-12 02:15:20 +00:00
Nick Lewycky	cf6aae686d	Pass TargetData and TargetLibraryInfo through to the constant folder. Fixes a few fixme's when TLI was added. llvm-svn: 150322	2012-02-12 01:13:18 +00:00
Nick Lewycky	1480f1d3f9	Fix function name in comment to match actual name. Fix comments that are using doxy-style on local variables to not do so. Fix one 80-col violation. llvm-svn: 150320	2012-02-12 00:52:26 +00:00
Nick Lewycky	4231c41c64	Don't traverse the PHI nodes twice. No functionality change! llvm-svn: 150319	2012-02-12 00:47:24 +00:00
Benjamin Kramer	1a4695a091	Tweak comment readability and grammar. llvm-svn: 150183	2012-02-09 16:28:15 +00:00
Benjamin Kramer	487a3962c7	GlobalOpt: Be more aggressive about eliminating side-effect free static dtors. GlobalOpt runs early in the pipeline (before inlining) and complex class hierarchies often introduce bitcasts or GEPs which weren't optimized away. Teach it to ignore side-effect free instructions instead of depending on other passes to remove them. llvm-svn: 150174	2012-02-09 14:26:06 +00:00
Bill Wendling	d5d95b0b51	[unwind removal] We no longer have 'unwind' instructions being generated, so remove the code that handles them. llvm-svn: 149901	2012-02-06 21:16:41 +00:00
Nick Lewycky	239fdf0f61	Split part of EvaluateFunction into a new EvaluateBlock method. No functionality change. llvm-svn: 149861	2012-02-06 08:24:44 +00:00
Nick Lewycky	52da72b12a	Teach GlobalOpt to handle atomic accesses to globals. * Most of the transforms come through intact by having each transformed load or store copy the ordering and synchronization scope of the original. * The transform that turns a global only accessed in main() into an alloca (since main is non-recursive) with a store of the initial value uses an unordered store, since it's guaranteed to be the first thing to happen in main. (Threads may have started before main (!) but they can't have the address of a function local before the point in the entry block we insert our code.) * The heap-SRoA transforms are disabled in the face of atomic operations. This can probably be improved; it seems odd to have atomic accesses to an alloca that doesn't have its address taken. AnalyzeGlobal keeps track of the strongest ordering found in any use of the global. This is more information than we need right now, but it's cheap to compute and likely to be useful. llvm-svn: 149847	2012-02-05 19:56:38 +00:00
Nick Lewycky	bbd1156b95	Clean up some whitespace and comments. No functionality change. llvm-svn: 149845	2012-02-05 19:48:37 +00:00
Stepan Dyatkovskiy	513aaa5691	SwitchInst refactoring. The purpose of the refactoring is to hide operand roles from the SwitchInst user (programmer). If you want to play with operands directly, you will probably need lower level methods than the SwitchInst ones (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want. What was done: 1. Changed the semantics of the index inside the getCaseValue method: getCaseValue(0) means "get first case", not a condition. Use getCondition() if you want to resolve the condition. I propose not mixing SwitchInst case indexing with low level indexing (TI successors indexing, User's operands indexing), since it may be dangerous. 2. For the same reason findCaseValue(ConstantInt*) returns the actual number of the case value. 0 means first case, not default. If there is no case with the given value, ErrorIndex will be returned. 3. Added getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor if you want to resolve a case successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may be changed at any moment. 4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst. 4.1 "resolveSuccessorIndex" was created for when you need to level down from SwitchInst to TerminatorInst. It returns the TerminatorInst's successor index for the given case successor. 4.2 "resolveCaseIndex" converts a low level successor index to the case index that corresponds to the given successor. Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang. llvm-svn: 149481	2012-02-01 07:49:51 +00:00
Hal Finkel	c34e51132c	Add a basic-block autovectorization pass. This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure. Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser). llvm-svn: 149468	2012-02-01 03:51:43 +00:00
Chris Lattner	0256be96f2	continue making the world safe for ConstantDataVector. At this point, we should (theoretically) optimize and codegen ConstantDataVector as well as ConstantVector. llvm-svn: 149116	2012-01-27 03:08:05 +00:00
Chris Lattner	fa77500d96	Continue improving support for ConstantDataAggregate, and use the new methods recently added to (sometimes greatly!) simplify code. llvm-svn: 149024	2012-01-26 02:32:04 +00:00
Chris Lattner	6705883ad8	use Constant::getAggregateElement to simplify a bunch of code. llvm-svn: 148934	2012-01-25 06:48:06 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
Dan Gohman	b9936296d3	Add a new PassManagerBuilder customization point, EP_ModuleOptimizerEarly, to allow passes to be added before the main ModulePass optimizers. llvm-svn: 148329	2012-01-17 20:51:32 +00:00
Eli Friedman	b31c627be1	Re-fix the issue Bill fixed in r147899 in a slightly different way, which doesn't abuse the semantics of linker_private. We don't really want to merge any string constant with a weak_odr global. llvm-svn: 147971	2012-01-11 22:06:46 +00:00
Bill Wendling	c79155192d	If the global variable is removed by the linker, then don't constant merge it with other symbols. An object in the __cfstring section is supposed to be filled with CFString objects, which have a pointer to ___CFConstantStringClassReference followed by a pointer to a __cstring. If we allow the object in the __cstring section to be merged with another global, then it could end up in any section. Because the linker is going to remove these symbols in the final executable, we shouldn't bother to merge them. <rdar://problem/10564621> llvm-svn: 147899	2012-01-11 00:13:08 +00:00
Eli Friedman	55fa49f32d	PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation. llvm-svn: 147625	2012-01-05 23:03:32 +00:00
Nick Lewycky	f740db31e2	SCCCaptured is trivially false on entry to this loop and not modified inside it. Eliminate the dead test for it on each loop iteration. No functionality change. llvm-svn: 147616	2012-01-05 22:21:45 +00:00
Nick Lewycky	4c378a4453	Change CaptureTracking to pass a Use* instead of a Value* when a value is captured. This allows the tracker to look at the specific use, which may be especially interesting for function calls. Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does not iterate until a fixpoint and does not guarantee that it produces the same result regardless of iteration order. The new implementation builds up a graph of how arguments are passed from function to function, and uses a bottom-up walk on the argument-SCCs to assign nocapture. This gets us nocapture more often, and does so rather efficiently and independent of iteration order. llvm-svn: 147327	2011-12-28 23:24:21 +00:00
Daniel Dunbar	27a7489a03	LLVMBuild: Remove trailing newline, which irked me. llvm-svn: 146409	2011-12-12 19:48:00 +00:00
Duncan Sands	8fa0b6927d	Remove unused include. llvm-svn: 146037	2011-12-07 17:18:31 +00:00
Chad Rosier	43a33066b4	Fix a few more places where TargetData/TargetLibraryInfo is not being passed. Add FIXMEs to places that are non-trivial to fix. llvm-svn: 145661	2011-12-02 01:26:24 +00:00
Chad Rosier	e6de63dfc5	Last bit of TargetLibraryInfo propagation. Also fixed a case for TargetData where it appeared beneficial to pass. More of rdar://10500969 llvm-svn: 145630	2011-12-01 21:29:16 +00:00
Kostya Serebryany	dc436f95d2	make asan work at -O0, llvm part. Patch by glider@google.com llvm-svn: 145530	2011-11-30 22:19:26 +00:00
Daniel Dunbar	539d0a8a09	build/CMake: Finish removal of add_llvm_library_dependencies. llvm-svn: 145420	2011-11-29 19:25:30 +00:00
Benjamin Kramer	1f97a5a671	Remove all remaining uses of Value::getNameStr(). llvm-svn: 144648	2011-11-15 16:27:03 +00:00
Daniel Dunbar	52823cc91c	build: Attempt to rectify inconsistencies between CMake and LLVMBuild versions of explicit dependencies. - The hope is that we have a tool/test to verify these are accurate (and tight) soon. llvm-svn: 144444	2011-11-12 02:10:57 +00:00
Daniel Dunbar	2f39f72703	LLVMBuild: Alphabetize required_libraries lists. llvm-svn: 144416	2011-11-11 22:59:23 +00:00
Daniel Dunbar	bf9bba47a1	build: Add initial cut at LLVMBuild.txt files. llvm-svn: 143634	2011-11-03 18:53:17 +00:00
Eli Friedman	1923a330e6	Refactor code from inlining and globalopt that checks whether a function definition is unused, and enhance it so it can tell that functions which are only used by a blockaddress are in fact dead. This probably doesn't happen much on most code, but the Linux kernel's _THIS_IP_ can trigger this issue with blockaddress. (GlobalDCE can also handle the given testcase, but we only run that at -O3.) Found while looking at PR11180. llvm-svn: 142572	2011-10-20 05:23:42 +00:00
Andrew Trick	f7656015fc	Inlining and unrolling heuristics should be aware of free truncs. We want heuristics to be based on accurate data, but more importantly we don't want llvm to behave randomly. A benign trunc inserted by an upstream pass should not cause wild swings in optimization level. See PR11034. It's a general problem with threshold-based heuristics, but we can make it less bad. llvm-svn: 140919	2011-10-01 01:39:05 +00:00
Andrew Trick	caa500bf93	whitespace llvm-svn: 140916	2011-10-01 01:27:56 +00:00
Benjamin Kramer	547b6c5ecd	Stop emitting instructions with the name "tmp" they eat up memory and have to be uniqued, without any benefit. If someone prefers %tmp42 to %42, run instnamer. llvm-svn: 140634	2011-09-27 20:39:19 +00:00
Bill Wendling	04289fcad8	Place the check for an exit landing pad where it will be run on both code paths through the if-then-else. llvm-svn: 140195	2011-09-20 22:27:16 +00:00
Bill Wendling	0058520770	Omit extracting a loop if one of the exits is a landing pad. The landing pad must accompany the invoke when it's extracted. However, if it does, then the loop isn't properly extracted. I.e., the resulting extraction has a loop in it. The extracted function is then extracted, etc. resulting in an infinite loop. llvm-svn: 140193	2011-09-20 22:23:09 +00:00
Bill Wendling	3d48f59231	Check the terminator, not the basic block. llvm-svn: 140176	2011-09-20 20:20:50 +00:00
Bill Wendling	c1da6ea344	When extracting a basic block that ends in an 'invoke' instruction, we need to extract its associated landing pad block as well. However, that landing pad block may have more than one predecessor. So split the landing pad block so that individual landing pads have only one predecessor. This type of transformation may produce a false positive with bugpoint. llvm-svn: 140173	2011-09-20 19:10:24 +00:00
Benjamin Kramer	5a656883b1	C API functions must be able to see their extern "C" definitions, or it will be impossible to call them from C. llvm-svn: 138022	2011-08-19 01:36:54 +00:00
David Chisnall	719a72f34c	Add a mechanism for optimisation plugins to register passes that all front ends can use without needing to be aware of the plugin (or the plugin be aware of the front end). Before 3.0, I'd like to add a mechanism for automatically loading a set of plugins from a config file. API suggestions welcome... llvm-svn: 137717	2011-08-16 13:58:41 +00:00
Eli Friedman	a917d4f9b4	Revert a bit of r137667; the logic in question can safely handle atomic load/store. llvm-svn: 137702	2011-08-16 01:28:22 +00:00
Eli Friedman	b8f30de527	Minor comment fixes. llvm-svn: 137693	2011-08-16 00:20:11 +00:00
Eli Friedman	211e348eaa	Update inter-procedural optimizations for atomic load/store. llvm-svn: 137667	2011-08-15 22:16:46 +00:00
Bill Wendling	88294cdbe0	Mark the SCC as "might unwind" if we run into a 'resume' instruction. llvm-svn: 137627	2011-08-15 18:22:00 +00:00
Chris Lattner	335d399a0e	switch to use the new api for structtypes. llvm-svn: 137480	2011-08-12 18:06:37 +00:00
Rafael Espindola	07f6091527	Add a C interface to PassManagerBuilder. It is missing the addExtension functionality since in the C api a pass is created and added to a pass manager in a single call. llvm-svn: 137159	2011-08-09 22:17:34 +00:00
Bill Wendling	2d3138c112	Remove the LowerSetJmp pass. It wasn't used effectively by any of the targets. This is some of my original LLVM code. *wipes tear* llvm-svn: 136821	2011-08-03 22:18:20 +00:00
Rafael Espindola	3ea478b7ac	Move methods in PassManagerBuilder offline. llvm-svn: 136727	2011-08-02 21:50:27 +00:00
Bill Wendling	f891bf8b30	Add the 'resume' instruction for the new EH rewrite. This adds the 'resume' instruction class, IR parsing, and bitcode reading and writing. The 'resume' instruction resumes propagation of an existing (in-flight) exception whose unwinding was interrupted with a 'landingpad' instruction (to be added later). llvm-svn: 136589	2011-07-31 06:30:59 +00:00
Bill Wendling	ad088e6724	Revert r136253, r136263, r136269, r136313, r136325, r136326, r136329, r136338, r136339, r136341, r136369, r136387, r136392, r136396, r136429, r136430, r136444, r136445, r136446, r136253 pending review. llvm-svn: 136556	2011-07-30 05:42:50 +00:00
Eli Friedman	adec587d5c	Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457	2011-07-29 03:05:32 +00:00
Chandler Carruth	9d7feab3e0	Rewrite the CMake build to use explicit dependencies between libraries, specified in the same file that the library itself is created. This is more idiomatic for CMake builds, and also allows us to correctly specify dependencies that are missed due to bugs in the GenLibDeps perl script, or change from compiler to compiler. On Linux, this returns CMake to a place where it can reliably rebuild several targets of LLVM. I have tried not to change the dependencies from the ones in the current auto-generated file. The only places I've really diverged are in places where I was seeing link failures, and added a dependency. The goal of this patch is not to start changing the dependencies, merely to move them into the correct location, and an explicit form that we can control and change when necessary. This also removes a serialization point in the build because we don't have to scan all the libraries before we begin building various tools. We no longer have a step of the build that regenerates a file inside the source tree. A few other associated cleanups fall out of this. This isn't really finished yet though. After talking to dgregor he urged switching to a single CMake macro to construct libraries with both sources and dependencies in the arguments. Migrating from the two macros to that style will be a follow-up patch. Also, llvm-config is still generated with GenLibDeps.pl, which means it still has slightly buggy dependencies. The internal CMake 'llvm-config-like' macro uses the correct explicitly specified dependencies however. A future patch will switch llvm-config generation (when using CMake) to be based on these deps as well. This may well break Windows. I'm getting a machine set up now to dig into any failures there. If anyone can chime in with problems they see or ideas of how to solve them for Windows, much appreciated. llvm-svn: 136433	2011-07-29 00:14:25 +00:00
Bill Wendling	6c923bb8d9	Merge the contents from exception-handling-rewrite to the mainline. This adds the new instructions 'landingpad' and 'resume'. llvm-svn: 136253	2011-07-27 20:18:04 +00:00
Nick Lewycky	8ac9ecedfd	Teach the ConstantMerge pass about alignment. Fixes PR10514! llvm-svn: 136250	2011-07-27 19:47:34 +00:00
Rafael Espindola	b84dc6bca8	Add LLVMAddAlwaysInlinerPass to the C API. llvm-svn: 136083	2011-07-26 15:23:23 +00:00
Rafael Espindola	be2fe29f9c	LLVM 3.0 is here, remove old do nothing method. llvm-svn: 136082	2011-07-26 15:17:32 +00:00
Jay Foad	d1b7849d49	Convert GetElementPtrInst to use ArrayRef. llvm-svn: 135904	2011-07-25 09:48:08 +00:00
Jay Foad	17bab44308	Fix more MSVC warnings caused by cases I missed when converting ConstantExpr::getGetElementPtr to use ArrayRef. llvm-svn: 135762	2011-07-22 08:52:50 +00:00
Jay Foad	2f5fc8c67d	Make better use of ConstantExpr::getGetElementPtr's InBounds parameter. llvm-svn: 135676	2011-07-21 15:15:37 +00:00
Jay Foad	ed8db7d9df	Convert ConstantExpr::getGetElementPtr and ConstantExpr::getInBoundsGetElementPtr to use ArrayRef. llvm-svn: 135673	2011-07-21 14:31:17 +00:00
Chris Lattner	5cf753c95e	move tier out of an anonymous namespace, it doesn't make sense for it to be in an anon namespace and be in a header. Eliminate some extraneous uses of tie. llvm-svn: 135669	2011-07-21 06:21:31 +00:00
Jay Foad	bf904773bb	Convert TargetData::getIndexedOffset to use ArrayRef. llvm-svn: 135478	2011-07-19 14:01:37 +00:00
Jay Foad	f4b14a2b0d	Use ArrayRef in ConstantFoldInstOperands and ConstantFoldCall. llvm-svn: 135477	2011-07-19 13:32:40 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Jay Foad	5bd375a6cc	Convert CallInst and InvokeInst APIs to use ArrayRef. llvm-svn: 135265	2011-07-15 08:37:34 +00:00
Jay Foad	b804a2b751	Second attempt at de-constifying LLVM Types in FunctionType::get(), StructType::get() and TargetData::getIntPtrType(). llvm-svn: 134982	2011-07-12 14:06:48 +00:00
Bill Wendling	a78cd228c2	Revert r134893 and r134888 (and related patches in other trees). It was causing an assert on Darwin llvm-gcc builds. Assertion failed: (castIsValid(op, S, Ty) && "Invalid cast!"), function Create, file /Users/buildslave/zorg/buildbot/smooshlab/slave-0.8/build.llvm-gcc-i386-darwin9-RA/llvm.src/lib/VMCore/Instructions.cpp, line 2067. etc. http://smooshlab.apple.com:8013/builders/llvm-gcc-i386-darwin9-RA/builds/2354 --- Reverse-merging r134893 into '.': U include/llvm/Target/TargetData.h U include/llvm/DerivedTypes.h U tools/bugpoint/ExtractFunction.cpp U unittests/Support/TypeBuilderTest.cpp U lib/Target/ARM/ARMGlobalMerge.cpp U lib/Target/TargetData.cpp U lib/VMCore/Constants.cpp U lib/VMCore/Type.cpp U lib/VMCore/Core.cpp U lib/Transforms/Utils/CodeExtractor.cpp U lib/Transforms/Instrumentation/ProfilingUtils.cpp U lib/Transforms/IPO/DeadArgumentElimination.cpp U lib/CodeGen/SjLjEHPrepare.cpp --- Reverse-merging r134888 into '.': G include/llvm/DerivedTypes.h U include/llvm/Support/TypeBuilder.h U include/llvm/Intrinsics.h U unittests/Analysis/ScalarEvolutionTest.cpp U unittests/ExecutionEngine/JIT/JITTest.cpp U unittests/ExecutionEngine/JIT/JITMemoryManagerTest.cpp U unittests/VMCore/PassManagerTest.cpp G unittests/Support/TypeBuilderTest.cpp U lib/Target/MBlaze/MBlazeIntrinsicInfo.cpp U lib/Target/Blackfin/BlackfinIntrinsicInfo.cpp U lib/VMCore/IRBuilder.cpp G lib/VMCore/Type.cpp U lib/VMCore/Function.cpp G lib/VMCore/Core.cpp U lib/VMCore/Module.cpp U lib/AsmParser/LLParser.cpp U lib/Transforms/Utils/CloneFunction.cpp G lib/Transforms/Utils/CodeExtractor.cpp U lib/Transforms/Utils/InlineFunction.cpp U lib/Transforms/Instrumentation/GCOVProfiling.cpp U lib/Transforms/Scalar/ObjCARC.cpp U lib/Transforms/Scalar/SimplifyLibCalls.cpp U lib/Transforms/Scalar/MemCpyOptimizer.cpp G lib/Transforms/IPO/DeadArgumentElimination.cpp U lib/Transforms/IPO/ArgumentPromotion.cpp U lib/Transforms/InstCombine/InstCombineCompares.cpp U lib/Transforms/InstCombine/InstCombineAndOrXor.cpp U lib/Transforms/InstCombine/InstCombineCalls.cpp U lib/CodeGen/DwarfEHPrepare.cpp U lib/CodeGen/IntrinsicLowering.cpp U lib/Bitcode/Reader/BitcodeReader.cpp llvm-svn: 134949	2011-07-12 01:15:52 +00:00
Jay Foad	7c57be3e2b	De-constify Types in StructType::get() and TargetData::getIntPtrType(). llvm-svn: 134893	2011-07-11 09:56:20 +00:00
Jay Foad	56cc1530ee	De-constify Types in FunctionType::get(). llvm-svn: 134888	2011-07-11 07:56:41 +00:00
Chris Lattner	6b96757745	remove the DerivedType which isn't adding value anymore. llvm-svn: 134832	2011-07-09 17:59:15 +00:00
Chris Lattner	b1ed91f397	Land the long talked about "type system rewrite" patch. This patch brings numerous advantages to LLVM. One way to look at it is through diffstat: 109 files changed, 3005 insertions(+), 5906 deletions(-) Removing almost 3K lines of code is a good thing. Other advantages include: 1. Value::getType() is a simple load that can be CSE'd, not a mutating union-find operation. 2. Types are uniqued and never move once created, defining away PATypeHolder. 3. Structs can be "named" now, and their name is part of the identity that uniques them. This means that the compiler doesn't merge them structurally which makes the IR much less confusing. 4. Now that there is no way to get a cycle in a type graph without a named struct type, "upreferences" go away. 5. Type refinement is completely gone, which should make LTO much MUCH faster in some common cases with C++ code. 6. Types are now generally immutable, so we can use "Type *" instead of "const Type *" everywhere. Downsides of this patch are that it removes some functions from the C API, so people using those will have to upgrade to (not yet added) new API. "LLVM 3.0" is the right time to do this. There are still some cleanups pending after this, this patch is large enough as-is. llvm-svn: 134829	2011-07-09 17:41:24 +00:00
Chris Lattner	cc19efaa97	Revamp the "ConstantStruct::get" methods. Previously, these were scattered all over the place in different styles and variants. Standardize on two preferred entrypoints: one that takes a StructType and ArrayRef, and one that takes StructType and varargs. In cases where there isn't a struct type convenient, we now add a ConstantStruct::getAnon method (whose name will make more sense after a few more patches land). It would be "really really nice" if the ConstantStruct::get and ConstantVector::get methods didn't make temporary std::vectors. llvm-svn: 133412	2011-06-20 04:01:31 +00:00
John McCall	58fb52c6c7	When deleting a basic block, remove call edges only for non-intrinsics. llvm-svn: 132803	2011-06-09 20:31:09 +00:00
Rafael Espindola	b77c00fb60	Improve the handling of available_externally and llvm.global_ctors. llvm-svn: 132775	2011-06-09 14:38:09 +00:00
Nick Lewycky	c66d455e50	Don't crash when ComputeLoadResult can't compute the result of the load. llvm-svn: 132290	2011-05-29 19:33:36 +00:00
Nick Lewycky	a3bb03e400	Obey the isVolatile bit on memory intrinsics when analyzing uses of a global variable. Noticed by inspection. Simulate memset in EvaluateFunction where the target of the memset and the value we're setting are both the null value. Fixes PR10047! llvm-svn: 132288	2011-05-29 18:41:56 +00:00
Chris Lattner	1a1acc2191	fix PR9856, an incorrectly conservative assertion: a global can be "stored once" even if its address is compared. llvm-svn: 131849	2011-05-22 07:15:13 +00:00
Julien Lerouge	7e11f9e26d	Fix a source of non determinism in FindUsedTypes, use a SetVector instead of a set. rdar://9423996 llvm-svn: 131283	2011-05-13 05:20:42 +00:00
Devang Patel	3fd06f760b	Preserve line number information. llvm-svn: 131112	2011-05-10 00:03:11 +00:00
Jay Foad	1a180156b6	Remove unused STL header includes. llvm-svn: 130068	2011-04-23 19:53:52 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Chris Lattner	e81d045d94	remove the StructRetPromotion pass. It is unused, not maintained and has some bugs. If this is interesting functionality, it should be reimplemented in the argpromotion pass. llvm-svn: 129314	2011-04-11 23:09:44 +00:00
Nick Lewycky	0f85789800	Just because a GlobalVariable's initializer is [N x { i32, void ()* }] doesn't mean that it has to be ConstantArray of ConstantStruct. We might have ConstantAggregateZero, at either level, so don't crash on that. Also, semi-deprecate the sentinel value. The linker isn't aware of sentinels so we end up with the two lists appended, each with their "sentinels" on them. Different parts of LLVM treated sentinels differently, so make them all just ignore the single entry and continue on with the rest of the list. llvm-svn: 129307	2011-04-11 22:11:20 +00:00
Jay Foad	7c14a558fe	Don't include Operator.h from InstrTypes.h. llvm-svn: 129271	2011-04-11 09:35:34 +00:00
Eli Friedman	9cca0715aa	Add back a couple checks removed by r129128; the fact that an initializer is an array of structures doesn't imply it's a ConstantArray of ConstantStruct. llvm-svn: 129207	2011-04-09 09:11:09 +00:00
Nick Lewycky	466d0c1f93	llvm.global_[cd]tor is defined to be either external, or appending with an array of { i32, void ()* }. Teach the verifier to verify that, deleting copies of checks strewn about. llvm-svn: 129128	2011-04-08 07:30:21 +00:00
Jay Foad	11522097be	Remove some support for ReturnInsts with multiple operands, and for returning a scalar value in a function whose return type is a single-element structure or array. llvm-svn: 128810	2011-04-04 07:44:02 +00:00
Jay Foad	52131344a2	Remove PHINode::reserveOperandSpace(). Instead, add a parameter to PHINode::Create() giving the (known or expected) number of operands. llvm-svn: 128537	2011-03-30 11:28:46 +00:00
Jay Foad	e0938d8a87	(Almost) always call reserveOperandSpace() on newly created PHINodes. llvm-svn: 128535	2011-03-30 11:19:20 +00:00
Nick Lewycky	0e25c8b364	No functionality change, just adjust some whitespace for coding style compliance. llvm-svn: 128257	2011-03-25 06:05:50 +00:00
Anders Carlsson	1cc8073bb3	Handle another case that Frits suggested. llvm-svn: 128068	2011-03-22 03:21:01 +00:00
Anders Carlsson	4dd420f193	More cleanups to the OptimizeEmptyGlobalCXXDtors GlobalOpt function. llvm-svn: 127997	2011-03-21 14:54:40 +00:00
Anders Carlsson	701822a48e	As suggested by Nick Lewycky, ignore debugging intrinsics when trying to decide whether a destructor is empty or not. llvm-svn: 127985	2011-03-21 02:42:27 +00:00
Nick Lewycky	d078183725	Fix comments llvm-svn: 127984	2011-03-21 02:26:01 +00:00
Anders Carlsson	336fd90f4d	Don't try to eliminate invokes to __cxa_atexit. llvm-svn: 127976	2011-03-20 20:21:33 +00:00
Anders Carlsson	fcec2f519a	Don't segfault on mutual recursion, as pointed out by Frits. llvm-svn: 127975	2011-03-20 20:16:43 +00:00
Anders Carlsson	48a44911d3	Address comments from Frits van Bommel. llvm-svn: 127974	2011-03-20 19:51:13 +00:00
Anders Carlsson	ee6bc70d2f	Add an optimization to GlobalOpt that eliminates calls to __cxa_atexit, if the function passed is empty. llvm-svn: 127970	2011-03-20 17:59:11 +00:00
Devang Patel	a10794ab7b	These llvm.dbg.* constants are not used anymore. llvm-svn: 127352	2011-03-09 19:41:33 +00:00
Rafael Espindola	871cfde1c2	Don't internalize available_externally functions. We already did the right thing for variables. llvm-svn: 127138	2011-03-06 23:41:34 +00:00
Eli Friedman	683bbc16c4	Add an obvious missing safety check to DAE::RemoveDeadArgumentsFromCallers. llvm-svn: 126720	2011-03-01 00:33:47 +00:00
Nick Lewycky	080ea93779	Instead of keeping two Value*->id# mappings, keep one Value->Value mapping and one Value set. This is faster because we only need to use the set when there isn't already an entry in the map. No functionality change! llvm-svn: 126076	2011-02-20 08:11:03 +00:00
Chris Lattner	69229316aa	convert ConstantVector::get to use ArrayRef. llvm-svn: 125537	2011-02-15 00:14:00 +00:00
Chris Lattner	34442e6ebf	revert my ConstantVector patch, it seems to have made the llvm-gcc builders unhappy. llvm-svn: 125504	2011-02-14 18:15:46 +00:00
Chris Lattner	d9f5b88548	Switch ConstantVector::get to use ArrayRef instead of a pointer+size idiom. Change various clients to simplify their code. llvm-svn: 125487	2011-02-14 07:55:32 +00:00
Nick Lewycky	292e78c3cd	When removing a function from the function set and adding it to deferred, we could end up removing a different function than we intended because it was functionally equivalent, then end up with a comparison of a function against itself in the next round of comparisons (the one in the function set and the one on the deferred list). To fix this, I introduce a choice in the form of comparison for ComparableFunctions, either normal or "pointer only" used to find exact Function*'s in lookups. Also add some debugging statements. llvm-svn: 125180	2011-02-09 06:32:02 +00:00
Nick Lewycky	cb1a4c26ee	Simplify away redundant test, and document what's going on. llvm-svn: 124977	2011-02-06 05:04:00 +00:00
Nick Lewycky	f8797fda44	Remove specialized comparison of InlineAsm objects. They're uniqued on creation now, and this wasn't comparing some of their relevant bits anyhow. llvm-svn: 124976	2011-02-06 04:33:50 +00:00
Nick Lewycky	a46c898314	Remove wasteful caching. This isn't needed for correctness because any function that might have been affected by a merge elsewhere will have been removed from the function set, and it isn't needed for performance because we call grow() ahead of time to prevent reallocations. llvm-svn: 124717	2011-02-02 05:31:01 +00:00
Nick Lewycky	cfb284cf96	Rename functions to follow coding standard. Also rejiggers comments. No functionality change. llvm-svn: 124482	2011-01-28 08:43:14 +00:00
Nick Lewycky	aaf401241a	Add a doxygen comment for this class. llvm-svn: 124480	2011-01-28 08:19:00 +00:00
Nick Lewycky	564fcca856	Reorder for readability. (Chris, is this what you meant?) llvm-svn: 124479	2011-01-28 07:36:21 +00:00
Nick Lewycky	c5eb3733f7	Reduce the number of functions we look at in the first pass, and preallocate the function equality set. llvm-svn: 124475	2011-01-28 05:48:15 +00:00
Benjamin Kramer	57e3d65884	Unbreak the build. llvm-svn: 124426	2011-01-27 20:30:54 +00:00
Nick Lewycky	e2d46d30ae	Expound upon this comparison! llvm-svn: 124406	2011-01-27 19:51:31 +00:00
Nick Lewycky	5a37e950e1	Use dyn_cast instead of isa+cast. llvm-svn: 124404	2011-01-27 19:42:43 +00:00
Nick Lewycky	13e04aef2a	Fix surprising missed optimization in mergefunc where we forgot to consider that relationships like "i8* null" is equivalent to "i32* null". llvm-svn: 124368	2011-01-27 08:38:19 +00:00
Nick Lewycky	91543447a6	AttrListPtr has an overloaded operator== which does this for us, we should use it. No functionality change! llvm-svn: 124286	2011-01-26 09:23:19 +00:00
Nick Lewycky	82d4db8662	Teach mergefunc that intptr_t is the same width as a pointer. We still can't merge vector<intptr_t>::push_back() and vector<void>::push_back() because Enumerate() doesn't realize that "i64 null" and "i8** null" are equivalent. llvm-svn: 124285	2011-01-26 09:13:58 +00:00
Nick Lewycky	fb622f9920	There are no vectors of pointer or arrays, so we don't need to check vector elements for type equivalence. llvm-svn: 124284	2011-01-26 08:50:18 +00:00
Nick Lewycky	f1cec164ce	Teach mergefunc how to emit aliases safely again -- but keep it turned off for now. It's controlled by the HasGlobalAliases variable which is not attached to any flag yet. llvm-svn: 124182	2011-01-25 08:56:50 +00:00
Rafael Espindola	fc355bc070	Add unnamed_addr when we can show that address of a global is not used. llvm-svn: 123834	2011-01-19 16:32:21 +00:00
Rafael Espindola	ecd5b9abe9	Reduce indentation and remove commented out code. llvm-svn: 123729	2011-01-18 04:36:06 +00:00
Anders Carlsson	d3db83349e	Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undefvalue instead. llvm-svn: 123596	2011-01-16 21:25:33 +00:00
Rafael Espindola	751677a040	Don't merge two constants if we care about the address of both. This fixes the original testcase in PR8927. It also causes a clang binary built with a patched clang to increase in size by 0.21%. We can probably get some of the size back by writing a pass that detects that a global never has its pointer compared and adds unnamed_addr to it (maybe extend global opt). It is also possible that there are some other cases clang could add unnamed_addr to. I will investigate extending globalopt next. llvm-svn: 123584	2011-01-16 17:05:09 +00:00
Chris Lattner	e5f8de8639	fix PR8932, a case where arg promotion could infinitely promote. llvm-svn: 123574	2011-01-16 08:09:24 +00:00
Owen Anderson	4e54efd625	Improve the safety of my globalopt enhancement by ensuring that the bitcast of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563	2011-01-16 04:33:33 +00:00
Chris Lattner	8b4952fcf7	simplify this code, it is still broken but will follow up on llvm-commits. llvm-svn: 123558	2011-01-16 02:05:10 +00:00
Chris Lattner	1e209b87ad	remove the partial specialization pass. It is unmaintained and has bugs. llvm-svn: 123554	2011-01-16 00:27:10 +00:00
Nick Lewycky	4a1ff16b29	Add missing whitespace. llvm-svn: 123543	2011-01-15 18:42:52 +00:00
Nick Lewycky	0296a481f9	Make constmerge a two-pass algorithm so that it won't miss merging opportunities. Fixes PR8978. llvm-svn: 123541	2011-01-15 18:14:21 +00:00
Benjamin Kramer	ed5f2e504e	Try to unbreak selfhost. llvm-svn: 123537	2011-01-15 11:25:34 +00:00
Nick Lewycky	540f9536c8	Add a cache that protects mergefunc's internals from more surprises in DenseSet. Also, replace tabs with spaces. Yes, it's 2011. llvm-svn: 123535	2011-01-15 10:16:23 +00:00
Owen Anderson	3e2f6cf7ae	Fix a false-positive warning. llvm-svn: 123480	2011-01-14 22:31:13 +00:00
Owen Anderson	9eb7cb48e4	Enhance GlobalOpt to be able to evaluate initializers that involve stores through bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp llvm-svn: 123477	2011-01-14 22:19:20 +00:00
Dale Johannesen	a71d2cc88d	Improve the accuracy of the inlining heuristic looking for the case where a static caller is itself inlined everywhere else, and thus may go away if it doesn't get too big due to inlining other things into it. If there are references to the caller other than calls, it will not be removed; account for this. This results in same-day completion of the case in PR8853. llvm-svn: 122821	2011-01-04 19:01:54 +00:00
Nick Lewycky	5361b84184	Also remove functions that use complex constant expressions in terms of another function. llvm-svn: 122705	2011-01-02 19:16:44 +00:00
Nick Lewycky	4e250c8245	Remove functions from the FnSet when one of their callees is being merged. This maintains the guarantee that the DenseSet expects two elements it contains to not go from inequal to equal under its nose. As a side-effect, this also lets us switch from iterating to a fixed-point to actually maintaining a work queue of functions to look at again, and we don't add thunks to our work queue so we don't need to detect and ignore them. llvm-svn: 122677	2011-01-02 02:46:33 +00:00
Chris Lattner	1903c42b97	fix a globalopt crash on two Adobe-C++ testcases that the recent loop idiom pass exposed. llvm-svn: 122674	2011-01-01 22:31:46 +00:00
Chris Lattner	0d71c4f564	reapply r121100 with a tweak to constant fold ConstExprs with TargetData (if available) as we go so that we get simple constantexprs not insane ones. This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp that the previous iteration of this patch had. llvm-svn: 121111	2010-12-07 04:33:29 +00:00
Eric Christopher	f10dcfb9fb	Temporarily revert r121100 as it's causing clang to fail CodeGenCXX/virtual-base-ctor.cpp. llvm-svn: 121102	2010-12-07 02:41:11 +00:00
Chris Lattner	287f4366c1	fix PR8710 - teach global opt that some constantexprs are too complex to put in a global variable's initializer. llvm-svn: 121100	2010-12-07 01:59:32 +00:00
Chris Lattner	7ff0ba41bd	replace a linear scan with a symtab lookup, reduce indentation. No functionality change. llvm-svn: 121042	2010-12-06 21:53:07 +00:00
Chris Lattner	fb212de06d	Fix PR8735, a really terrible problem in the inliner's "alloca merging" optimization. Consider: static void foo() { A = alloca ... } static void bar() { B = alloca ... call foo(); } void main() { bar() } The inliner proceeds bottom up, but let's pretend it decides not to inline foo into bar. When it gets to main, it inlines bar into main(), and says "hey, I just inlined an alloca "B" into main, let's remember that". Then it keeps going and finds that it now contains a call to foo. It decides to inline foo into main, and says "hey, foo has an alloca A, and I have an alloca B from another inlined call site, let's reuse it". The problem with this, of course, is that the lifetimes of A and B are nested, not disjoint. Unfortunately I can't create a reasonable testcase for this: the one in the PR is both huge and extremely sensitive, because minor tweaks end up causing foo to get inlined into bar too early. We already have tests for the basic alloca merging optimization and this does not break them. llvm-svn: 120995	2010-12-06 07:52:42 +00:00
Chris Lattner	5b6a865f2e	improve -debug output and comments a little. llvm-svn: 120993	2010-12-06 07:38:40 +00:00
Dan Gohman	65316d6749	Add helper functions for computing the Location of load, store, and vaarg instructions. llvm-svn: 118845	2010-11-11 21:50:19 +00:00
Dan Gohman	a826a88755	Factor out Instruction::isSafeToSpeculativelyExecute's code for testing for dereferenceable pointers into a helper function, isDereferenceablePointer. Teach it how to reason about GEPs with simple non-zero indices. Also eliminate ArgumentPromotion's IsAlwaysValidPointer, which didn't check for weak externals or out of range gep indices. llvm-svn: 118840	2010-11-11 21:23:25 +00:00
Dan Gohman	dcdfd8dd24	TBAA-enable ArgumentPromotion. llvm-svn: 118804	2010-11-11 18:09:32 +00:00
Dan Gohman	066c1bb1e9	Add a doesAccessArgPointees helper function, and update code to use it, and to be consistent. llvm-svn: 118692	2010-11-10 18:17:28 +00:00
Dan Gohman	2577580967	Factor out the code for testing whether a function accesses arbitrary memory into a helper function, and adjust some comments. llvm-svn: 118687	2010-11-10 17:34:04 +00:00
Dan Gohman	2694e14087	Make ModRefBehavior a lattice. Use this to clean up AliasAnalysis chaining and simplify FunctionAttrs' GetModRefBehavior logic. llvm-svn: 118660	2010-11-10 01:02:18 +00:00
Dan Gohman	e3467a7687	Teach FunctionAttrs about the VAArg instruction. llvm-svn: 118627	2010-11-09 20:17:38 +00:00
Dan Gohman	35814e6128	Use the AliasAnalysis interface to determine how a Function accesses memory. This isn't a real improvement with present day AliasAnalysis implementations; it's mainly for consistency. llvm-svn: 118624	2010-11-09 20:13:27 +00:00
Dan Gohman	de52155685	Teach FunctionAttrs about AccessesArgumentsReadonly. llvm-svn: 118617	2010-11-09 19:56:27 +00:00
Dan Gohman	470ade12e0	Fix a thinko that Duncan spotted. llvm-svn: 118430	2010-11-08 19:24:47 +00:00
Dan Gohman	2cd1fd4a82	Make FunctionAttrs TBAA-aware. llvm-svn: 118417	2010-11-08 17:12:04 +00:00
Dan Gohman	9130bad71f	Extend the AliasAnalysis::pointsToConstantMemory interface to allow it to optionally look for constant or local (alloca) memory. Teach BasicAliasAnalysis::pointsToConstantMemory to look through Select and Phi nodes, and to support looking for local memory. Remove FunctionAttrs' PointsToLocalOrConstantMemory function, now that AliasAnalysis knows all the tricks that it knew. llvm-svn: 118412	2010-11-08 16:45:26 +00:00
Dan Gohman	86449d705a	Make FunctionAttrs use AliasAnalysis::getModRefBehavior, now that it knows about intrinsic functions. llvm-svn: 118410	2010-11-08 16:10:15 +00:00
Duncan Sands	9d1fe4c40d	Rename PointsToLocalMemory to PointsToLocalOrConstantMemory to make the code more self-documenting. llvm-svn: 118171	2010-11-03 14:45:05 +00:00
Jakob Stoklund Olesen	31a7eb40c1	Let the -inline-threshold command line argument take precedence over the threshold given to createFunctionInliningPass(). Both opt -O3 and clang would silently ignore the -inline-threshold option. llvm-svn: 118117	2010-11-02 23:40:26 +00:00


1830 Commits