llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Berlin	554dcd8c89	MemorySSA: Move to Analysis, from Transforms/Utils. It's used as Analysis, it has Analysis passes, and once NewGVN is made an Analysis, this removes the cross dependency from Analysis to Transform/Utils. NFC. llvm-svn: 299980	2017-04-11 20:06:36 +00:00
Rong Xu	48596b6f7a	[PGO] Memory intrinsic calls optimization based on profiled size This patch optimizes two memory intrinsic operations: memset and memcpy based on the profiled size of the operation. The high level transformation is like: mem_op(..., size) ==> switch (size) { case s1: mem_op(..., s1); goto merge_bb; case s2: mem_op(..., s2); goto merge_bb; ... default: mem_op(..., size); goto merge_bb; } merge_bb: Differential Revision: http://reviews.llvm.org/D28966 llvm-svn: 299446	2017-04-04 16:42:20 +00:00
Dehao Chen	cc75d2441d	Add call branch annotation for ICP promoted direct call in SamplePGO mode. Summary: SamplePGO uses branch_weight annotation to represent callsite hotness. When ICP promotes an indirect call to direct call, we need to make sure the direct call is annotated with branch_weight in SamplePGO mode, so that downstream function inliner can use hot callsite heuristic. Reviewers: davidxl, eraman, xur Reviewed By: davidxl, xur Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D30282 llvm-svn: 296028	2017-02-23 22:15:18 +00:00
Dehao Chen	7d230325ef	Increases full-unroll threshold. Summary: The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge): Performance: 403 0.11% 433 0.51% 445 0.48% 447 3.50% 453 1.49% 464 0.75% Code size: 403 0.56% 433 0.96% 445 2.16% 447 2.96% 453 0.94% 464 8.02% The compiler time overhead is similar with code size. Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc Reviewed By: hfinkel, chandlerc Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D28368 llvm-svn: 295538	2017-02-18 03:46:51 +00:00
Davide Italiano	513dfaa0a3	[PM] Hook up the instrumented PGO machinery in the new PM. Differential Revision: https://reviews.llvm.org/D29308 llvm-svn: 294955	2017-02-13 15:26:22 +00:00
Chandler Carruth	719ffe1a66	[PM] Add devirtualization-based iteration utility into the new PM's default pipeline. A clang with this patch built with ASan and asserts can build all of the test-suite as well, so it seems to not uncover any latent problems. Differential Revision: https://reviews.llvm.org/D29853 llvm-svn: 294888	2017-02-12 05:38:04 +00:00
Chandler Carruth	e87fc8cb71	[PM] Enable GlobalsAA in the new PM's pipeline by default. All the invalidation issues and bugs in this seem to be fixed, it has survived a full build of the test suite plus SPEC with asserts and ASan enabled on the Clang binary used. Differential Revision: https://reviews.llvm.org/D29815 llvm-svn: 294887	2017-02-12 05:34:04 +00:00
Chandler Carruth	0ede22e1c0	[PM] Add Argument Promotion to the pass pipeline. This needs explicit requires of the optimization remark emission before loop pass pipelines containing LICM as we no longer get it from the inliner -- Argument Promotion may invalidate it. Technically the inliner could also have broken this, but it never came up in testing. Differential Revision: https://reviews.llvm.org/D29595 llvm-svn: 294670	2017-02-09 23:54:57 +00:00
Chandler Carruth	addcda483e	[PM] Port ArgumentPromotion to the new pass manager. Now that the call graph supports efficient replacement of a function and spurious reference edges, we can port ArgumentPromotion to the new pass manager very easily. The old PM-specific bits are sunk into callbacks that the new PM simply doesn't use. Unlike the old PM, the new PM simply does argument promotion and afterward does the update to LCG reflecting the promoted function. Differential Revision: https://reviews.llvm.org/D29580 llvm-svn: 294667	2017-02-09 23:46:27 +00:00
Peter Collingbourne	857aba4410	Rename LowerTypeTestsSummaryAction to PassSummaryAction. NFCI. I intend to use the same type with the same semantics in the WholeProgramDevirt pass. Differential Revision: https://reviews.llvm.org/D29746 llvm-svn: 294629	2017-02-09 21:45:01 +00:00
Daniel Berlin	439042b7ad	Add PredicateInfo utility and printing pass Summary: This patch adds a utility to build extended SSA (see "ABCD: eliminating array bounds checks on demand"), and an intrinsic to support it. This is then used to get functionality equivalent to propagateEquality in GVN, in NewGVN (without having to replace instructions as we go). It would work similarly in SCCP or other passes. This has been talked about a few times, so i built a real implementation and tried to productionize it. Copies are inserted for operands used in assumes and conditional branches that are based on comparisons (see below for more) Every use affected by the predicate is renamed to the appropriate intrinsic result. E.g. %cmp = icmp eq i32 %x, 50 br i1 %cmp, label %true, label %false true: ret i32 %x false: ret i32 1 will become %cmp = icmp eq i32, %x, 50 br i1 %cmp, label %true, label %false true: ; Has predicate info ; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 } %x.0 = call @llvm.ssa_copy.i32(i32 %x) ret i32 %x.0 false: ret i23 1 (you can use -print-predicateinfo to get an annotated-with-predicateinfo dump) This enables us to easily determine what operations are affected by a given predicate, and how operations affected by a chain of predicates. Reviewers: davide, sanjoy Subscribers: mgorny, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D29519 Update for review comments Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong Update for review comments llvm-svn: 294351	2017-02-07 21:10:46 +00:00
Chandler Carruth	baabda9317	[PM] Port LoopLoadElimination to the new pass manager and wire it into the main pipeline. This is a very straight forward port. Nothing weird or surprising. This brings the number of missing passes from the new PM's pipeline down to three. llvm-svn: 293249	2017-01-27 01:32:26 +00:00
Chandler Carruth	a95ff38924	[PM] Flesh out almost all of the late loop passes. With this the per-module pass pipeline is extremely close to the legacy PM. The missing pieces are: - PruneEH (or some equivalent) - ArgumentPromotion - LoopLoadElimination - LoopUnswitch I'm going to work through those in essentially that order but this seems like a worthwhile incremental step toward the end state. One difference in what I have here from the legacy PM is that I've consolidated some of the per-function passes at the very end of the pipeline into the main optimization function pipeline. The intervening passes are really uninteresting and so this seems very likely to have any effect other than minor improvement to locality. Note that there are still some failures in the test suite, but the compiler doesn't crash or assert. Differential Revision: https://reviews.llvm.org/D29114 llvm-svn: 293241	2017-01-27 00:50:21 +00:00
Chandler Carruth	79b733bc6b	[PM] Enable the main loop pass pipelines with everything but loop-unswitch in the main pipelines for the new PM. All of these now work, and Clang built using this pipeline can build the test suite and SPEC without hitting any asserts of ASan failures. There are still some bugs hiding though -- 7 tests regress with the new PM. I'm going to be investigating these, but it seems worthwhile to at least get the pipelines in place so that others can play with them, and they aren't completely broken. Differential Revision: https://reviews.llvm.org/D29113 llvm-svn: 293225	2017-01-26 23:21:17 +00:00
Chandler Carruth	eab3b90a14	[PM] Simplify the new PM interface to the loop unroller and expose two factory functions for the two modes the loop unroller is actually used in in-tree: simplified full-unrolling and the entire thing including partial unrolling. I've also wired these up to nice names so you can express both of these being in a pipeline easily. This is a precursor to actually enabling these parts of the O2 pipeline. Differential Revision: https://reviews.llvm.org/D28897 llvm-svn: 293136	2017-01-26 02:13:50 +00:00
Artur Pilipenko	8fb3d57e67	[Guards] Introduce loop-predication pass This patch introduces guard based loop predication optimization. The new LoopPredication pass tries to convert loop variant range checks to loop invariant by widening checks across loop iterations. For example, it will convert for (i = 0; i < n; i++) { guard(i < len); ... } to for (i = 0; i < n; i++) { guard(n - 1 < len); ... } After this transformation the condition of the guard is loop invariant, so loop-unswitch can later unswitch the loop by this condition which basically predicates the loop by the widened condition: if (n - 1 < len) for (i = 0; i < n; i++) { ... } else deoptimize This patch relies on an NFC change to make ScalarEvolution::isMonotonicPredicate public (revision 293062). Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D29034 llvm-svn: 293064	2017-01-25 16:00:44 +00:00
Davide Italiano	089a912365	[PM] Flesh out the new pass manager LTO pipeline. Differential Revision: https://reviews.llvm.org/D28996 llvm-svn: 292863	2017-01-24 00:57:39 +00:00
Chandler Carruth	e9b18e3d34	[PM] Port LoopSink to the new pass manager. Like several other loop passes (the vectorizer, etc) this pass doesn't really fit the model of a loop pass. The critical distinction is that it isn't intended to be pipelined together with other loop passes. I plan to add some documentation to the loop pass manager to make this more clear on that side. LoopSink is also different because it doesn't really need a lot of the infrastructure of our loop passes. For example, if there aren't loop invariant instructions causing a preheader to exist, there is no need to form a preheader. It also doesn't need LCSSA because this pass is only involved in sinking invariant instructions from a preheader into the loop, not reasoning about live-outs. This allows some nice simplifications to the pass in the new PM where we can directly walk the loops once without restructuring them. Differential Revision: https://reviews.llvm.org/D28921 llvm-svn: 292589	2017-01-20 08:42:19 +00:00
Michael Kuperstein	8ecc38ef85	[PM] Add LoopVectorize to the default module pipeline LV no longer "requires" LCSSA and LoopSimplify, and instead forms them internally as required. So, there's nothing preventing it from being enabled. llvm-svn: 292464	2017-01-19 02:21:54 +00:00
Chandler Carruth	3bab7e1a79	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move the latter to the Transforms library. While the loop PM uses an analysis to form the IR units, the current plan is to have the PM itself establish and enforce both loop simplified form and LCSSA. This would be a layering violation in the analysis library. Fundamentally, the idea behind the loop PM is to transform loops in addition to running passes over them, so it really seemed like the most natural place to sink this was into the transforms library. We can't just move everything because we also have loop analyses that rely on a subset of the invariants. So this patch splits the the loop infrastructure into the analysis management that has to be part of the analysis library, and the transform-aware pass manager. This also required splitting the loop analyses' printer passes out to the transforms library, which makes sense to me as running these will transform the code into LCSSA in theory. I haven't split the unittest though because testing one component without the other seems nearly intractable. Differential Revision: https://reviews.llvm.org/D28452 llvm-svn: 291662	2017-01-11 09:43:56 +00:00
Chandler Carruth	410eaeb064	[PM] Rewrite the loop pass manager to use a worklist and augmented run arguments much like the CGSCC pass manager. This is a major redesign following the pattern establish for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unweildy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 llvm-svn: 291651	2017-01-11 06:23:21 +00:00
Chandler Carruth	05ca5acc9e	[PM] Introduce a devirtualization iteration layer for the new PM. This is an orthogonal and separated layer instead of being embedded inside the pass manager. While it adds a small amount of complexity, it is fairly minimal and the composability and control seems worth the cost. The logic for this ends up being nicely isolated and targeted. It should be easy to experiment with different iteration strategies wrapped around the CGSCC bottom-up walk using this kind of facility. The mechanism used to track devirtualization is the simplest one I came up with. I think it handles most of the cases the existing iteration machinery handles, but I haven't done a very in depth analysis. It does however match the basic intended semantics, and we can tweak or tune its exact behavior incrementally as necessary. One thing that we may want to revisit is freshly building the value handle set on each iteration. While I don't think this will be a significant cost (it is strictly fewer value handles but more churn of value handes than the old call graph), it is conceivable that we'll want a somewhat more clever tracking mechanism. My hope is to layer that on as a follow up patch with data supporting any implementation complexity it adds. This code also provides for a basic count heuristic: if the number of indirect calls decreases and the number of direct calls increases for a given function in the SCC, we assume devirtualization is responsible. This matches the heuristics currently used in the legacy pass manager. Differential Revision: https://reviews.llvm.org/D23114 llvm-svn: 290665	2016-12-28 11:07:33 +00:00
Chandler Carruth	e635289ee2	[PM] Disable the loop vectorizer from the new PM's pipeline as it currenty relies on the old PM's dependency system forming LCSSA. The new PM will require a different design for this, and for now this is causing most of the issues I'm currently seeing in testing. I'd like to get to a testable baseline and then work on re-enabling things one at a time. llvm-svn: 290644	2016-12-28 02:24:55 +00:00
Chandler Carruth	81c8edaf5c	[PM] Disable more of the loop passes -- LCSSA and LoopSimplify are also not really wired into the loop pass manager in a way that will let us productively use these passes yet. This lets the new PM get farther in basic testing which is useful for establishing a good baseline of "doesn't explode". There are still plenty of crashers in basic testing though, this just gets rid of some noise that is well understood and not representing a specific or narrow bug. llvm-svn: 290601	2016-12-27 10:16:46 +00:00
Chandler Carruth	534d644b86	[PM] Try to improve the comments here to make what's going on more clear. Based on post-commit review suggestion from Sean. (Thanks!) llvm-svn: 290488	2016-12-24 05:11:17 +00:00
Chandler Carruth	060ad61fbe	[PM] Add support for building a default AA pipeline to the PassBuilder. Pretty boring and lame as-is but necessary. This is definitely a place we'll end up with extension hooks longer term. =] Differential Revision: https://reviews.llvm.org/D28076 llvm-svn: 290449	2016-12-23 20:38:19 +00:00
Davide Italiano	e05e3306a3	[NewGVN] Add the pass to PassRegistry.def. We need to hook up here to get it working with the new PM. Add a test while here (and remove a typo). llvm-svn: 290350	2016-12-22 16:35:02 +00:00
Chandler Carruth	e3f5064b72	[PM] Introduce a reasonable port of the main per-module pass pipeline from the old pass manager in the new one. I'm not trying to support (initially) the numerous options that are currently available to customize the pass pipeline. If we end up really wanting them, we can add them later, but I suspect many are no longer interesting. The simplicity of omitting them will help a lot as we sort out what the pipeline should look like in the new PM. I've also documented to the best of my ability why each pass or group of passes is used so that reading the pipeline is more helpful. In many cases I think we have some questionable choices of ordering and I've left FIXME comments in place so we know what to come back and revisit going forward. But for now, I've left it as similar to the current pipeline as I could. Lastly, I've had to comment out several places where passes are not ported to the new pass manager or where the loop pass infrastructure is not yet ready. I did at least fix a few bugs in the loop pass infrastructure uncovered by running the full pipeline, but I didn't want to go too far in this patch -- I'll come back and re-enable these as the infrastructure comes online. But I'd like to keep the comments in place because I don't want to lose track of which passes need to be enabled and where they go. One thing that seemed like a significant API improvement was to require that we don't build pipelines for O0. It seems to have no real benefit. I've also switched back to returning pass managers by value as at this API layer it feels much more natural to me for composition. But if others disagree, I'm happy to go back to an output parameter. I'm not 100% happy with the testing strategy currently, but it seems at least OK. I may come back and try to refactor or otherwise improve this in subsequent patches but I wanted to at least get a good starting point in place. Differential Revision: https://reviews.llvm.org/D28042 llvm-svn: 290325	2016-12-22 06:59:15 +00:00
Chandler Carruth	1d96311447	[PM] Provide an initial, minimal port of the inliner to the new pass manager. This doesn't implement every feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these could be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the newly inlined calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161	2016-12-20 03:15:32 +00:00
Daniel Jasper	aec2fa352f	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Hal Finkel	3ca4a6bcf1	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Chandler Carruth	6b9816477b	[PM] Support invalidation of inner analysis managers from a pass over the outer IR unit. Summary: This never really got implemented, and was very hard to test before a lot of the refactoring changes to make things more robust. But now we can test it thoroughly and cleanly, especially at the CGSCC level. The core idea is that when an inner analysis manager proxy receives the invalidation event for the outer IR unit, it needs to walk the inner IR units and propagate it to the inner analysis manager for each of those units. For example, each function in the SCC needs to get an invalidation event when the SCC gets one. The function / module interaction is somewhat boring here. This really becomes interesting in the face of analysis-backed IR units. This patch effectively handles all of the CGSCC layer's needs -- both invalidating SCC analysis and invalidating function analysis when an SCC gets invalidated. However, this second aspect doesn't really handle the LoopAnalysisManager well at this point. That one will need some change of design in order to fully integrate, because unlike the call graph, the entire function behind a LoopAnalysis's results can vanish out from under us, and we won't even have a cached API to access. I'd like to try to separate solving the loop problems into a subsequent patch though in order to keep this more focused so I've adapted them to the API and updated the tests that immediately fail, but I've not added the level of testing and validation at that layer that I have at the CGSCC layer. An important aspect of this change is that the proxy for the FunctionAnalysisManager at the SCC pass layer doesn't work like the other proxies for an inner IR unit as it doesn't directly manage the FunctionAnalysisManager and invalidation or clearing of it. This would create an ever worsening problem of dual ownership of this responsibility, split between the module-level FAM proxy and this SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy to work in terms of the module-level proxy and defer to it to handle much of the updates. It only does SCC-specific invalidation. This will become more important in subsequent patches that support more complex invalidaiton scenarios. Reviewers: jlebar Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D27197 llvm-svn: 289317	2016-12-10 06:34:44 +00:00
Chandler Carruth	dab4eae274	[PM] Change the static object whose address is used to uniquely identify analyses to have a common type which is enforced rather than using a char object and a `void ` type when used as an identifier. This has a number of advantages. First, it at least helps some of the confusion raised in Justin Lebar's code review of why `void ` was being used everywhere by having a stronger type that connects to documentation about this. However, perhaps more importantly, it addresses a serious issue where the alignment of these pointer-like identifiers was unknown. This made it hard to use them in pointer-like data structures. We were already dodging this in dangerous ways to create the "all analyses" entry. In a subsequent patch I attempted to use these with TinyPtrVector and things fell apart in a very bad way. And it isn't just a compile time or type system issue. Worse than that, the actual alignment of these pointer-like opaque identifiers wasn't guaranteed to be a useful alignment as they were just characters. This change introduces a type to use as the "key" object whose address forms the opaque identifier. This both forces the objects to have proper alignment, and provides type checking that we get it right everywhere. It also makes the types somewhat less mysterious than `void `. We could go one step further and introduce a truly opaque pointer-like type to return from the `ID()` static function rather than returning `AnalysisKey `, but that didn't seem to be a clear win so this is just the initial change to get to a reliably typed and aligned object serving is a key for all the analyses. Thanks to Richard Smith and Justin Lebar for helping pick plausible names and avoid making this refactoring many times. =] And thanks to Sean for the super fast review! While here, I've tried to move away from the "PassID" nomenclature entirely as it wasn't really helping and is overloaded with old pass manager constructs. Now we have IDs for analyses, and key objects whose address can be used as IDs. Where possible and clear I've shortened this to just "ID". In a few places I kept "AnalysisID" to make it clear what was being identified. Differential Revision: https://reviews.llvm.org/D27031 llvm-svn: 287783	2016-11-23 17:53:26 +00:00
Davide Italiano	2ae76dd239	[GlobalSplit] Port to the new pass manager. llvm-svn: 287511	2016-11-21 00:28:23 +00:00
Chris Bieneman	05c279fc4b	[CMake] NFC. Updating CMake dependency specifications This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system. llvm-svn: 287206	2016-11-17 04:36:50 +00:00
Rong Xu	1c0e9b97d2	Conditionally eliminate library calls where the result value is not used Summary: This pass shrink-wraps a condition to some library calls where the call result is not used. For example: sqrt(val); is transformed to if (val < 0) sqrt(val); Even if the result of library call is not being used, the compiler cannot safely delete the call because the function can set errno on error conditions. Note in many functions, the error condition solely depends on the incoming parameter. In this optimization, we can generate the condition can lead to the errno to shrink-wrap the call. Since the chances of hitting the error condition is low, the runtime call is effectively eliminated. These partially dead calls are usually results of C++ abstraction penalty exposed by inlining. This optimization hits 108 times in 19 C/C++ programs in SPEC2006. Reviewers: hfinkel, mehdi_amini, davidxl Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz Differential Revision: https://reviews.llvm.org/D24414 llvm-svn: 284542	2016-10-18 21:36:27 +00:00
Mehdi Amini	55b06538b5	Fix test after renaming -name-anon-functions pass to -name-anon-globals llvm-svn: 281752	2016-09-16 17:18:16 +00:00
Mehdi Amini	27d2379b4e	Rename NameAnonFunctions to NameAnonGlobals to match what it is doing (NFC) llvm-svn: 281745	2016-09-16 16:56:30 +00:00
Sriraman Tallam	06a67ba57d	[PM] Port CFGViewer and CFGPrinter to the new Pass Manager Differential Revision: https://reviews.llvm.org/D24592 llvm-svn: 281640	2016-09-15 18:35:27 +00:00
Geoff Berry	8d84605f25	[EarlyCSE] Optionally use MemorySSA. NFC. Summary: Use MemorySSA, if requested, to do less conservative memory dependency checking. This change doesn't enable the MemorySSA enhanced EarlyCSE in the default pipelines, so should be NFC. Reviewers: dberlin, sanjoy, reames, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19821 llvm-svn: 280279	2016-08-31 19:24:10 +00:00
Chandler Carruth	8882346842	[PM] Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing some mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a great name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the particular nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the highest entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks EVERYTHING. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a persistent set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a new, bottom SCC/RefSCC. - We also need the ability to skip SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip nodes from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then directly update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to invalidate the analyses associated with it. - We also need to invalidate analyses after we finish processing an SCC. But the analyses we need to invalidate here are only those for the newly updated SCC!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618	2016-08-24 09:37:14 +00:00
Chandler Carruth	9b35e6d746	[PM] Re-instate r279227 and r279228 with a fix to the way the templating was done to hopefully appease MSVC. As an upside, this also implements the suggestion Sanjoy made in code review, so two for one! =] I'll be watching the bots to see if there are still issues. llvm-svn: 279295	2016-08-19 18:36:06 +00:00
Chandler Carruth	b8824a5d3f	[PM] Revert r279227 and r279228 until I can find someone to help me solve completely opaque MSVC build errors. It complains about lots of stuff with this change without givin nearly enough information to even try to fix. llvm-svn: 279231	2016-08-19 10:51:55 +00:00
Chandler Carruth	db1759ace1	[PM] Make the the new pass manager support fully generic extra arguments to run methods, both for transform passes and analysis passes. This also allows the analysis manager to use a different set of extra arguments from the pass manager where useful. Consider passes over analysis produced units of IR like SCCs of the call graph or loops. Passes of this nature will often want to refer to the analysis result that was used to compute their IR units (the call graph or LoopInfo). And for transformations, they may want to communicate special update information to the outer pass manager. With this change, it becomes possible to have a run method for a loop pass that looks more like: PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM, LoopInfo &LI, LoopUpdateRecord &UR); And to query the analysis manager like: AM.getResult<MyLoopAnalysis>(L, LI); This makes accessing the known-available analyses convenient and clear, and it makes passing customized data structures around easy. My initial use case is going to be in updating the pass manager layers when the analysis units of IR change. But there are more use cases here such as having a layer that lets inner passes signal whether certain additional passes should be run because of particular simplifications made. Two desires for this have come up in the past: triggering additional optimization after successfully unrolling loops, and triggering additional inlining after collapsing indirect calls to direct calls. Despite adding this layer of generic extensibility, the only change to existing, simple usage are for places where we forward declare the AnalysisManager template. We really shouldn't be doing this because of the fragility exposed here, but currently it makes coping with the legacy PM code easier. Differential Revision: http://reviews.llvm.org/D21462 llvm-svn: 279227	2016-08-19 09:45:16 +00:00
Chandler Carruth	67fc52f067	[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is alse ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solveable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense ta have this complexity at O0 or not. One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896	2016-08-17 02:56:20 +00:00
Teresa Johnson	1eca6bc6a7	[PM] Port LoopDataPrefetch to new pass manager Summary: Refactor the existing support into a LoopDataPrefetch implementation class and a LoopDataPrefetchLegacyPass class that invokes it. Add a new LoopDataPrefetchPass for the new pass manager that utilizes the LoopDataPrefetch implementation class. Reviewers: mehdi_amini Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23483 llvm-svn: 278591	2016-08-13 04:11:27 +00:00
Michael Kuperstein	31b8399beb	[PM] Port LowerInvoke to the new pass manager llvm-svn: 278531	2016-08-12 17:28:27 +00:00
Teresa Johnson	4223dd8559	[PM] Port NameAnonFunction pass to new pass manager Summary: Port the NameAnonFunction pass and add a test. Depends on D23439. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23440 llvm-svn: 278509	2016-08-12 14:03:36 +00:00
Teresa Johnson	f93b246f8b	[PM] Port ModuleSummaryIndex analysis to new pass manager Summary: Port the ModuleSummaryAnalysisWrapperPass to the new pass manager. Use it in the ported BitcodeWriterPass (similar to how we use the legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass). Also, pass the -module-summary opt flag through to the new pass manager pipeline and through to the bitcode writer pass, and add a test that uses it. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23439 llvm-svn: 278508	2016-08-12 13:53:02 +00:00
Sean Silva	5f6ec06f17	Consistently use CGSCCAnalysisManager Besides a general consistently benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278080	2016-08-09 00:28:56 +00:00

1 2 3 4 5

201 Commits