llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	49d728ad21	[LCG] Redesign the lazy post-order iteration mechanism for the LazyCallGraph to support repeated, stable iterations, even in the face of graph updates. This is particularly important to allow the CGSCC pass manager to walk the RefSCCs (and thus everything else) in a module more than once. Lots of unittests and other tests were hard or impossible to write because repeated CGSCC pass managers which didn't invalidate the LazyCallGraph would conclude the module was empty after the first one. =[ Really, really bad. The interesting thing is that in many ways this simplifies the code. We can now re-use the same code for handling reference edge insertion updates of the RefSCC graph as we use for handling call edge insertion updates of the SCC graph. Outside of adapting to the shared logic for this (which isn't trivial, but is much simpler than the DFS it replaces!), the new code involves putting newly created RefSCCs when deleting a reference edge into the cached list in the correct way, and to re-formulate the iterator to be stable and effective even in the face of these kinds of updates. I've updated the unittests for the LazyCallGraph to re-iterate the postorder sequence and verify that this all works. We even check for using alternating iterators to trigger the lazy formation of RefSCCs after mutation has occured. It's worth noting that there are a reasonable number of likely simplifications we can make past this. It isn't clear that we need to keep the "LeafRefSCCs" around any more. But I've not removed that mostly because I want this to be a more isolated change. Differential Revision: https://reviews.llvm.org/D24219 llvm-svn: 281716	2016-09-16 10:20:17 +00:00
Chandler Carruth	11b3f60cd9	[LCG] Clean up and make NDEBUG verify calls more rigorous with make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619	2016-09-04 08:34:31 +00:00
Chandler Carruth	1f621f0a70	[LCG] A NFC refactoring to extract the logic for doing a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618	2016-09-04 08:34:24 +00:00
Chandler Carruth	8882346842	[PM] Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing some mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a great name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the particular nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the highest entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks EVERYTHING. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a persistent set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a new, bottom SCC/RefSCC. - We also need the ability to skip SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip nodes from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then directly update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to invalidate the analyses associated with it. - We also need to invalidate analyses after we finish processing an SCC. But the analyses we need to invalidate here are only those for the newly updated SCC!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618	2016-08-24 09:37:14 +00:00
David Majnemer	42531260b3	Use the range variant of find/find_if instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278469	2016-08-12 03:55:06 +00:00
David Majnemer	0d955d0bf5	Use the range variant of find instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433	2016-08-11 22:21:41 +00:00
David Majnemer	0a16c22846	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Chandler Carruth	168800c97d	[LCG] Hoist the definitions of the stream operator friends to be inline friend definitions. Based on the experiments Sean Silva and Reid did, this seems the safest course of action and also will work around a questionable warning provided by GCC6 on the old form of the code. Thanks for Davide pointing out the issue and other suggesting ways to fix. llvm-svn: 274740	2016-07-07 07:52:07 +00:00
Chandler Carruth	dca834089a	[PM] Improve the debugging and logging facilities of the CGSCC bits of the new pass manager. This adds operator<< overloads for the various bits of the LazyCallGraph, dump methods for use from the debugger, and debug logging using them to the CGSCC pass manager. Having this was essential for debugging the call graph update patch, and I've extracted what I could from that patch here to minimize the delta. llvm-svn: 273961	2016-06-27 23:26:08 +00:00
Sean Silva	7cb30664fc	Add a super basic LazyCallGraph DOT printer. Access it through -passes=print-lcg-dot Let me know any suggestions for changing the rendering; I'm not particularly attached to what is implemented here. llvm-svn: 273082	2016-06-18 09:17:32 +00:00
Chandler Carruth	b47f8010a9	[PM] Make the AnalysisManager parameter to run methods a reference. This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219	2016-03-11 11:05:24 +00:00
Chandler Carruth	b4faf13c15	[PM] Implement the final conclusion as to how the analysis IDs should work in the face of the limitations of DLLs and templated static variables. This requires passes that use the AnalysisBase mixin provide a static variable themselves. So as to keep their APIs clean, I've made these private and befriended the CRTP base class (which is the common practice). I've added documentation to AnalysisBase for why this is necessary and at what point we can go back to the much simpler system. This is clearly a better pattern than the extern template as it caught numerous places where the template magic hadn't been applied and things were "just working" but would eventually have broken mysteriously. llvm-svn: 263216	2016-03-11 10:22:49 +00:00
NAKAMURA Takumi	df0cd72657	[PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix for clang. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262188	2016-02-28 17:17:00 +00:00
NAKAMURA Takumi	ca04a1f720	Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal tweaks." I'll rework soon. llvm-svn: 262186	2016-02-28 16:54:06 +00:00
NAKAMURA Takumi	de40e7437e	[PM] Appease mingw32's auto-import DLL build with minimal tweaks. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262185	2016-02-28 16:38:46 +00:00
Chandler Carruth	3a63435551	[PM] Introduce CRTP mixin base classes to help define passes and analyses in the new pass manager. These just handle really basic stuff: turning a type name into a string statically that is nice to print in logs, and getting a static unique ID for each analysis. Sadly, the format of passes in anonymous namespaces makes using their names in tests really annoying so I've customized the names of the no-op passes to keep tests sane to read. This is the first of a few simplifying refactorings for the new pass manager that should reduce boilerplate and confusion. llvm-svn: 262004	2016-02-26 11:44:45 +00:00
Chandler Carruth	e5944d97d8	[LCG] Construct an actual call graph with call-edge SCCs nested inside reference-edge SCCs. This essentially builds a more normal call graph as a subgraph of the "reference graph" that was the old model. This allows both to exist and the different use cases to use the aspect which addresses their needs. Specifically, the pass manager and other ordering constrained logic can use the reference graph to achieve conservative order of visit, while analyses reasoning about attributes and other properties derived from reachability can reason about the direct call graph. Note that this isn't necessarily complete: it doesn't model edges to declarations or indirect calls. Those can be found by scanning the instructions of the function if desirable, and in fact every user currently does this in order to handle things like calls to instrinsics. If useful, we could consider caching this information in the call graph to save the instruction scans, but currently that doesn't seem to be important. An important realization for why the representation chosen here works is that the call graph is a formal subset of the reference graph and thus both can live within the same data structure. All SCCs of the call graph are necessarily contained within an SCC of the reference graph, etc. The design is to build 'RefSCC's to model SCCs of the reference graph, and then within them more literal SCCs for the call graph. The formation of actual call edge SCCs is not done lazily, unlike reference edge 'RefSCC's. Instead, once a reference SCC is formed, it directly builds the call SCCs within it and stores them in a post-order sequence. This is used to provide a consistent platform for mutation and update of the graph. The post-order also allows for very efficient updates in common cases by bounding the number of nodes (and thus edges) considered. There is considerable common code that I'm still looking for the best way to factor out between the various DFS implementations here. So far, my attempts have made the code harder to read and understand despite reducing the duplication, which seems a poor tradeoff. I've not given up on figuring out the right way to do this, but I wanted to wait until I at least had the system working and tested to continue attempting to factor it differently. This also requires introducing several new algorithms in order to handle all of the incremental update scenarios for the more complex structure involving two edge colorings. I've tried to comment the algorithms sufficiently to make it clear how this is expected to work, but they may still need more extensive documentation. I know that there are some changes which are not strictly necessarily coupled here. The process of developing this started out with a very focused set of changes for the new structure of the graph and algorithms, but subsequent changes to bring the APIs and code into consistent and understandable patterns also ended up touching on other aspects. There was no good way to separate these out without causing massive merge conflicts. Ultimately, to a large degree this is a rewrite of most of the core algorithms in the LCG class and so I don't think it really matters much. Many thanks to the careful review by Sanjoy Das! Differential Revision: http://reviews.llvm.org/D16802 llvm-svn: 261040	2016-02-17 00:18:16 +00:00
Chandler Carruth	a4499e9f73	[LCG] Build an edge abstraction for the LazyCallGraph and use it to differentiate between indirect references to functions an direct calls. This doesn't do a whole lot yet other than change the print out produced by the analysis, but it lays the groundwork for a very major change I'm working on next: teaching the call graph to actually be a call graph, modeling both the indirect reference graph and the call graph simultaneously. More details on that in the next patch though. The rest of this is essentially a bunch of over-engineering that won't be interesting until the next patch. But this also isolates essentially all of the churn necessary to introduce the edge abstraction from the very important behavior change necessary in order to separately model the two graphs. So it should make review of the subsequent patch a bit easier at the cost of making this patch seem poorly motivated. ;] Differential Revision: http://reviews.llvm.org/D16038 llvm-svn: 259463	2016-02-02 03:57:13 +00:00
Chandler Carruth	9153911551	[lcg] Fix a few more formatting goofs found by clang-format. NFC. llvm-svn: 256480	2015-12-28 01:54:20 +00:00
Chandler Carruth	11f5032368	Revert r225854: [PM] Move the LazyCallGraph printing functionality to a print method. This was formulated on a bad idea, but sadly I didn't uncover how bad this was until I got further down the path. I had hoped that we could provide a low boilerplate way of printing analyses, but it just doesn't seem like this really fits the needs of the analyses. Not all analyses really want to do printing, and those that do don't all use the same interface. Instead, with the new pass manager let's just take advantage of the fact that creating an explicit printer pass like the LCG has is pretty low boilerplate already and rely on that for testing. llvm-svn: 225861	2015-01-14 00:27:45 +00:00
Chandler Carruth	76890d82c0	[PM] Move the LazyCallGraph printing functionality to a print method. I'm adding generic analysis printing utility pass support which will require such a method (or a specialization) so this will let the existing printing logic satisfy that. llvm-svn: 225854	2015-01-13 23:53:50 +00:00
Chandler Carruth	d174ce4ad1	[PM] Switch the new pass manager to use a reference-based API for IR units. This was debated back and forth a bunch, but using references is now clearly cleaner. Of all the code written using pointers thus far, in only one place did it really make more sense to have a pointer. In most cases, this just removes immediate dereferencing from the code. I think it is much better to get errors on null IR units earlier, potentially at compile time, than to delay it. Most notably, the legacy pass manager uses references for its routines and so as more and more code works with both, the use of pointers was likely to become really annoying. I noticed this when I ported the domtree analysis over and wrote the entire thing with references only to have it fail to compile. =/ It seemed better to switch now than to delay. We can, of course, revisit this is we learn that references are really problematic in the API. llvm-svn: 225145	2015-01-05 02:47:05 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Alp Toker	beaca19c7c	Fix typos llvm-svn: 208839	2014-05-15 01:52:21 +00:00
Chandler Carruth	312dddfb81	[LCG] Add the last (and most complex) of the edge insertion mutation operations on the call graph. This one forms a cycle, and while not as complex as removing an internal edge from an SCC, it involves a reasonable amount of work to find all of the nodes newly connected in a cycle. Also somewhat alarming is the worst case complexity here: it might have to walk roughly the entire SCC inverse DAG to insert a single edge. This is carefully documented in the API (I hope). llvm-svn: 207935	2014-05-04 09:38:32 +00:00
Chandler Carruth	7cc4ed8202	[LCG] Add the other simple edge insertion API to the call graph. This just connects an SCC to one of its descendants directly. Not much of an impact. The last one is the hard one -- connecting an SCC to one of its ancestors, and thereby forming a cycle such that we have to merge all the SCCs participating in the cycle. llvm-svn: 207751	2014-05-01 12:18:20 +00:00
Chandler Carruth	034d0d6805	[LCG] Don't lookup the child SCC twice. Spotted this by inspection, and no functionality changed. llvm-svn: 207750	2014-05-01 12:16:31 +00:00
Chandler Carruth	4b096741b4	[LCG] Add some basic methods for querying the parent/child relationships of SCCs in the SCC DAG. Exercise them in the big graph test case. These will be especially useful for establishing invariants in insertion logic. llvm-svn: 207749	2014-05-01 12:12:42 +00:00
Chandler Carruth	5217c94522	[LCG] Add the really, really boring edge insertion case: adding an edge entirely within an existing SCC. Shockingly, making the connected component more connected is ... a total snooze fest. =] Anyways, its wired up, and I even added a test case to make sure it pretty much sorta works. =D llvm-svn: 207631	2014-04-30 10:48:36 +00:00
Chandler Carruth	c5026b670e	[LCG] Actually test the basic edge removal bits (IE, the non-SCC bits), and discover that it's totally broken. Yay tests. Boo bug. Fix the basic edge removal so that it works by nulling out the removed edges rather than actually removing them. This leaves the indices valid in the map from callee to index, and preserves some of the locality for iterating over edges. The iterator is made bidirectional to reflect that it now has to skip over null entries, and the skipping logic is layered onto it. As future work, I would like to track essentially the "load factor" of the edge list, and when it falls below a threshold do a compaction. An alternative I considered (and continue to consider) is storing the callees in a doubly linked list where each element of the list is in a set (which is essentially the classical linked-hash-table datastructure). The problem with that approach is that either you need to heap allocate the linked list nodes and use pointers to them, or use a bucket hash table (with even more linked list pointer overhead!), etc. It's pretty easy to get 5x overhead for values that are just pointers. So far, I think punching holes in the vector, and periodic compaction is likely to be much more efficient overall in the space/time tradeoff. llvm-svn: 207619	2014-04-30 07:45:27 +00:00
Chandler Carruth	c00a7ff4b7	[LCG] Add the most basic of edge insertion to the lazy call graph. This just handles the pre-DFS case. Also add some test cases for this case to make sure it works. llvm-svn: 207411	2014-04-28 11:10:23 +00:00
Chandler Carruth	3f5f5fe164	[LCG] Make the return of the IntraSCC removal method actually match its contract (and be much more useful). It now provides exactly the post-order traversal a caller might need to perform on newly formed SCCs. llvm-svn: 207410	2014-04-28 10:49:06 +00:00
Chandler Carruth	aa839b22c9	[LCG] Re-organize the methods for mutating a call graph to make their API requirements much more obvious. The key here is that there are two totally different use cases for mutating the graph. Prior to doing any SCC formation, it is very easy to mutate the graph. There may be users that want to do small tweaks here, and then use the already-built graph for their SCC-based operations. This method remains on the graph itself and is documented carefully as being cheap but unavailable once SCCs are formed. Once SCCs are formed, and there is some in-flight DFS building them, we have to be much more careful in how we mutate the graph. These mutation operations are sunk onto the SCCs themselves, which both simplifies things (the code was already there!) and helps make it obvious that these interfaces are only applicable within that context. The other primary constraint is that the edge being mutated is actually related to the SCC on which we call the method. This helps make it obvious that you cannot arbitrarily mutate some other SCC. I've tried to write much more complete documentation for the interesting mutation API -- intra-SCC edge removal. Currently one aspect of this documentation is a lie (the result list of SCCs) but we also don't even have tests for that API. =[ I'm going to add tests and fix it to match the documentation next. llvm-svn: 207339	2014-04-27 01:59:50 +00:00
Chandler Carruth	90821c2a93	[LCG] Rather than removing nodes from the SCC entry set when we process them, just skip over any DFS-numbered nodes when finding the next root of a DFS. This allows the entry set to just be a vector as we populate it from a uniqued source. It also removes the possibility for a linear scan of the entry set to actually do the removal which can make things go quadratic if we get unlucky. llvm-svn: 207312	2014-04-26 09:45:55 +00:00
Chandler Carruth	5e2d70b9a3	[LCG] Rotate the full SCC finding algorithm to avoid round-trips through the DFS stack for leaves in the call graph. As mentioned in my previous commit, this is particularly interesting for graphs which have high fan out but low connectivity resulting in many leaves. For such graphs, this can remove a large % of the DFS stack traffic even though it doesn't make the stack much smaller. It's a bit easier to formulate this for the full algorithm because that one stops completely for each SCC. For example, I was able to directly eliminate the "Recurse" boolean used to continue an outer loop from the inner loop. llvm-svn: 207311	2014-04-26 09:28:00 +00:00
Chandler Carruth	aca48d0443	[LCG] Hoist the main DFS loop out of the edge removal function. This makes working through the worklist much cleaner, and makes it possible to avoid the 'bool-to-continue-the-outer-loop' hack. Not a huge difference, but I think this is approaching as polished as I can make it. llvm-svn: 207310	2014-04-26 09:06:53 +00:00
Chandler Carruth	680af7a78c	[LCG] In the incremental SCC re-formation, lift the node currently being processed in the DFS out of the stack completely. Keep it exclusively in a variable. Re-shuffle some code structure to make this easier. This can have a very dramatic effect in some cases because call graphs tend to look like a high fan-out spanning tree. As a consequence, there are a large number of leaf nodes in the graph, and this technique causes leaf nodes to never even go into the stack. While this only reduces the max depth by 1, it may cause the total number of round trips through the stack to drop by a lot. Now, most of this isn't really relevant for the incremental version. =] But I wanted to prototype it first here as this variant is in ways more complex. As long as I can get the code factored well here, I'll next make the primary walk look the same. There are several refactorings this exposes I think. llvm-svn: 207306	2014-04-26 03:36:42 +00:00
Chandler Carruth	a7205b6154	[LCG] Special case the removal of self edges. These don't impact the SCC graph in any way because we don't track edges in the SCC graph, just nodes. This also lets us add a nice assert about the invariant that we're working on at least a certain number of nodes within the SCC. llvm-svn: 207305	2014-04-26 03:36:37 +00:00
Chandler Carruth	8f92d6db22	[LCG] Refactor the duplicated code I added in my last commit here into a helper function. Also factor the other two places where we did the same thing into the helper function. =] Much cleaner this way. NFC. llvm-svn: 207300	2014-04-26 01:03:46 +00:00
Chandler Carruth	9ba7762d7f	[LCG] During the incremental update of an SCC, switch to using the SCCMap to test for nodes that have been re-added to the root SCC rather than a set vector. We already have done the SCCMap lookup, we juts need to test it in two different ways. In turn, do most of the processing of these nodes as they go into the root SCC rather than lazily. This simplifies the final loop to just stitch the root SCC into its children's parent sets. No functionlatiy changed. However, this makes a few things painfully obvious, which was my intent. =] There is tons of repeated code introduced here and elsewhere. I'm splitting the refactoring of that code into helpers from this change so its clear that this is the change which switches the datastructures used around, and the other is a pure factoring & deduplication of code change. llvm-svn: 207217	2014-04-25 09:52:44 +00:00
Chandler Carruth	2e6ef0e80f	[LCG] During the incremental re-build of an SCC after removing an edge, remove the nodes in the SCC from the SCC map entirely prior to the DFS walk. This allows the SCC map to represent both the state of not-yet-re-added-to-an-SCC and added-back-to-this-SCC independently. The first is being missing from the SCC map, the second is mapping back to 'this'. In a subsequent commit, I'm going to use this property to simplify the new node list for this SCC. In theory, I think this also makes the contract for orphaning a node from the graph slightly less confusing. Now it is also orphaned from the SCC graph. Still, this isn't quite right either, and so I'm not adding test cases here. I'll add test cases for the behavior of orphaning nodes when the code actually supports it. The change here is mostly incidental, my goal is simplifying the algorithm. llvm-svn: 207213	2014-04-25 09:08:10 +00:00
Chandler Carruth	770060ddfa	[LCG] Rather than doing a linear time SmallSetVector removal of each child from the worklist, wait until we actually need to pop another element off of the worklist and skip over any that were already visited by the DFS. This also enables swapping the nodes of the SCC into the worklist. No functionality changed. llvm-svn: 207212	2014-04-25 09:08:05 +00:00
Chandler Carruth	6b88e3a545	[LCG] Remove a completely unnecessary loop. It wasn't even doing any thing, just mucking up the code. I feel bad that I even wrote this loop. Very sorry. The diff is huge because of the indent change, but I promise all this is doing is realizing that the outer two loops were actually the exact same loops, and we didn't need two of them. llvm-svn: 207202	2014-04-25 06:45:06 +00:00
Chandler Carruth	774c9320c0	[LCG] Now that the loop structure of the core SCC finding routine is factored into a more reasonable form, replace the tail call with a simple outer-loop continuation. It's sad that C++ makes this so awkward to write, but it seems more direct and clear than the tail call at this point. llvm-svn: 207201	2014-04-25 06:38:58 +00:00
Chandler Carruth	91dcf0f977	[LCG] Switch a weird do/while loop that actually couldn't fail its condition into an obviously infinite loop with an assert about the degenerate condition. No functionality changed. llvm-svn: 207147	2014-04-24 21:19:30 +00:00
Chandler Carruth	24553934f8	[LCG] Incorporate the core trick of improvements on the naive Tarjan's algorithm here: http://dl.acm.org/citation.cfm?id=177301. The idea of isolating the roots has even more relevance when using the stack not just to implement the DFS but also to implement the recursive step. Because we use it for the recursive step, to isolate the roots we need to maintain two stacks: one for our recursive DFS walk, and another of the nodes that have been walked. The nice thing is that the latter will be half the size. It also fixes a complete hack where we scanned backwards over the stack to find the next potential-root to continue processing. Now that is always the top of the DFS stack. While this is a really nice improvement already (IMO) it further opens the door for two important simplifications: 1) De-duplicating some of the code across the two different walks. I've actually made the duplication a bit worse in some senses with this patch because the two are starting to converge. 2) Dramatically simplifying the loop structures of both walks. I wanted to do those separately as they'll be essentially just CFG restructuring. This patch on the other hand actually uses different datastructures to implement the algorithm itself. llvm-svn: 207098	2014-04-24 11:05:20 +00:00
Chandler Carruth	09751bf173	[LCG] Rotate logic applied to the top of the DFSStack to instead be applied prior to pushing a node onto the DFSStack. This is the first step toward avoiding the stack entirely for leaf nodes. It also simplifies things a bit and I think is pointing the way toward factoring some more of the shared logic out of the two implementations. It is also making it more obvious how to restructure the loops themselves to be a bit easier to read (although no different in terms of functionality). llvm-svn: 207095	2014-04-24 09:59:59 +00:00
Chandler Carruth	493e0a6ad0	[LCG] Switch the parent SCC tracking from a SmallSetVector to a SmallPtrSet. Currently, there is no need for stable iteration in this dimension, and I now thing there won't need to be going forward. If this is ever re-introduced in any form, it needs to not be a SetVector based solution because removal cannot be linear. There will be many SCCs with large numbers of parents. When encountering these, the incremental SCC update for intra-SCC edge removal was quadratic due to linear removal (kind of). I'm really hoping we can avoid having an ordering property here at all though... llvm-svn: 207091	2014-04-24 09:22:31 +00:00
Chandler Carruth	d52f8e0e4d	[LCG] We don't actually need a set in each SCC to track the nodes. We can use the node -> SCC mapping in the top-level graph to test this on the rare occasions we need it. llvm-svn: 207090	2014-04-24 08:55:36 +00:00
Chandler Carruth	6a4fee87bc	[LCG] Normalize the post-order SCC iterator to just iterate over the SCC values rather than having pointers in weird places. llvm-svn: 207053	2014-04-23 23:51:07 +00:00

1 2

75 Commits