llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	b926bdac4c	Reapply r263460: [SpillPlacement] Fix a quadratic behavior in spill placement. Using Chandler's words from r265331: This commit was greatly exacerbating PR17409 and effectively regressed build time for lot of (very large) code when compiled with ASan or MSan. PR17409 is fixed by r269249, so this is fine to reapply r263460. Original commit message: The bad behavior happens when we have a function with a long linear chain of basic blocks, and have a live range spanning most of this chain, but with very few uses. Let say we have only 2 uses. The Hopfield network is only seeded with two active blocks where the uses are, and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two new nodes to the network due to the completely linear shape of the CFG. Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes, which adds up to a quadratic algorithm. This is an historical accident effect from r129188. When the Hopfield network is expanding, most of the action is happening on the frontier where new nodes are being added. The internal nodes in the network are not likely to be flip-flopping much, or they will at least settle down very quickly. This means that while `SpillPlacer->iterate()` is recomputing all the nodes in the network, it is probably only the two frontier nodes that are changing their output. Instead of recomputing the whole network on each iteration, we can maintain a SparseSet of nodes that need to be updated: - `SpillPlacement::activate()` adds the node to the todo list. - When a node changes value (i.e., `update()` returns true), its neighbors are added to the todo list. - `SpillPlacement::iterate()` only updates the nodes in the list. The result of Hopfield iterations is not necessarily exact. It should converge to a local minimum, but there is no guarantee that it will find a global minimum. It is possible that updating nodes in a different order will cause us to switch to a different local minimum. In other words, this is not NFC, but although I saw a few runtime improvements and regressions when I benchmarked this change, those were side effects and actually the performance change is in the noise as expected. Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks, guidance and time for the review. llvm-svn: 270149	2016-05-19 22:40:37 +00:00
Chandler Carruth	613eec8210	Revert r263460: [SpillPlacement] Fix a quadratic behavior in spill placement. That commit looks wonderful and awesome. Sadly, it greatly exacerbates PR17409 and effectively regresses build time for a lot of (very large) code when compiled with ASan or MSan. We thought this could be fixed forward by landing D15302 which at last fixes that PR, but some issues were discovered and it looks like that got reverted, so reverting this as well temporarily. As soon as the fix for PR17409 lands and sticks, we should re-land this patch as it won't trigger more significant test cases hitting that bug. Many thanks to Quentin and Wei here as they're doing all the awesome hard work!!! llvm-svn: 265331	2016-04-04 18:57:50 +00:00
Quentin Colombet	40ce25b68b	[SpillPlacement] Fix a quadratic behavior in spill placement. The bad behavior happens when we have a function with a long linear chain of basic blocks, and have a live range spanning most of this chain, but with very few uses. Let say we have only 2 uses. The Hopfield network is only seeded with two active blocks where the uses are, and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two new nodes to the network due to the completely linear shape of the CFG. Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes, which adds up to a quadratic algorithm. This is an historical accident effect from r129188. When the Hopfield network is expanding, most of the action is happening on the frontier where new nodes are being added. The internal nodes in the network are not likely to be flip-flopping much, or they will at least settle down very quickly. This means that while `SpillPlacer->iterate()` is recomputing all the nodes in the network, it is probably only the two frontier nodes that are changing their output. Instead of recomputing the whole network on each iteration, we can maintain a SparseSet of nodes that need to be updated: - `SpillPlacement::activate()` adds the node to the todo list. - When a node changes value (i.e., `update()` returns true), its neighbors are added to the todo list. - `SpillPlacement::iterate()` only updates the nodes in the list. The result of Hopfield iterations is not necessarily exact. It should converge to a local minimum, but there is no guarantee that it will find a global minimum. It is possible that updating nodes in a different order will cause us to switch to a different local minimum. In other words, this is not NFC, but although I saw a few runtime improvements and regressions when I benchmarked this change, those were side effects and actually the performance change is in the noise as expected. Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks, guidance and time for the review. llvm-svn: 263460	2016-03-14 18:21:25 +00:00
Duncan P. N. Exon Smith	f1ff53ecc2	CodeGen: Remove implicit ilist iterator conversions, NFC Finish removing implicit ilist iterator conversions from LLVMCodeGen. I'm sure there are lots more of these in lib/CodeGen/*/. llvm-svn: 249915	2015-10-09 22:56:24 +00:00
Matthias Braun	c804cdb912	TargetRegisterInfo: Introduce PrintLaneMask. This makes it more convenient to print lane masks and lead to more uniform printing. llvm-svn: 248624	2015-09-25 21:51:24 +00:00
Chandler Carruth	7425c8c279	Fix the threshold added in r186434 (a re-apply of r185393) and updaated to be a ManagedStatic in r218163 to not be a global variable written and read to from within the innards of SpillPlacement. This will fix a really scary race condition for anyone that has two copies of LLVM running spill placement concurrently. Yikes! This will also fix a really significant compile time hit that r218163 caused because the spill placement threshold read is actually in the very hot path of this code. The memory fence on each read was showing up as huge compile time regressions when spilling is responsible for most of the compile time. For example, optimizing sanitized code showed over 50% compile time regressions here. =/ llvm-svn: 218921	2014-10-02 22:23:14 +00:00
Chris Bieneman	1a98490ce5	Converting SpillPlacement's BlockFrequency threshold to a ManagedStatic to avoid static constructors and destructors. llvm-svn: 218163	2014-09-19 22:46:28 +00:00
Chandler Carruth	1b9dde087e	[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837	2014-04-22 02:02:50 +00:00
Craig Topper	c0196b1b40	[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. llvm-svn: 206142	2014-04-14 00:51:57 +00:00
Duncan P. N. Exon Smith	a5df813e51	RegAlloc: Account for a variable entry block frequency Until r197284, the entry frequency was constant -- i.e., set to 2^14. Although current ToT still has a constant entry frequency, since r197284 that has been an implementation detail (which is soon going to change). - r204690 made the wrong assumption for the CSRCost metric. Adjust callee-saved register cost based on entry frequency. - r185393 made the wrong assumption (although it was valid at the time). Update SpillPlacement.cpp::Threshold to be relative to the entry frequency. Since ToT still has 2^14 entry frequency, this should have no observable functionality change. <rdar://problem/14292693> llvm-svn: 205789	2014-04-08 19:18:56 +00:00
Benjamin Kramer	b6d0bd48bd	[C++11] Replace llvm::next and llvm::prior with std::next and std::prev. Remove the old functions. llvm-svn: 202636	2014-03-02 12:27:27 +00:00
Manman Ren	709c951b42	SpillPlacement: fix a bug in iterate. Inside iterate, we scan backwards then scan forwards in a loop. When iteration is not zero, the last node was just updated so we can skip it. But when iteration is zero, we can't skip the last node. For the testing case, fixing this will save a spill and move register copies from hot path to cold path. llvm-svn: 202557	2014-02-28 23:05:31 +00:00
Michael Gottesman	5e985ee5b5	[block-freq] Rename getEntryFrequency() -> getEntryFreq() to match getBlockFreq() in all BlockFrequencyInfo. llvm-svn: 197304	2013-12-14 02:37:38 +00:00
Michael Gottesman	092647b37a	[block-freq] Store MBFI as a field on SpillPlacement so we can access it to get the entry frequency while processing data. llvm-svn: 197291	2013-12-14 00:25:47 +00:00
Jakob Stoklund Olesen	c5454ff046	Reapply r185393. Original commit message: Remove floating point computations from SpillPlacement.cpp. Patch by Benjamin Kramer! Use the BlockFrequency class instead of floats in the Hopfield network computations. This rescales the node Bias field from a [-2;2] float range to two block frequencies BiasN and BiasP pulling in opposite directions. This construct has a more predictable behavior when block frequencies saturate. The per-node scaling factors are no longer necessary, assuming the block frequencies around a bundle are consistent. This patch can cause the register allocator to make different spilling decisions. The differences should be small. llvm-svn: 186434	2013-07-16 18:26:15 +00:00
Jakob Stoklund Olesen	13be6bfb60	Revert (most of) r185393 and r185395. "Remove floating point computations form SpillPlacement.cpp." These commits caused test failures in lencod on clang-native-arm-lnt. I suspect these changes are only exposing an existing issue, but reverting anyway to keep the bots passing while we investigate. llvm-svn: 185447	2013-07-02 17:31:58 +00:00
Jakob Stoklund Olesen	3b4e9ccfe3	Tweak some comments that referred to the old bias computations. llvm-svn: 185395	2013-07-01 23:36:37 +00:00
Jakob Stoklund Olesen	55daed27b7	Remove floating point computations form SpillPlacement.cpp. Patch by Benjamin Kramer! Use the BlockFrequency class instead of floats in the Hopfield network computations. This rescales the node Bias field from a [-2;2] float range to two block frequencies BiasN and BiasP pulling in opposite directions. This construct has a more predictable behavior when block frequencies saturate. The per-node scaling factors are no longer necessary, assuming the block frequencies around a bundle are consistent. This patch can cause the register allocator to make different spilling decisions. The differences should be small. llvm-svn: 185393	2013-07-01 23:19:39 +00:00
Benjamin Kramer	e2a1d89e14	Switch spill weights from a basic loop depth estimation to BlockFrequencyInfo. The main advantages here are way better heuristics, taking into account not just loop depth but also __builtin_expect and other static heuristics and will eventually learn how to use profile info. Most of the work in this patch is pushing the MachineBlockFrequencyInfo analysis into the right places. This is good for a 5% speedup on zlib's deflate (x86_64), there were some very unfortunate spilling decisions in its hottest loop in longest_match(). Other benchmarks I tried were mostly neutral. This changes register allocation in subtle ways, update the tests for it. 2012-02-20-MachineCPBug.ll was deleted as it's very fragile and the instruction it looked for was gone already (but the FileCheck pattern picked up unrelated stuff). llvm-svn: 184105	2013-06-17 19:00:36 +00:00
Jakub Staszak	b6970267ad	Move #include of BitVector from .h to .cpp file. Also remove unneeded #include and forward declaration. llvm-svn: 177357	2013-03-18 23:45:45 +00:00
Jakob Stoklund Olesen	29268b50f2	Give a small negative bias to giant edge bundles. This helps compile time when the greedy register allocator splits live ranges in giant functions. Without the bias, we would try to grow regions through the giant edge bundles, usually to find out that the region became too big and expensive. If a live range has many uses in blocks near the giant bundle, the small negative bias doesn't make a big difference, and we still consider regions including the giant edge bundle. Giant edge bundles are usually connected to landing pads or indirect branches. llvm-svn: 157174	2012-05-21 03:11:23 +00:00
Jakob Stoklund Olesen	869545203b	Be more conservative when forming compact regions. Apply twice the negative bias on transparent blocks when computing the compact regions. This excludes loop backedges from the region when only one of the loop blocks uses the register. Previously, we would include the backedge in the region if the loop preheader and the loop latch both used the register, but the loop header didn't. When both the header and latch blocks use the register, we still keep it live on the backedge. llvm-svn: 136832	2011-08-03 23:09:38 +00:00
Jakob Stoklund Olesen	d2a7d1ed97	Extend the SpillPlacement interface with two new features. The PrefBoth constraint is used for blocks that ideally want a live-in value both on the stack and in a register. This would be used by a block that has a use before interference forces a spill. Secondly, add the ChangesValue flag to BlockConstraint. This tells SpillPlacement if a live-in value on the stack can be reused as a live-out stack value for free. If the block redefines the virtual register, a spill would be required for that. This extra information will be used by SpillPlacement to more accurately calculate spill costs when a value can exist both on the stack and in a register. The simplest example is a basic block that reads the virtual register, but doesn't change its value. Spilling around such a block requires a reload, but no spill in the block. The spiller already knows this, but the spill placer doesn't. That can sometimes lead to suboptimal regions. llvm-svn: 136731	2011-08-02 21:53:03 +00:00
Jakob Stoklund Olesen	0ab5d0ee5b	Add a simple method for marking blocks with interference in and out. This method matches addLinks - All the listed blocks are considered to have interference, so they add a negative bias to their bundles. This could also be done by addConstraints, but that requires building a separate BlockConstraint array. llvm-svn: 135844	2011-07-23 03:10:19 +00:00
Jakob Stoklund Olesen	ed47ed4e80	Build the Hopfield network incrementally when splitting global live ranges. It is common for large live ranges to have few basic blocks with register uses and many live-through blocks without any uses. This approach grows the Hopfield network incrementally around the use blocks, completely avoiding checking interference for some through blocks. llvm-svn: 129188	2011-04-09 02:59:09 +00:00
Jakob Stoklund Olesen	1c0db0fd21	Prefer multiplications to divisions. llvm-svn: 129080	2011-04-07 17:27:48 +00:00
Jakob Stoklund Olesen	6d2bbc1c20	Extract SpillPlacement::addLinks for handling the special transparent blocks. llvm-svn: 129079	2011-04-07 17:27:46 +00:00
Jakob Stoklund Olesen	6895b87dfe	Keep track of the number of positively biased nodes when adding constraints. If there are no positive nodes, the algorithm can be aborted early. llvm-svn: 129021	2011-04-06 19:14:00 +00:00
Jakob Stoklund Olesen	36b5d8a698	Break the spill placement algorithm into three parts: prepare, addConstraints, and finish. This will allow us to abort the algorithm early if it is determined to be futile. llvm-svn: 129020	2011-04-06 19:13:57 +00:00
Jakob Stoklund Olesen	c332e727b4	Precompute block frequencies, pow() isn't free. llvm-svn: 126975	2011-03-04 00:58:40 +00:00
Jakob Stoklund Olesen	73e203e3d3	Trim debugging output. llvm-svn: 125802	2011-02-18 00:32:47 +00:00
Jakob Stoklund Olesen	2b855eb69c	Silence an MSVC warning llvm-svn: 124798	2011-02-03 17:04:12 +00:00
Jakob Stoklund Olesen	79be8aecba	Divert Hopfield network debug output. It is very noisy. llvm-svn: 123859	2011-01-19 23:14:59 +00:00
Jakob Stoklund Olesen	267f6c1ab2	Add RAGreedy methods for splitting live ranges around regions. Analyze the live range's behavior entering and leaving basic blocks. Compute an interference pattern for each allocation candidate, and use SpillPlacement to find an optimal region where that register can be live. This code is still not enabled. llvm-svn: 123774	2011-01-18 21:13:27 +00:00
Jakob Stoklund Olesen	8e236eac74	Add the SpillPlacement analysis pass. This pass precomputes CFG block frequency information that can be used by the register allocator to find optimal spill code placement. Given an interference pattern, placeSpills() will compute which basic blocks should have the current variable enter or exit in a register, and which blocks prefer the stack. The algorithm is ready to consume block frequencies from profiling data, but for now it gets by with the static estimates used for spill weights. This is a work in progress and still not hooked up to RegAllocGreedy. llvm-svn: 122938	2011-01-06 01:21:53 +00:00

35 Commits