llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	2835278ee0	[LoopIdiomRecognize] Support for converting loops that use LSHR to CTLZ. In the 'detectCTLZIdiom' function support for loops that use LSHR instruction instead of ASHR has been added. This supports creating ctlz from the following code. int lzcnt(int x) { int count = 0; while (x > 0) { count++; x = x >> 1; } return count; } Patch by Olga Moldovanova Differential Revision: https://reviews.llvm.org/D48354 llvm-svn: 336509	2018-07-08 01:45:47 +00:00
Chandler Carruth	d8b0c8ce1b	[PM/LoopUnswitch] Fix PR37889, producing the correct loop nest structure after trivial unswitching. This PR illustrates that a fundamental analysis update was not performed with the new loop unswitch. This update is also somewhat fundamental to the core idea of the new loop unswitch -- we actually update the CFG based on the unswitching. In order to do that, we need to update the loop nest in addition to the domtree. For some reason, when writing trivial unswitching, I thought that the loop nest structure cannot be changed by the transformation. But the PR helps illustrate that it clearly can. I've expanded this to a number of different test cases that try to cover the different cases of this. When we unswitch, we move an exit edge of a loop out of the loop. If this exit edge changes which loop reached by an exit is the innermost loop, it changes the parent of the loop. Essentially, this transformation may hoist the inner loop up the nest. I've added the simple logic to handle this reliably in the trivial unswitching case. This just requires updating LoopInfo and rebuilding LCSSA on the impacted loops. In the trivial case, we don't even need to handle dedicated exits because we're only hoisting the one loop and we just split its preheader. I've also ported all of these tests to non-trivial unswitching and verified that the logic already there correctly handles the loop nest updates necessary. Differential Revision: https://reviews.llvm.org/D48851 llvm-svn: 336477	2018-07-07 01:12:56 +00:00
Benjamin Kramer	3687ac52a9	[LoopSink] Make the enforcement of determinism deterministic. LoopBlockNumber is a DenseMap<BasicBlock, int>, comparing the result of find() will compare a pair<BasicBlock, int>. That's of course depending on pointer ordering which varies from run to run. Reverse iteration doesn't find this because we're copying to a vector first. This bug has been there since 2016 but only recently showed up on clang selfhost with FDO and ThinLTO, which is also why I didn't manage to get a reasonable test case for this. Add an assert that would've caught this. llvm-svn: 336439	2018-07-06 14:20:58 +00:00
Michael Zolotukhin	a5f2c52a1e	Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading."" There were a couple of issues reported (PR38047, PR37929) - I'll reland the patch when I figure out and fix the rootcause. llvm-svn: 336393	2018-07-05 22:10:31 +00:00
Chandler Carruth	3897ded691	[PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when unswitching loops. Original patch trying to address this was sent in D47624, but that didn't quite handle things correctly. There are two key principles used to select whether and how to invalidate SCEV-cached information about loops: 1) We must invalidate any info SCEV has cached before unswitching as we may change (or destroy) the loop structure by the act of unswitching, and make it hard to recover everything we want to invalidate within SCEV. 2) We need to invalidate all of the loops whose CFGs are mutated by the unswitching. Notably, this isn't the entire loop nest, this is every loop contained by the outermost loop reached by an exit block relevant to the unswitch. And we need to do this even when doing trivial unswitching. I've added more focused tests that directly check that SCEV starts off with imprecise information and after unswitching (and simplifying instructions) re-querying SCEV will produce precise information. These tests also specifically work to check that an outer loop's information becomes precise. However, the testing here is still a bit imperfect. Crafting test cases that reliably fail to be analyzed by SCEV before unswitching and succeed afterward proved ... very, very hard. It took me several hours and careful work to build these, and I'm not optimistic about necessarily coming up with more to cover more elaborate possibilities. Fortunately, the code pattern we are testing here in the pass is really straightforward and reliable. Thanks to Max Kazantsev for the initial work on this as well as the review, and to Hal Finkel for helping me talk through approaches to test this stuff even if it didn't come to much. Differential Revision: https://reviews.llvm.org/D47624 llvm-svn: 336183	2018-07-03 09:13:27 +00:00
Alina Sbirlea	0e15501fa7	Replace "Replacable" with "Replaceable". [NFC] llvm-svn: 336133	2018-07-02 18:53:40 +00:00
Florian Hahn	4ebba909a2	Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters. This version contains a fix to add values for which the state in ParamState change to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true, if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState. Original message: For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 336098	2018-07-02 12:44:04 +00:00
David Green	963401d2be	[UnrollAndJam] New Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 336062	2018-07-01 12:47:30 +00:00
Chandler Carruth	7c557f804d	[instsimplify] Move the instsimplify pass to use more obvious file names and diretory. Also cleans up all the associated naming to be consistent and removes the public access to the pass ID which was unused in LLVM. Also runs clang-format over parts that changed, which generally cleans up a bunch of formatting. This is in preparation for doing some internal cleanups to the pass. Differential Revision: https://reviews.llvm.org/D47352 llvm-svn: 336028	2018-06-29 23:36:03 +00:00
Sean Fertile	cd0d7634f6	Revert "Extend CFGPrinter and CallPrinter with Heat Colors" This reverts r335996 which broke graph printing in Polly. llvm-svn: 336000	2018-06-29 17:48:58 +00:00
Sean Fertile	3b0535b424	Extend CFGPrinter and CallPrinter with Heat Colors Extends the CFGPrinter and CallPrinter with heat colors based on heuristics or profiling information. The colors are enabled by default and can be toggled on/off for CFGPrinter by using the option -cfg-heat-colors for both -dot-cfg[-only] and -view-cfg[-only]. Similarly, the colors can be toggled on/off for CallPrinter by using the option -callgraph-heat-colors for both -dot-callgraph and -view-callgraph. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D40425 llvm-svn: 335996	2018-06-29 17:13:58 +00:00
Anastasis Grammenos	425df22ee3	[SROA] Preserve DebugLoc when rewriting alloca partitions When rewriting an alloca partition copy the DL from the old alloca over the the new one. Differential Revision: https://reviews.llvm.org/D48640 llvm-svn: 335904	2018-06-28 18:58:30 +00:00
Florian Hahn	388af14f85	[SCCP] Mark CFG as preserved. SCCP does not change the CFG, so we can mark it as preserved. Reviewers: dberlin, efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D47149 llvm-svn: 335820	2018-06-28 09:53:38 +00:00
Michael Zolotukhin	d3b8bdef01	[JumpThreading] Don't try to rewrite a use if it's already valid. Summary: When recording uses we need to rewrite after cloning a loop we need to check if the use is not dominated by the original def. The initial assumption was that the cloned basic block will introduce a new path and thus the original def will only dominate the use if they are in the same BB, but as the reproducer from PR37745 shows it's not always the case. This fixes PR37745. Reviewers: haicheng, Ka-Ka Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48111 llvm-svn: 335675	2018-06-26 22:19:48 +00:00
Matt Arsenault	2c1a570aab	LoopUnroll: Allow analyzing intrinsic call costs I'm not sure why the code here is skipping calls since TTI does try to do something for general calls, but it at least should allow intrinsics. Skip intrinsics that should not be omitted as calls, which is by far the most common case on AMDGPU. llvm-svn: 335645	2018-06-26 18:51:17 +00:00
Florian Hahn	4a69b0bb36	[IPSCCP] Change dead blocks to unreachable after visiting all executable blocks. changeToUnreachable may remove PHI nodes from executable blocks we found values for and we would fail to replace them. By changing dead blocks to unreachable after we replaced constants in all executable blocks, we ensure such PHI nodes are replaced by their known value before. Fixes PR37780. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D48421 llvm-svn: 335588	2018-06-26 10:15:02 +00:00
Chandler Carruth	1652996fd6	[PM/LoopUnswitch] Teach the new unswitch to handle nontrivial unswitching of switches. This works much like trivial unswitching of switches in that it reliably moves the switch out of the loop. Here we potentially clone the entire loop into each successor of the switch and re-point the cases at these clones. Due to the complexity of actually doing nontrivial unswitching, this patch doesn't create a dedicated routine for handling switches -- it would duplicate far too much code. Instead, it generalizes the existing routine to handle both branches and switches as it largely reduces to looping in a few places instead of doing something once. This actually improves the results in some cases with branches due to being much more careful about how dead regions of code are managed. With branches, because exactly one clone is created and there are exactly two edges considered, somewhat sloppy handling of the dead regions of code was sufficient in most cases. But with switches, there are much more complicated patterns of dead code and so I've had to move to a more robust model generally. We still do as much pruning of the dead code early as possible because that allows us to avoid even cloning the code. This also surfaced another problem with nontrivial unswitching before which is that we weren't as precise in reconstructing loops as we could have been. This seems to have been mostly harmless, but resulted in pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we have to get this right, and everything benefits from it. While the testing may seem a bit light here because we only have two real cases with actual switches, they do a surprisingly good job of exercising numerous edge cases. Also, because we share the logic with branches, most of the changes in this patch are reasonably well covered by existing tests. The new unswitch now has all of the same fundamental power as the old one with the exception of the single unsound case of partial switch unswitching -- that really is just loop specialization and not unswitching at all. It doesn't fit into the canonicalization model in any way. We can add a loop specialization pass that runs late based on profile data if important test cases ever come up here. Differential Revision: https://reviews.llvm.org/D47683 llvm-svn: 335553	2018-06-25 23:32:54 +00:00
Craig Topper	27847868b7	[LoopIdiomRecognize] Fix a couple places where it appears we were unintenionally making copies of DebugLoc. llvm-svn: 335521	2018-06-25 20:45:45 +00:00
Stanislav Mekhanoshin	d8c9374797	Fix invariant fdiv hoisting in LICM FDiv is replaced with multiplication by reciprocal and invariant reciprocal is hoisted out of the loop, while multiplication remains even if invariant. Switch checks for all invariant operands and only invariant denominator to fix the issue. Differential Revision: https://reviews.llvm.org/D48447 llvm-svn: 335411	2018-06-23 04:01:28 +00:00
Eli Friedman	203eaaf5ba	[LoopReroll] Rewrite induction variable rewriting. This gets rid of a bunch of weird special cases; instead, just use SCEV rewriting for everything. In addition to being simpler, this fixes a bug where we would use the wrong stride in certain edge cases. The one bit I'm not quite sure about is the trip count handling, specifically the FIXME about overflow. In general, I think we need to widen the exit condition, but that's probably not profitable if the new type isn't legal, so we probably need a check somewhere. That said, I don't think I'm making the existing problem any worse. As a followup to this, a bunch of IV-related code in root-finding could be cleaned up; with SCEV-based rewriting, there isn't any reason to assume a loop will have exactly one or two PHI nodes. Differential Revision: https://reviews.llvm.org/D45191 llvm-svn: 335400	2018-06-22 22:58:55 +00:00
Alina Sbirlea	bee50036d3	[LoopUnswitch]Fix comparison for DomTree updates. Summary: In LoopUnswitch when replacing a branch Parent -> Succ with a conditional branch Parent -> True & Parent->False, the DomTree updates should insert an edge for each of True/False if True/False are different than Succ, and delete Parent->Succ edge if both are different. The comparison with Succ appears to be incorect, it's comparing with Parent instead. There is no test failing either before or after this change, but it seems to me this is the right way to do the update. Reviewers: chandlerc, kuhar Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D48457 llvm-svn: 335369	2018-06-22 17:14:35 +00:00
Francis Visoiu Mistrih	ac599b6951	Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions." This reverts commit r335206. As discussed here: https://reviews.llvm.org/rL333740, a fix will come tomorrow. In the meanwhile, revert this to fix some bots. llvm-svn: 335272	2018-06-21 19:18:36 +00:00
Florian Hahn	d36aa1f763	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. r335150 should resolve the issues with the clang-with-thin-lto-ubuntu and clang-with-lto-ubuntu builders. Original message: This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin llvm-svn: 335206	2018-06-21 07:15:08 +00:00
Chandler Carruth	d1dab0c3c0	[PM/LoopUnswitch] Add partial non-trivial unswitching for invariant conditions feeding a chain of `and`s or `or`s for a branch. Much like with full non-trivial unswitching, we rely on the pass manager to handle iterating until all of the profitable unswitches have been done. This is to allow other more profitable unswitches to fire on any of the cloned, simpler versions of the loop if viable. Threading the partial unswiching through the non-trivial unswitching logic motivated some minor refactorings. If those are too disruptive to make it reasonable to review this patch, I can separate them out, but it'll be somewhat timeconsuming so I wanted to send it for initial review as-is. Feel free to tell me whether it warrants pulling apart. I've tried to re-use (and factor out) logic form the partial trivial unswitching, but not as much could be shared as I had haped. Still, this wasn't as bad as I naively expected. Some basic testing is added, but I probably need more. Suggestions for things you'd like to see tested more than welcome. One thing I'd like to do is add some testing that when we schedule this with loop-instsimplify it effectively cleans up the cruft created. Last but not least, this uncovered a bug that has been in loop cloning the entire time for non-trivial unswitching. Specifically, we didn't correctly add the outer-most cloned loop to the list of cloned loops. This meant that LCSSA wouldn't be updated for it hypothetically, and more significantly that we would never visit it in the loop pass manager. I noticed this while checking loop-instsimplify by hand. I'll try to separate this bugfix out into its own patch with a more focused test. But it is just one line, so shouldn't significantly confuse the review here. After this patch, the only missing "feature" in this unswitch I'm aware of us non-trivial unswitching of switches. I'll try implementing full non-trivial unswitching of switches (which is at least a sound thing to implement), but partial non-trivial unswitching of switches is something I don't see any sound and principled way to implement. I also have no interesting test cases for the latter, so I'm not really worried. The rest of the things that need to be ported are bug-fixes and more narrow / targeted support for specific issues. Differential Revision: https://reviews.llvm.org/D47522 llvm-svn: 335203	2018-06-21 06:14:03 +00:00
Alina Sbirlea	dfd14adeb0	Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred. Summary: Two utils methods have essentially the same functionality. This is an attempt to merge them into one. 1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred 2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor Prior to the patch: 1. MergeBasicBlockIntoOnlyPred Updates either DomTree or DeferredDominance Moves all instructions from Pred to BB, deletes Pred Asserts BB has single predecessor If address was taken, replace the block address with constant 1 (?) 2. MergeBlockIntoPredecessor Updates DomTree, LoopInfo and MemoryDependenceResults Moves all instruction from BB to Pred, deletes BB Returns if doesn't have a single predecessor Returns if BB's address was taken After the patch: Method 2. MergeBlockIntoPredecessor is attempting to become the new default: Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults Moves all instruction from BB to Pred, deletes BB Returns if doesn't have a single predecessor Returns if BB's address was taken Uses of MergeBasicBlockIntoOnlyPred that need to be replaced: 1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp Updated in this patch. No challenges. 2. lib/CodeGen/CodeGenPrepare.cpp Updated in this patch. i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation. ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks Some interesting aspects: - Since Pred is not deleted (BB is), the entry block does not need updating. - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred. - isMergingEmptyBlockProfitable assumes BB is the one to be deleted. - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead. - adding some test owner as subscribers for the interesting tests modified: test/CodeGen/X86/avx-cmp.ll test/CodeGen/AMDGPU/nested-loop-conditions.ll test/CodeGen/AMDGPU/si-annotate-cf.ll test/CodeGen/X86/hoist-spill.ll test/CodeGen/X86/2006-11-17-IllegalMove.ll 3. lib/Transforms/Scalar/JumpThreading.cpp Not covered in this patch. It is the only use case using the DeferredDominance. I would defer to Brian Rzycki to make this replacement. Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D48202 llvm-svn: 335183	2018-06-20 22:01:04 +00:00
Chandler Carruth	4da3331d3d	[PM/LoopUnswitch] Support partial trivial unswitching. The idea of partial unswitching is to take a part of a branch's condition that is loop invariant and just unswitching that part. This primarily makes sense with i1 conditions of branches as opposed to switches. When dealing with i1 conditions, we can easily extract loop invariant inputs to a a branch and unswitch them to test them entirely outside the loop. As part of this, we now create much more significant cruft in the loop body, so this relies on adding cleanup passes to the loop pipeline and revisiting unswitched loops to do that cleanup before continuing to process them. This already appears to be more powerful at unswitching than the old loop unswitch pass, and so I'd appreciate pretty careful review in case I'm just missing some correctness checks. The `LIV-loop-condition` test case is not unswitched by the old unswitch pass, but is with this pass. Thanks to Sanjoy and Fedor for the review! Differential Revision: https://reviews.llvm.org/D46706 llvm-svn: 335156	2018-06-20 18:57:07 +00:00
David Green	e6a9c24878	[LoopSimplifyCFG] Invalidate SCEV in LoopSimplifyCFG LoopSimplifyCFG, being a loop pass, needs to preserve scalar evolution. This invalidates SE for the loops altered during block merging. Differential Revision: https://reviews.llvm.org/D48258 llvm-svn: 335036	2018-06-19 09:43:36 +00:00
Florian Hahn	d8fcf0de31	[LoopInterchange] Move PHI handling to adjustLoopBranches. This patch moves the logic to handle reduction PHI nodes to the end of adjustLoopBranches. Reduction PHI nodes in the outer loop header can be moved to the inner loop header and reduction PHI nodes from the inner loop header can be moved to the outer loop header. In the latter situation, we have to deal with 1 kind of PHI nodes: PHI nodes that are part of inner loop-only reductions. We can replace the PHI node with the value coming from outside the inner loop. Reviewers: mcrosier, efriedma, karthikthecool Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D46198 llvm-svn: 335027	2018-06-19 08:03:24 +00:00
Michael Zolotukhin	158a7c3323	CorrelatedValuePropagation: Preserve DT. Summary: We only modify CFG in a couple of places, and we can preserve DT there with a little effort. Reviewers: davide, vsk Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48059 llvm-svn: 334895	2018-06-16 18:57:31 +00:00
Simon Pilgrim	dee9c67f24	[EarlyCSE] Fix MSVC build. NFCI. MSVC doesn't let you assign different lambdas through a ternary operator. llvm-svn: 334715	2018-06-14 14:22:03 +00:00
Max Kazantsev	ff6d1c9188	[EarlyCSE] Propagate conditions of AND and OR instructions This patches teaches EarlyCSE to figure out that if `and i1 %x, %y` is true then both `%x` and `%y` are true in the taken branch, and if `or i1 %x, %y` is false then both `%x` and `%y` are false in non-taken branch. Fix for PR37635. Differential Revision: https://reviews.llvm.org/D47574 Reviewed By: reames llvm-svn: 334707	2018-06-14 13:02:13 +00:00
Hiroshi Inoue	f209649dfc	[NFC] fix trivial typos in comments llvm-svn: 334687	2018-06-14 05:41:49 +00:00
Florian Hahn	a1cc848399	Use SmallPtrSet explicitly for SmallSets with pointer types (NFC). Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This patch replaces such types with SmallPtrSet, because IMO it is slightly clearer and allows us to get rid of unnecessarily including SmallSet.h Reviewers: dblaikie, craig.topper Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D47836 llvm-svn: 334492	2018-06-12 11:16:56 +00:00
Craig Topper	61998289f9	Use SmallPtrSet instead of SmallSet in places where we iterate over the set. SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work. These places were found by hiding the begin/end methods in the SmallSet forwarding llvm-svn: 334343	2018-06-09 05:04:20 +00:00
Daniil Fukalov	37433dc2e1	reapply r334209 with fixes for harfbuzz in Chromium r334209 description: [LSR] Check yet more intrinsic pointer operands the patch fixes another assertion in isLegalUse() Differential Revision: https://reviews.llvm.org/D47794 llvm-svn: 334300	2018-06-08 16:22:52 +00:00
Reid Kleckner	a3609f75b2	Revert r334209 "[LSR] Check yet more intrinsic pointer operands" This causes cast failures when compiling harfbuzz in Chromium. Reproducer on the way. llvm-svn: 334254	2018-06-08 00:43:27 +00:00
Daniil Fukalov	12c0663a25	[LSR] Check yet more intrinsic pointer operands the patch fixes another assertion in isLegalUse() Differential Revision: https://reviews.llvm.org/D47794 llvm-svn: 334209	2018-06-07 17:30:58 +00:00
Michael Zolotukhin	31800864dc	SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation. The pass doesn't touch CFG in any way, only moves instructions between blocks. llvm-svn: 334150	2018-06-07 00:19:29 +00:00
David Blaikie	31b98d2e99	Move Analysis/Utils/Local.h back to Transforms Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954	2018-06-04 21:23:21 +00:00
Chandler Carruth	9281503e8f	[PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses. Summary: I noticed this issue because we didn't put the primary cloned loop into the `NonChildClonedLoops` vector and so never iterated on it. Once I fixed that, it made it clear why I had to do a really complicated and unnecesasry dance when updating the loops to remain in canonical form -- I was unwittingly working around the fact that the primary cloned loop wasn't in the expected list of cloned loops. Doh! Now that we include it in this vector, we don't need to return it and we can consolidate the update logic as we correctly have a single place where it can be handled. I've just added a test for the iteration order aspect as every time I changed the update logic partially or incorrectly here, an existing test failed and caught it so that seems well covered (which is also evidenced by the extensive working around of this missing update). Reviewers: asbirlea, sanjoy Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47647 llvm-svn: 333811	2018-06-02 01:29:01 +00:00
Florian Hahn	8a17f1f43e	Revert r333740: IPSCCP] Use PredicateInfo to propagate facts from cmp. This is breaking the clang-with-thin-lto-ubuntu bot. llvm-svn: 333745	2018-06-01 12:58:43 +00:00
Florian Hahn	f4df554f32	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333740	2018-06-01 10:48:54 +00:00
Craig Topper	9a6c0bdcbd	[LoopIdiomRecognize] Only convert loops to ctlz if we can prove that the input is non-negative. Summary: Loop idiom recognize tries to convert loops like ``` int foo(int x) { int cnt = 0; while (x) { x >>= 1; ++cnt; } return cnt; } ``` into calls to ctlz, but if x is initially negative this loop should be infinite. It happens that the cases that motivated this change have an absolute value of x before the loop. So this patch restricts the transform to cases where we know x is positive. Note: We are relying on the absolute value of INT_MIN to be undefined so we can assume that the result is always positive. Fixes PR37479 Reviewers: spatel, hfinkel, efriedma, javed.absar Reviewed By: efriedma Subscribers: dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D47348 llvm-svn: 333702	2018-05-31 22:16:55 +00:00
Craig Topper	c9a4c6208b	[JumpThreading] Fix some strange formatting of code inside LLVM_DEBUG. NFC I don't know if clang-format got confused here or what. llvm-svn: 333675	2018-05-31 18:08:11 +00:00
Max Kazantsev	0bad5be430	[NFC] Factor out a method for further extension llvm-svn: 333633	2018-05-31 08:08:34 +00:00
George Burgess IV	485762ccba	[NewGVN] Fix set comparison; reflow comment Looks like we intended to compare this->Members with Other->Members here, but ended up comparing this->Members with this->Members. Oops. :) Since CongruenceClass::Members is a SmallPtrSet anyway, we can probably skip building std::sets if we're willing to write a bit more code. This appears to be no functional change (for sufficiently lax values of "no"): this equality check was only being called inside of an assert. So, worst case, we'll catch more bugs in the form of assertion failures. Thanks to d0k for noting this! llvm-svn: 333601	2018-05-30 22:24:08 +00:00
Chandler Carruth	71fd27043e	[PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule loop-cleanup passes at the beginning of the loop pass pipeline, and re-enqueue loops after even trivial unswitching. This will allow us to much more consistently avoid simplifying code while doing trivial unswitching. I've also added a test case that specifically shows effective iteration using this technique. I've unconditionally updated the new PM as that is always using the SimpleLoopUnswitch pass, and I've made the pipeline changes for the old PM conditional on using this new unswitch pass. I added a bunch of comments to the loop pass pipeline in the old PM to make it more clear what is going on when reviewing. Hopefully this will unblock doing partial unswitching instead of just full unswitching. Differential Revision: https://reviews.llvm.org/D47408 llvm-svn: 333493	2018-05-30 02:46:45 +00:00
Chandler Carruth	4cbcbb0761	[LoopInstSimplify] Re-implement the core logic of loop-instsimplify to be both simpler and substantially more efficient. Rather than use a hand-rolled iteration technique that isn't quite the same as RPO, use the pre-built RPO loop body traversal utility. Once visiting the loop body in RPO, we can assert that we visit defs before uses reliably. When this is the case, the only need to iterate is when simplifying a def that is used by a PHI node along a back-edge. With this patch, the first pass over the loop body is just a complete simplification of every instruction across the loop body. When we encounter a use of a simplified instruction that stems from a PHI node in the loop body that has already been visited (due to some cyclic CFG, potentially the loop itself, or a nested loop, or unstructured control flow), we recall that specific PHI node for the second iteration. Nothing else needs to be preserved from iteration to iteration. On the second and later iterations, only instructions known to have simplified inputs are considered, each time starting from a set of PHIs that had simplified inputs along the backedges. Dead instructions are collected along the way, but deleted in a batch at the end of each iteration making the iterations themselves substantially simpler. This uses a new batch API for recursively deleting dead instructions. This alsa changes the routine to visit subloops. Because simplification is fundamentally transitive, we may need to visit the entire loop body, including subloops, to handle knock-on simplification. I've added a basic test file that helps demonstrate that all of these changes work. It includes both straight-forward loops with simplifications as well as interesting PHI-structures, CFG-structures, and a nested loop case. Differential Revision: https://reviews.llvm.org/D47407 llvm-svn: 333461	2018-05-29 20:15:38 +00:00
David Green	aee7ad0cde	Revert 333358 as it's failing on some builders. I'm guessing the tests reply on the ARM backend being built. llvm-svn: 333359	2018-05-27 12:54:33 +00:00
David Green	3034281b43	[UnrollAndJam] Add a new Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 333358	2018-05-27 12:11:21 +00:00

1 2 3 4 5 ...

8963 Commits