llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	8058196677	[InstCombine] Process newly inserted instructions in the correct order InstCombine operates on the basic premise that the operands of the currently processed instruction have already been simplified. It achieves this by pushing instructions to the worklist in reverse program order, so that instructions are popped off in program order. The worklist management in the main combining loop also makes sure to uphold this invariant. However, the same is not true for all the code that is performing manual worklist management. The largest problem (addressed in this patch) are instructions inserted by InstCombine's IRBuilder. These will be pushed onto the worklist in order of insertion (generally matching program order), which means that a) the users of the original instruction will be visited first, as they are pushed later in the main loop and b) the newly inserted instructions will be visited in reverse program order. This causes a number of problems: First, folds operate on instructions that have not had their operands simplified, which may result in optimizations being missed (ran into this in https://reviews.llvm.org/D72048#1800424, which was the original motivation for this patch). Additionally, this increases the amount of folds InstCombine has to perform, both within one iteration, and by increasing the number of total iterations. This patch addresses the issue by adding a Worklist.AddDeferred() method, which is used for instructions inserted by IRBuilder. These will only be added to the real worklist after the combine finished, and in reverse order, so they will end up processed in program order. I should note that the same should also be done to nearly all other uses of Worklist.Add(), but I'm starting with just this occurrence, which has by far the largest test fallout. Most of the test changes are due to https://bugs.llvm.org/show_bug.cgi?id=44521 or other cases where we don't canonicalize something. These are neutral. One regression has been addressed in D73575 and D73647. The remaining regression in an shl+sdiv fold can't really be fixed without dropping another transform, but does not seem particularly problematic in the first place. Differential Revision: https://reviews.llvm.org/D73411	2020-01-30 09:40:10 +01:00
Roman Lebedev	73f702ff19	[InstCombine] Non-canonical clamp-like pattern handling Summary: Given a pattern like: ``` %old_cmp1 = icmp slt i32 %x, C2 %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high %old_x_offseted = add i32 %x, C1 %old_cmp0 = icmp ult i32 %old_x_offseted, C0 %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement ``` it can be rewritten as more canonical pattern: ``` %new_cmp1 = icmp slt i32 %x, -C1 %new_cmp2 = icmp sge i32 %x, C0-C1 %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low ``` Iff `-C1 s<= C2 s<= C0-C1` Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result) Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result) If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses. NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting. So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants: \| \| ULT \| UGE \| UGT \| \| SLT \| https://rise4fun.com/Alive/yIJ \| https://rise4fun.com/Alive/5BfN \| https://rise4fun.com/Alive/INH \| \| SGE \| https://rise4fun.com/Alive/hd8 \| https://rise4fun.com/Alive/Abk \| https://rise4fun.com/Alive/PlzS \| \| SGT \| https://rise4fun.com/Alive/VYG \| https://rise4fun.com/Alive/oMY \| https://rise4fun.com/Alive/KrzC \| {F9730206} This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530. Reviewers: spatel, nikic, xbolva00, dmgreen Reviewed By: spatel Subscribers: hiraditya, llvm-commits, dmgreen Tags: #llvm Differential Revision: https://reviews.llvm.org/D65765 llvm-svn: 368687	2019-08-13 12:49:28 +00:00
Roman Lebedev	09eb71ced3	[NFC][InstCombine] Non-canonical clamp pattern: non-canonical predicate tests We can't handle 'uge' case because we can't ever get it, there needs to be extra use on that compare or else it will be canonicalized, but because of extra use we can't handle it. 'sge' case we can have. llvm-svn: 368656	2019-08-13 08:14:13 +00:00
Roman Lebedev	76b772f9ce	[InstCombine][NFC] Tests for non-canonical clamp-like pattern As discussed in https://reviews.llvm.org/D65148#1607019 The canonical fold is: https://rise4fun.com/Alive/FKe llvm-svn: 367897	2019-08-05 18:01:22 +00:00

4 Commits