llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	986416251b	[InstCombine] Drop redundant fold for and/or of icmp eq/ne (NFCI) This handles a special case of foldAndOrOfICmpsUsingRanges() with two equality predicates.	2021-11-11 20:25:40 +01:00
Nikita Popov	84e273cced	[InstCombine] Handle undefs in and of icmp eq zero fold For the scalar/splat case, this fold is subsumed by foldLogOpOfMaskedICmps(). However, the conjugated fold for "or" also supports splats with undef. Make both code paths consistent by using m_ZeroInt() for the "and" implementation as well. https://alive2.llvm.org/ce/z/tN63cu https://alive2.llvm.org/ce/z/ufB_Ue	2021-11-11 19:07:07 +01:00
Nikita Popov	0242a6adf7	[InstCombine] Support splat vectors in some or of icmp folds Replace m_ConstantInt() with m_APInt() in order to support splat constants in addition to scalar integers.	2021-11-10 22:59:09 +01:00
Nikita Popov	861adaf2ad	[InstCombine] Support splat vectors in some and of icmp folds Replace m_ConstantInt() with m_APInt() to support splat vectors in addition to scalar integers.	2021-11-10 22:37:54 +01:00
Nikita Popov	58ebc79a64	[InstCombine] Strip offset when folding and/or of icmps When folding and/or of icmps, look through add of a constant and adjust the icmp range instead. Effectively, this decomposes X + C1 < C2 style range checks back into a normal range. This allows us to fold comparisons involving two range checks or one range check and some other condition. We had a fold for a really specific case of this (or of range check and eq, and only one one side!) while this handles it in fully generality. Differential Revision: https://reviews.llvm.org/D113510	2021-11-10 22:01:52 +01:00
Stanislav Mekhanoshin	5731381594	[InstCombine] Relax and reorganize one use checks in the ~(a \| b) & c Since there is just a single check for LHS in ~(A \| B) & C \| ... transforms and multiple RHS checks inside with more coming I am removing m_OneUse checks for LHS and adding new checks for RHS. This is non essential as long as there is total benefit. In addition (~(A \| B) & C) \| (~(A \| C) & B) --> (B ^ C) & ~A checks were overly restrictive, it should be good without any additional checks. Differential Revision: https://reviews.llvm.org/D113141	2021-11-10 10:14:12 -08:00
Sanjay Patel	67299aa84f	[InstCombine] add check for integer source type from cast to prevent crash A problem was noted in the post-commit review for `c36b7e21bd` / D113035 : If the source type is not integer or integer vector, then we could crash when trying to ComputeNumSignBits().	2021-11-10 09:44:55 -05:00
Itay Bookstein	f9059efa0d	[InstCombine] Extend stacksave/restore elimination Previously, InstCombine detected a pair of llvm.stacksave/stackrestore instructions that are adjacent modulo debug instructions in order to eliminate the llvm.stackrestore. This precludes situations where intervening instructions (e.g. loads) preclude the llvm.stacksave and llvm.stackrestore from becoming adjacent. This commit extends the logic and allows for eliminating the llvm.stackrestore when the range of instructions between them does not include any alloca or side-effect causing instructions. Signed-off-by: Itay Bookstein <itay.bookstein@nextsilicon.com> Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D113105	2021-11-10 10:41:58 +02:00
Itay Bookstein	fe7491d32f	[InstCombine][NFC] Refactor llvm.stackrestore handling Hoist the instruction classification logic outside the loop in preparation for reuse in a future commit. Signed-off-by: Itay Bookstein <itay.bookstein@nextsilicon.com> Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D113464	2021-11-10 10:41:56 +02:00
Nikita Popov	0aabdad1ef	[InstCombine] Combine code for and/or of icmps (NFC) The implementation for and/or is the same, apart from the choice of exactIntersectWith() vs exactUnionWith(). Extract a common function to make future extension easier.	2021-11-09 21:18:31 +01:00
Nikita Popov	bb12dedede	[InstCombine] Refactor and/or of icmp with constant (NFCI) Rather than testing for many specific combinations of predicates and values, compute the exact icmp regions for both comparisons and check whether they union/intersect exactly. If they do, construct the equivalent icmp for the new range. Assuming that the existing code handled all possible cases, this should be NFC. Differential Revision: https://reviews.llvm.org/D113367	2021-11-09 21:05:46 +01:00
Stanislav Mekhanoshin	791baf38e1	[InstCombine] Fuse checks for LHS (~(A \| B) & C) \| ... NFC. Differential Revision: https://reviews.llvm.org/D113132	2021-11-09 11:31:22 -08:00
Sanjay Patel	d5c002bdc7	[InstCombine] fix code comment to match code; NFC	2021-11-09 14:27:29 -05:00
Sanjay Patel	2a88d00cf2	[InstCombine] fold sub-of-umax to 0-usubsat Op0 - umax(X, Op0) --> 0 - usub.sat(X, Op1) I'm not sure if this is really an improvement in IR because we probably have better recognition/analysis for min/max, but this lines up with the fold we do for the icmp+select idiom and removes another diff from D98152. This is similar to the previous fold in the code that was added with: `83c2fb9f66` `baa6a85130` https://alive2.llvm.org/ce/z/5MrVB9	2021-11-09 12:46:03 -05:00
Sanjay Patel	baa6a85130	[InstCombine] allow commute in sub-of-umax fold This fold was added with: `83c2fb9f66` ...but missed the commuted pattern: https://alive2.llvm.org/ce/z/_tYEGy	2021-11-09 10:50:11 -05:00
Sanjay Patel	c36b7e21bd	[InstCombine] enhance vector bitwise select matching (Cond & C) \| (~bitcast(Cond) & D) --> bitcast (select Cond, (bc C), (bc D)) This is part of fixing: https://llvm.org/PR34047 That report shows a case where a bitcast is sitting between the select condition candidate and its 'not' value due to current cast canonicalization rules. There's a bitcast type restriction that might be violated in existing matching, but I still need to investigate if that is possible - Alive2 shows we can only do this transform safely when the bitcast is from narrow to wide vector elements (otherwise poison could leak into elements that were safe in the original code): https://alive2.llvm.org/ce/z/Hf66qh Differential Revision: https://reviews.llvm.org/D113035	2021-11-09 08:54:59 -05:00
Nikita Popov	1376301c87	[InstCombine] Canonicalize range test idiom InstCombine converts range tests of the form (X > C1 && X < C2) or (X < C1 \|\| X > C2) into checks of the form (X + C3 < C4) or (X + C3 > C4). It is possible to express all range tests in either of these forms (with different choices of constants), but currently neither of them is considered canonical. We may have equivalent range tests using either ult or ugt. This proposes to canonicalize all range tests to use ult. An alternative would be to canonicalize to either ult or ugt depending on the specific constants involved -- e.g. in practice we currently generate ult for && style ranges and ugt for \|\| style ranges when going through the insertRangeTest() helper. In fact, the "clamp like" fold was relying on this, which is why I had to tweak it to not assume whether inversion is needed based on just the predicate. Proof: https://alive2.llvm.org/ce/z/_SP_rQ Differential Revision: https://reviews.llvm.org/D113366	2021-11-08 21:15:46 +01:00
Nikita Popov	9f0194be45	[ConstantRange] Add getEquivalentICmp() variant with offset (NFCI) Add a variant of getEquivalentICmp() that produces an optional offset. This allows us to create an equivalent icmp for all ranges. Use this in the with.overflow folding code, which was doing this adjustment separately -- this clarifies that the fold will indeed always apply.	2021-11-06 21:59:45 +01:00
Sanjay Patel	83c2fb9f66	[InstCombine] match usub.sat from umax intrinsic umax(X, Op1) - Op1 --> usub.sat(X, Op1) https://alive2.llvm.org/ce/z/HpcGiJ This happens in 2 or more steps with an icmp-select idiom instead of an intrinsic. This is another step towards canonicalization of the min/max intrinsics. See: D98152	2021-11-06 08:32:52 -04:00
David Green	08056e1888	[InstCombine] Generalize sadd.sat combine to compute sign bits. There is a combine in instcombine to transform a saturated add/sub into a saddsat/ssubsat, currently handling inputs which are both sign extended (https://alive2.llvm.org/ce/z/68qpTn). This can generalize to, for example ashr of at least the bitwidth (https://alive2.llvm.org/ce/z/4TFyX- and https://alive2.llvm.org/ce/z/qDWzFs for example). Which means it generalizes further to "the number of sign bits", needing to be enough to truncate to the size of the saturate. (An example using `or` for instance: https://alive2.llvm.org/ce/z/EI_h_A). So this patch makes use of ComputeNumSignBits (with the newly added ComputeMinSignedBits) in matchSAddSubSat to generalize the fold to any inputs with enough sign bits known, truncating the inputs to the new size of the saturate. Differential Revision: https://reviews.llvm.org/D112298	2021-11-05 15:05:09 +00:00
David Green	61225c0818	[ValueTracking][InstCombine] Introduce and use ComputeMinSignedBits This introduces a new ComputeMinSignedBits method for ValueTracking that returns the BitWidth - SignBits + 1 from ComputeSignBits, and represents the minimum bit size for the value as a signed integer. Similar to the existing APInt::getMinSignedBits method, this can make some of the reasoning around ComputeSignBits more natural. See https://reviews.llvm.org/D112298	2021-11-05 14:41:37 +00:00
David Green	1e5f814302	[InstCombine] Fix infinite recursion in ashr/xor vector fold. The added test has poison lanes due to the vector shuffle. This can cause an infinite loop of combines in instcombine where it folds xor(ashr, -1) -> select (icmp slt 0), -1, 0 -> sext (icmp slt 0) -> xor(ashr, -1). We usually prevent this by checking that the xor constant is not -1, but with vectors some of the lanes may be -1, some may be poison. So this changes the way we detect that from "!C1->isAllOnesValue()" to "!match(C1, m_AllOnes())", which is more able to detect that some of the lanes are poison. Fixes PR52397	2021-11-04 09:24:27 +00:00
Sanjay Patel	c85df3c7d5	[InstCombine] refactor fold for icmp with trunc op; NFC There are at least 3 related folds we can add here - see D112634.	2021-11-03 12:43:15 -04:00
Piotr Sobczak	03961709ed	[InstCombine] Extend pattern to replace shuffle's insertelement operand In D71220 a pattern was added to replace shuffle's insertelement operand if inserted scalar is not demanded. The pattern was added only for the case where the shuffle's mask size is equal to element's vector size. However, that condition is not required because the pattern does not change the shuffle vector size. This patch extends the pattern to also include cases where shuffle's mask size is not equal to element's vector size. Differential Revision: https://reviews.llvm.org/D112318	2021-11-03 09:43:04 +01:00
Sanjay Patel	829146164f	[InstCombine] change 'not' match for bitwise select The tests diffs are logically equivalent, and so this is generally NFC, but this makes the code match the code comment. It should also be more efficient. If we choose the 'not' operand (rather than the 'not' instruction) as the select condition, then we don't have to invert the select condition/operands as a subsequent transform.	2021-11-02 10:16:01 -04:00
Sanjay Patel	42c94bc1ab	[InstCombine] allow vector splat matching for bitwise logic fold Similar to `54e969cffd` (and with cosmetic updates to hopefully make that easier to read), this fold has been around since early in LLVM history. Intermediate folds have been added subsequently, so extra uses are required to exercise this code. The test example actually shows an unintended consequence with extra uses - we end up with an extra instruction compared to what we started with. But this at least makes scalar/vector consistent. General proof: https://alive2.llvm.org/ce/z/tmuBza	2021-11-01 11:39:48 -04:00
Sanjay Patel	54e969cffd	[InstCombine] allow vector splat matching for bitwise logic folds This fold was added long ago (part of fixing PR4216), and it matched scalars only. Intermediate folds have been added subsequently, so extra uses are required to exercise this code. General proof: https://alive2.llvm.org/ce/z/G6BBhB One of the specific tests: https://alive2.llvm.org/ce/z/t0JhEB	2021-11-01 08:26:42 -04:00
Sanjay Patel	511ee8759f	[InstCombine] reduce code duplication with commutative matcher; NFC	2021-11-01 08:26:41 -04:00
Kazu Hirata	c714da2ceb	[Transforms] Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC)	2021-10-31 07:57:32 -07:00
David Green	66281baea1	[InstCombine] Fix type of constant in canonicalizeClampLike As a followup to D108049, one of the constants could now be generated with an incorrect type, now that the input could be truncated.	2021-10-30 09:06:21 +01:00
Sanjay Patel	8f786b4618	[InstCombine] fix comments to match code; NFC	2021-10-29 15:48:35 -04:00
Sanjay Patel	d0e9879d96	[InstCombine] allow vector splat matching for bitwise logic folds These transforms are also likely missing a one-use check, but that's another patch.	2021-10-29 14:22:50 -04:00
Stanislav Mekhanoshin	a905c54b76	[InstCombine] Fold `(~(a \| b) & c) \| ~(a \| c)` into `~((b & c) \| a)` ``` ---------------------------------------- define i4 @src(i4 %a, i4 %b, i4 %c) { %or1 = or i4 %b, %a %not1 = xor i4 %or1, -1 %or2 = or i4 %a, %c %not2 = xor i4 %or2, -1 %and = and i4 %not2, %b %or3 = or i4 %and, %not1 ret i4 %or3 } define i4 @tgt(i4 %a, i4 %b, i4 %c) { %and = and i4 %c, %b %or = or i4 %and, %a %or3 = xor i4 %or, -1 ret i4 %or3 } Transformation seems to be correct! ``` Differential Revision: https://reviews.llvm.org/D112338	2021-10-29 10:58:09 -07:00
David Green	11630dbbc3	[InstCombine] Fold BW/2+1 tops bits are same pattern Match "icmp eq (trunc (lsr A, BW), (ashr (trunc A), BW-1))", which checks the top BW/2 + 1 bits are all the same. Create "A >=s INT_MIN && A <=s INT_MAX", which we generate as "icmp ult (add A, 2^BW-1), 2^BW" to skip a few steps of instcombining. https://alive2.llvm.org/ce/z/NjH6Ty https://alive2.llvm.org/ce/z/_fEQ9P Differential Revision: https://reviews.llvm.org/D109155	2021-10-29 12:30:20 +01:00
David Green	9020e22a87	[InstCombine] Convert xor (ashr X, BW-1), C -> select(X >=s 0, C, ~C) The sequence of instructions `xor (ashr X, BW-1), C` (or with a truncation `xor (trunc (ashr X, BW-1)), C)` takes a value, produces all zeros or all ones and with it optionally inverts a constant depending on whether the original input was positive or negative. This is the same as checking if the value is positive, and selecting between the constant and ~constant. https://alive2.llvm.org/ce/z/NJ85qY This is a fairly general version of a fold that helps pull saturating arithmetic into a canonical form. Differential Revision: https://reviews.llvm.org/D109151	2021-10-29 11:19:20 +01:00
Stanislav Mekhanoshin	f7f430c913	[InstCombine] Fixed non-determinisctic order of new instructions Fixes non-determinisctic order of XOR instructions created after `5a7a458306`. The order of call argument evaluation is not defined, so create one Value before the call.	2021-10-28 12:14:02 -07:00
Stanislav Mekhanoshin	5a7a458306	[InstCombine] Fold `(c & ~(a \| b)) \| (b & ~(a \| c))` to `~a & (b ^ c)` ``` ---------------------------------------- define i4 @src(i4 %a, i4 %b, i4 %c) { %0: %or1 = or i4 %a, %b %not1 = xor i4 %or1, 15 %and1 = and i4 %not1, %c %or2 = or i4 %a, %c %not2 = xor i4 %or2, 15 %and2 = and i4 %not2, %b %or3 = or i4 %and1, %and2 ret i4 %or3 } => define i4 @tgt(i4 %a, i4 %b, i4 %c) { %0: %xor = xor i4 %b, %c %not = xor i4 %a, 15 %or3 = and i4 %xor, %not ret i4 %or3 } Transformation seems to be correct! ``` Differential Revision: https://reviews.llvm.org/D112276	2021-10-28 11:54:30 -07:00
David Green	9358384fd6	[InstCombine] Extend canonicalizeClampLike to handle truncated inputs This extends the canonicalizeClampLike function to allow cases where the input is truncated, but still matching on the types of the ICmps. For example %t = trunc i32 %X to i8 %a = add i32 %X, 128 %cmp = icmp ult i32 %a, 256 %c = icmp sgt i32 %X, -1 %f = select i1 %c, i8 High, i8 Low %r = select i1 %cmp, i8 %t, i8 %f becomes %c1 = icmp slt i32 %X, -128 %c2 = icmp sge i32 %X, 128 %s1 = select i1 %c1, i32 sext(Low), i32 %X %s2 = select i1 %c2, i32 sext(High), i32 %s1 %t = trunc i32 %s2 to i8 https://alive2.llvm.org/ce/z/vPzfxH We limit the transform to constant High and Low values, where we know the sext are free. Differential Revision: https://reviews.llvm.org/D108049	2021-10-28 15:46:58 +01:00
David Green	79011c705b	[InstCombine] Fix rare condition violation in canonicalizeClampLike With a "ult x, 0", the fold in canonicalizeClampLike does not validate with undef inputs. This condition will usually have been simplified away, but we should ensure the code is correct in case. https://alive2.llvm.org/ce/z/S8HQ6H vs https://alive2.llvm.org/ce/z/h2XBJ_ See: https://reviews.llvm.org/D108049	2021-10-28 15:03:07 +01:00
Sanjay Patel	e8535fa784	[InstCombine] allow Negator to fold multi-use select with constant arms The motivating test is reduced from: https://llvm.org/PR52261 Note that the more general problem of folding any binop into a multi-use select of constants is still there. We need to ease the restriction in InstCombinerImpl::FoldOpIntoSelect() to catch those. But these examples never reach that code because Negator exclusively handles negation patterns within visitSub(). Differential Revision: https://reviews.llvm.org/D112657	2021-10-28 08:35:58 -04:00
Sanjay Patel	acabad9ff6	[InstCombine] try to canonicalize icmp with trunc op into mask and cmp The motivating test is based on: https://llvm.org/PR52260 We have better analysis for X == 0, so try harder to form that.	2021-10-26 17:43:28 -04:00
Usman Nadeem	da1318ccca	[NFC][Instcombine] Cleanup some obsolete matches in visitSelectInstr These are now redundant after https://reviews.llvm.org/D106872 Change-Id: I82edfedf1d45cac4e3368d77ce3a48c78e342c19	2021-10-26 10:07:08 -07:00
Philip Reames	3c06ecaa1e	[instcombine] Fix oss-fuzz 39934 (mul matcher can match non-instruction) Fixes a crash observed by oss-fuzz in 39934. Issue at hand is that code expects a pattern match on m_Mul to imply the operand is a mul instruction, however mul constexprs are also valid here.	2021-10-24 14:42:03 -07:00
Sanjay Patel	3888de9507	[InstCombine] generalize reassociated Demorgan folds This updates the recent D112108 / `b92412fb28` to handle the flipped logic ('or') sibling: https://alive2.llvm.org/ce/z/Y2L6Ch	2021-10-21 10:39:29 -04:00
Stanislav Mekhanoshin	b92412fb28	[InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b \| c)` %not1 = xor i32 %b, -1 %not2 = xor i32 %c, -1 %and1 = and i32 %a, %not1 %and2 = and i32 %and1, %not2 => %i1 = or i32 %b, %c %i2 = xor i32 %1, -1 %and2 = and i32 %i2, %a Differential Revision: https://reviews.llvm.org/D112108	2021-10-20 13:05:46 -07:00
Sanjay Patel	80ab06c599	[InstCombine] fold fake vector insert to bit-logic bitcast (inselt (bitcast X), Y, 0) --> or (and X, MaskC), (zext Y) https://alive2.llvm.org/ce/z/Ux-662 Similar to D111082 / `db231ebdb0` : We want to avoid relatively opaque vector ops on types that are likely supported by the backend as scalar integers. The bitwise logic ops are more likely to allow further combining. We probably want to generalize this to allow a shift too, but that would oppose instcombine's general rule of not creating extra instructions, so that's left as a potential follow-up. Alternatively, we could do that transform in VectorCombine with the help of the TTI cost model. This is part of solving: https://llvm.org/PR52057	2021-10-20 14:21:40 -04:00
Simon Pilgrim	71e39e3f18	[ADT] Add APInt::isNegatedPowerOf2() helper Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos..... Differential Revision: https://reviews.llvm.org/D111998	2021-10-19 14:38:21 +01:00
Sanjay Patel	2a3cc4d461	[Analysis] add utility function for unary shuffle mask creation This is NFC-intended for the callers. Posting in case there are other potential users that I missed. I would also use this from VectorCombine in a patch for: https://llvm.org/PR52178 ( D111901 ) Differential Revision: https://reviews.llvm.org/D111891	2021-10-18 09:00:39 -04:00
Sanjay Patel	a49f5386ce	[InstCombine] generalize fold for mask-with-signbit-splat, part 2 This removes an over-specified fold. The more general transform was added with: `727e642e97` There's a difference on an existing test that shows a potentially unnecessary use limit on an icmp fold. That fold is in InstCombinerImpl::foldICmpSubConstant(), and IIRC there was some back-and-forth on it and similar folds because they could cause analysis/passes (SCEV, LSR?) to miss optimizations. Differential Revision: https://reviews.llvm.org/D111410	2021-10-15 17:11:29 -04:00
Sanjay Patel	727e642e97	[InstCombine] generalize fold for mask-with-signbit-splat (iN X s>> (N-1)) & Y --> (X < 0) ? Y : 0 https://alive2.llvm.org/ce/z/qeYhdz I was looking at a missing abs() transform and found my way to this generalization of an existing fold that was added with D67799. As discussed in that review, we want to make sure codegen handles this difference well, and for all of the targets/types that I spot-checked, it looks good. I am leaving the existing fold in place in this commit because it covers a potentially missing icmp fold, but I plan to remove that as a follow-up commit as suggested during review. Differential Revision: https://reviews.llvm.org/D111410	2021-10-15 16:25:48 -04:00

1 2 3 4 5 ...

4641 Commits