llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	2e68e4d60e	[InstCombine] make fold for icmp with sext more efficient; NFC We were creating 2 instructions and relying on a subsequent fold to invert a not(icmp). Create the final icmp directly instead. llvm-svn: 369411	2019-08-20 17:03:22 +00:00
Sanjay Patel	a90ee0eeb6	[InstCombine] improve readability for icmp with cast folds; NFC 1. Update function name and stale code comments. 2. Use variable names that are less ambiguous. 3. Move operand checks into the function as early exits. llvm-svn: 369390	2019-08-20 14:56:44 +00:00
Sanjay Patel	f99d254aae	[InstCombine] simplify min/max of min/max with same operands (PR35607) This is the original integer variant requested in: https://bugs.llvm.org/show_bug.cgi?id=35607 As noted in the TODO and several similar TODOs around this block, we could do this in instsimplify, but then it would cost more because we would be trying to match min/max via ValueTracking in 2 different places. There are 4 commuted variants for each of smin/smax/umin/umax that are not matched here. There are also icmp predicate variants that are not included in the affected test file because they are already handled by instsimplify by folding the final icmp to true/false. https://rise4fun.com/Alive/3KVc Name: smax(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: smin(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %min, i32 %max => %r = %min Name: umax(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: umin(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %min, i32 %max => %r = %min llvm-svn: 369386	2019-08-20 13:39:17 +00:00
Roman Lebedev	9b957d3321	[InstCombine] Cherry-pick NFC cleanups of foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383 llvm-svn: 369207	2019-08-18 12:26:33 +00:00
Sanjay Patel	a53ad0e157	Revert r367891 - "[InstCombine] combine mul+shl separated by zext" This reverts commit `5dbb90bfe1`. As noted in the post-commit thread for r367891, this can create a multiply that is lowered to a libcall that may not exist. We need to improve the backend decomposition for integer multiply before trying to re-land this (if it's still worthwhile after doing the backend work). llvm-svn: 369174	2019-08-16 23:36:28 +00:00
Sanjay Patel	39eb2324f7	[InstCombine] canonicalize a scalar-select-of-vectors to vector select This pattern may arise more frequently with an enhancement to SLP vectorization suggested in PR42755: https://bugs.llvm.org/show_bug.cgi?id=42755 ...but we should handle this pattern to make things easier for the backend either way. For all in-tree targets that I looked at, codegen for typical vector sizes looks better when we change to a vector select, so this is safe to do without a cost model (in other words, as a target-independent canonicalization). For example, if the condition of the select is a scalar, we end up with something like this on x86: vpcmpgtd %xmm0, %xmm1, %xmm0 vpextrb $12, %xmm0, %eax testb $1, %al jne LBB0_2 ## %bb.1: vmovaps %xmm3, %xmm2 LBB0_2: vmovaps %xmm2, %xmm0 Rather than the splat-condition variant: vpcmpgtd %xmm0, %xmm1, %xmm0 vpshufd $255, %xmm0, %xmm0 ## xmm0 = xmm0[3,3,3,3] vblendvps %xmm0, %xmm2, %xmm3, %xmm0 Differential Revision: https://reviews.llvm.org/D66095 llvm-svn: 369140	2019-08-16 18:51:30 +00:00
Roman Lebedev	16244fccfe	[InstCombine] Shift amount reassociation in bittest: trunc-of-shl (PR42399) Summary: This is continuation of D63829 / https://bugs.llvm.org/show_bug.cgi?id=42399 I thought naive pattern would solve my issue, but nope, it involved truncation, thus more folds needed.. This isn't really the fold i'm interested in, i need trunc-of-lshr, but i'we decided to start with `shl` because it's simpler. In this case, no extra legality checks are needed: https://rise4fun.com/Alive/CAb We should be careful about not increasing instruction count, since we need to produce `zext` because `and` is done in wider type. Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66057 llvm-svn: 369117	2019-08-16 15:10:41 +00:00
Roman Lebedev	32f1e1a01d	[InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far. llvm-svn: 368820	2019-08-14 09:57:20 +00:00
Roman Lebedev	73f702ff19	[InstCombine] Non-canonical clamp-like pattern handling Summary: Given a pattern like: ``` %old_cmp1 = icmp slt i32 %x, C2 %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high %old_x_offseted = add i32 %x, C1 %old_cmp0 = icmp ult i32 %old_x_offseted, C0 %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement ``` it can be rewritten as more canonical pattern: ``` %new_cmp1 = icmp slt i32 %x, -C1 %new_cmp2 = icmp sge i32 %x, C0-C1 %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low ``` Iff `-C1 s<= C2 s<= C0-C1` Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result) Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result) If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses. NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting. So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants: \| \| ULT \| UGE \| UGT \| \| SLT \| https://rise4fun.com/Alive/yIJ \| https://rise4fun.com/Alive/5BfN \| https://rise4fun.com/Alive/INH \| \| SGE \| https://rise4fun.com/Alive/hd8 \| https://rise4fun.com/Alive/Abk \| https://rise4fun.com/Alive/PlzS \| \| SGT \| https://rise4fun.com/Alive/VYG \| https://rise4fun.com/Alive/oMY \| https://rise4fun.com/Alive/KrzC \| {F9730206} This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530. Reviewers: spatel, nikic, xbolva00, dmgreen Reviewed By: spatel Subscribers: hiraditya, llvm-commits, dmgreen Tags: #llvm Differential Revision: https://reviews.llvm.org/D65765 llvm-svn: 368687	2019-08-13 12:49:28 +00:00
Roman Lebedev	0410489a34	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency As per https://reviews.llvm.org/D65530#inline-592325 llvm-svn: 368686	2019-08-13 12:49:16 +00:00
Roman Lebedev	2635c324da	[InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if all users are freely invertible Summary: This is rather unconventional.. As the comment there says, we don't have much folds for xor-of-icmps, we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use - we give up. As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit match and select-of-threshold that we'll potentially clamp. As it can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are not canonicalized into the variant with and+icmp instead of bit math. The reason is because the ICmp we need to invert is not single-use - we give up. We indeed can't perform this fold at will, the general rule is that we should not increase instruction count in InstCombine, But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create will get folded, and in the end the instruction count did not increase. For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine :S Thus i'm proposing to be a little bit more insistive in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s]. Reviewers: spatel, nikic, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65530 llvm-svn: 368685	2019-08-13 12:49:06 +00:00
David Bolvansky	20d37fab82	[InstCombine] x /c fabs(x) -> copysign(1.0, x) Summary: x / fabs(x) -> copysign(1.0, x) fabs(x) / x -> copysign(1.0, x) Reviewers: spatel, foad, RKSimon, efriedma Reviewed By: spatel Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65898 llvm-svn: 368570	2019-08-12 13:43:35 +00:00
Roman Lebedev	ccdad6ef48	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962) Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast. Fixes https://bugs.llvm.org/show_bug.cgi?id=42962 llvm-svn: 368554	2019-08-12 11:28:02 +00:00
Roman Lebedev	96474d17c6	[InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, ) llvm-svn: 368521	2019-08-10 19:29:10 +00:00
Roman Lebedev	a8d20b4467	[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded so, we don't need that one-use restriction then. llvm-svn: 368519	2019-08-10 19:28:54 +00:00
Roman Lebedev	64fe806c4e	[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction. llvm-svn: 368518	2019-08-10 19:28:44 +00:00
Evandro Menezes	c6c00cdf2e	[Transforms] Rename hasUnaryFloatFn() and getUnaryFloatFn() (NFC) Rename `hasUnaryFloatFn()` to `hasFloatFn()` and `getUnaryFloatFn()` to `getFloatFnName()`. llvm-svn: 368449	2019-08-09 16:04:18 +00:00
Jay Foad	8e8b295835	[InstCombine] Propagate fast math flags through selects Summary: In SimplifySelectsFeedingBinaryOp, propagate fast math flags from the outer op into both arms of the new select, to take advantage of simplifications that require fast math flags. Reviewers: mcberg2017, majnemer, spatel, arsenm, xbolva00 Subscribers: wdng, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65658 llvm-svn: 368175	2019-08-07 15:16:28 +00:00
Roman Lebedev	9bece444dd	[InstCombine] Recommit: Shift amount reassociation: shl-trunc-shl pattern This was initially committed in r368059 but got reverted in r368084 because there was a faulty logic in how the shift amounts type mismatch was being handled (it simply wasn't). I've added an explicit bailout before we SimplifyAddInst() - i don't think it's designed in general to handle differently-typed values, even though the actual problem only comes from ConstantExpr's. I have also changed the common type deduction, to not just blindly look past zext, but try to do that so that in the end types match. Differential Revision: https://reviews.llvm.org/D65380 llvm-svn: 368141	2019-08-07 09:41:50 +00:00
Reid Kleckner	e4bd38478b	Revert [InstCombine] Shift amount reassociation: shl-trunc-shl pattern This reverts r368059 (git commit `0f95710976`) This caused Clang to assert while self-hosting and compiling SystemZInstrInfo.cpp. Reduction is running. llvm-svn: 368084	2019-08-06 20:32:07 +00:00
Roman Lebedev	0f95710976	[InstCombine] Shift amount reassociation: shl-trunc-shl pattern Summary: Currently `reassociateShiftAmtsOfTwoSameDirectionShifts()` only handles two shifts one after another. If the shifts are `shl`, we still can easily perform the fold, with no extra legality checks: https://rise4fun.com/Alive/OQbM If we have right-shift however, we won't be able to make it any simpler than it already is. After this the only thing missing here is constant-folding: (`NewShAmt >= bitwidth(X)`) * If it's a logical shift, then constant-fold to `0` (not `undef`) * If it's a `ashr`, then a splat of original signbit https://rise4fun.com/Alive/E1K https://rise4fun.com/Alive/i0V Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65380 llvm-svn: 368059	2019-08-06 17:03:40 +00:00
Sanjay Patel	5dbb90bfe1	[InstCombine] combine mul+shl separated by zext This appears to slightly help patterns similar to what's shown in PR42874: https://bugs.llvm.org/show_bug.cgi?id=42874 ...but not in the way requested. That fix will require some later IR and/or backend pass to decompose multiply/shifts into something more optimal per target. Those transforms already exist in some basic forms, but probably need enhancing to catch more cases. https://rise4fun.com/Alive/Qzv2 llvm-svn: 367891	2019-08-05 16:59:58 +00:00
Sanjay Patel	1a29823b9c	[InstCombine] add extra use constraint for shl-zext fold As the test shows, we can end up with more instructions than we started with if we don't include the extra-use check. llvm-svn: 367880	2019-08-05 16:04:07 +00:00
Sanjay Patel	9ce5f41851	[InstCombine] fold cmp+select using select operand equivalence As discussed in PR42696: https://bugs.llvm.org/show_bug.cgi?id=42696 ...but won't help that case yet. We have an odd situation where a select operand equivalence fold was implemented in InstSimplify when it could have been done more generally in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand. Here's an example: https://rise4fun.com/Alive/Xplr %cmp = icmp eq i32 %x, 2147483647 %add = add nsw i32 %x, 1 %sel = select i1 %cmp, i32 -2147483648, i32 %add => %sel = add i32 %x, 1 I've left the InstSimplify code in place for now, but my guess is that we'd prefer to remove that as a follow-up to save on code duplication and compile-time. Differential Revision: https://reviews.llvm.org/D65576 llvm-svn: 367695	2019-08-02 17:39:32 +00:00
Roman Lebedev	0efeaa8162	[IR] SelectInst: add swapValues() utility Summary: Sometimes we need to swap true-val and false-val of a `SelectInst`. Having a function for that is nicer than hand-writing it each time. Reviewers: spatel, RKSimon, craig.topper, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65520 llvm-svn: 367547	2019-08-01 12:31:35 +00:00
Sanjay Patel	435cdecdf7	[InstCombine] canonicalize fneg before fmul/fdiv Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it easier to implement the transforms (and possibly other fneg transforms) in 1 place because we can always start the pattern match from fneg (either the legacy binop or the new unop). There's a secondary practical benefit seen in PR21914 and PR42681: https://bugs.llvm.org/show_bug.cgi?id=21914 https://bugs.llvm.org/show_bug.cgi?id=42681 ...hoisting fneg rather than sinking seems to play nicer with LICM in IR (although this change may expose analysis holes in the other direction). 1. The instcombine test changes show the expected neutral IR diffs from reversing the order. 2. The reassociation tests show that we were missing an optimization opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says that all of these transforms are allowed (regardless of binop/unop fneg version) because: "For all other operations [besides copy/abs/negate/copysign], this standard does not specify the sign bit of a NaN result." In all of these transforms, we always have some other binop (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a potential intermediate NaN operand. (If that interpretation is wrong, then we must already have a bug in the existing transforms?) 3. The clang tests shouldn't exist as-is, but that's effectively a revert of rL367149 (the test broke with an extension of the pre-existing fneg canonicalization in rL367146). Differential Revision: https://reviews.llvm.org/D65399 llvm-svn: 367447	2019-07-31 16:53:22 +00:00
Roman Lebedev	be612ea471	[InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two Summary: I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling. While we did happen to handle this fold in most of the cases, the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two), or first turn `x s% -y` to `x u% y`; that does handle most of the cases. But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`, and thus we end up being stuck with `(x s% INT_MIN) == 0`. There is no such restriction for the more general fold: https://rise4fun.com/Alive/IIeS To be noted, the fold does not enforce that `y` is a constant, so it may indeed increase instruction count. This is consistent with what `x u% y`->`x & (y-1)` already does. I think it makes sense, it's at most one (simple) extra instruction, while `rem`ainder is really much more un-simple (and likely very costly). Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65046 llvm-svn: 367322	2019-07-30 15:28:22 +00:00
Sanjay Patel	e9ee7b47d4	[InstCombine] fold fadd+fneg with fdiv/fmul betweena The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367227	2019-07-29 13:50:25 +00:00
Sanjay Patel	5483f4225e	[InstCombine] reduce code for fadd with fneg operand; NFC llvm-svn: 367224	2019-07-29 13:20:46 +00:00
Sanjay Patel	99c57c6daf	[InstCombine] fold fsub+fneg with fdiv/fmul between The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194	2019-07-28 17:10:06 +00:00
Sanjay Patel	a9ab31558c	[InstCombine] canonicalize negated operand of fdiv This is a transform that we use with fmul, so use it for fdiv too for consistency. llvm-svn: 367146	2019-07-26 19:56:59 +00:00
Sanjay Patel	c229cfeb7a	[InstCombine] remove flop from lerp patterns (Y * (1.0 - Z)) + (X * Z) --> Y - (Y * Z) + (X * Z) --> Y + Z * (X - Y) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=42716 Factoring eliminates an instruction, so that should be a good canonicalization. The potential conversion to FMA would be handled by the backend based on target capabilities. Differential Revision: https://reviews.llvm.org/D65305 llvm-svn: 367101	2019-07-26 11:19:18 +00:00
Vlad Tsyrklevich	5d5a58317c	Revert "[InstCombine] try to narrow a truncated load" This reverts commit `bc4a63fd3c`, this is a speculative revert to fix a number of sanitizer bots (like sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2 compiler crashes, presumably due to a miscompile. llvm-svn: 367029	2019-07-25 15:37:57 +00:00
Sanjay Patel	bc4a63fd3c	[InstCombine] try to narrow a truncated load trunc (load X) --> load (bitcast X to narrow type) We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated load pattern can interfere with other instcombine transforms, so I'd like to allow the fold sooner. Example: https://bugs.llvm.org/show_bug.cgi?id=16739 ...in that report, we have bitcasts bracketing these ops, so those could get eliminated too. We've generally ruled out widening of loads early in IR ( LoadCombine - http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but that reasoning may not apply to narrowing if we can preserve information such as the dereferenceable range. Differential Revision: https://reviews.llvm.org/D64432 llvm-svn: 367011	2019-07-25 12:14:27 +00:00
Sanjay Patel	86e9f9dc26	[Transforms] move copying of load metadata to helper function; NFC There's another proposed load combine that can make use of this code in D64432. llvm-svn: 366949	2019-07-24 22:11:11 +00:00
Craig Topper	e9abc8177a	[InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparision. Similar for foldAndOfICmps We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow it to merge with icmp ugt X, C. Similar for the other constants. We can do simliar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there. Fixes PR42691. Differential Revision: https://reviews.llvm.org/D65017 llvm-svn: 366945	2019-07-24 20:57:29 +00:00
Roman Lebedev	3a94765bfc	[NFC][PatternMatch] Refactor code into a proper "matcher for any integral constant" Having it as a proper matcher is better for reusability elsewhere (in a follow-up patch.) llvm-svn: 366752	2019-07-22 22:09:24 +00:00
Craig Topper	e6cd20ba53	[InstCombine] Update comment I missed in r366649. NFC llvm-svn: 366658	2019-07-21 16:15:03 +00:00
Craig Topper	1d149d08d3	[InstCombine] Remove insertRangeTest code that handles the equality case. For equality, the function called getTrue/getFalse with the VT of the comparison input. But getTrue/getFalse need the boolean VT. So if this code ever executed, it would assert. I believe these cases are removed by InstSimplify so we don't get here. So this patch just fixes up an assert to exclude the equality possibility and removes the broken code. llvm-svn: 366649	2019-07-21 06:43:38 +00:00
Craig Topper	8fabdfe9fc	[InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI AddOne/SubOne create new Constant objects. That seems heavy for comparing ConstantInts which wrap APInts. Just do the math on on the APInts and compare them. llvm-svn: 366648	2019-07-21 05:26:05 +00:00
Roman Lebedev	f2eb403144	[InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) Normally, the inner pattern is sign-extend, but for our purposes it's no different to other patterns: alive proofs: f: https://rise4fun.com/Alive/7U3 For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64524 llvm-svn: 366540	2019-07-19 08:26:58 +00:00
Roman Lebedev	441c9d6ca8	[InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: e: https://rise4fun.com/Alive/0FT For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64521 llvm-svn: 366539	2019-07-19 08:26:47 +00:00
Roman Lebedev	3c212ce305	[InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: d: https://rise4fun.com/Alive/I5Y For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64519 llvm-svn: 366538	2019-07-19 08:26:37 +00:00
Roman Lebedev	2ebe57386d	[InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: c: https://rise4fun.com/Alive/RgJh For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64517 llvm-svn: 366537	2019-07-19 08:26:25 +00:00
Roman Lebedev	4422a1657c	[InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: b. `(x & (~(-1 << maskNbits))) << shiftNbits` All these patterns can be simplified to just: `x << ShiftShAmt` iff: b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: b: https://rise4fun.com/Alive/y8M For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64514 llvm-svn: 366536	2019-07-19 08:26:13 +00:00
Roman Lebedev	a5f0824eb5	[InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: a. `(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: a. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: a: https://rise4fun.com/Alive/wi9 Indeed, not all of these patterns are canonical. But since this fold will only produce a single instruction i'm really interested in handling even uncanonical patterns, since i have this general kind of pattern in hotpaths, and it is not totally outlandish for bit-twiddling code. For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic, huihuiz, xbolva00 Reviewed By: xbolva00 Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64512 llvm-svn: 366535	2019-07-19 08:25:43 +00:00
Rui Ueyama	49a3ad21d6	Fix parameter name comments using clang-tidy. NFC. This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch: $ git clone https://github.com/llvm/llvm-project.git $ cd llvm-project $ mkdir build $ cd build $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \ -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \ -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm $ ninja $ parallel clang-tidy -checks='-,bugprone-argument-comment' \ -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \ ::: ../llvm/lib//.{cpp,h} ../clang/lib/*/.{cpp,h} ../lld/*/.{cpp,h} llvm-svn: 366177	2019-07-16 04:46:31 +00:00
David Bolvansky	9f0d718c66	[InstCombine] Disable fold from D64285 for non-integer types llvm-svn: 365959	2019-07-12 21:14:21 +00:00
David Bolvansky	af1b3185f5	[InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y)) Summary: (select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y)) -> ashr (X, Y)) (select (icmp slt x, 1), ashr (X, Y), lshr (X, Y)) -> ashr (X, Y)) Fixes PR41173 Alive proof by @lebedev.ri (thanks) Name: PR41173 %cmp = icmp slt i32 %x, 1 %shr = lshr i32 %x, %y %shr1 = ashr i32 %x, %y %retval.0 = select i1 %cmp, i32 %shr1, i32 %shr => %retval.0 = ashr i32 %x, %y Optimization: PR41173 Done: 1 Optimization is correct! Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: nikic, craig.topper, llvm-commits, lebedev.ri Tags: #llvm Differential Revision: https://reviews.llvm.org/D64285 llvm-svn: 365893	2019-07-12 11:31:16 +00:00
Sanjay Patel	3487791fea	[InstCombine] don't move FP negation out of a constant expression -(X * ConstExpr) becomes X * (-ConstExpr), so don't reverse that and infinite loop. llvm-svn: 365774	2019-07-11 13:44:29 +00:00
Roman Lebedev	c5f92bd67b	[PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::Predicate As discussed in the original review, this may be useful, so let's just do it. llvm-svn: 365652	2019-07-10 16:07:35 +00:00
Tim Northover	60afa49abe	OpaquePtr: add Type parameter to Loads analysis API. This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job. Most callers had an obvious memory operation handy to provide this type, but a SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering. llvm-svn: 365468	2019-07-09 11:35:35 +00:00
Sanjay Patel	3dee113ebc	[InstCombine] fold insertelement into splat of same scalar Forming the canonical splat shuffle improves analysis and may allow follow-on transforms (although some possibilities are missing as shown in the test diffs). The backend generically turns these patterns into build_vector, so there should be no codegen regressions. All targets are expected to be able to lower splats efficiently. llvm-svn: 365379	2019-07-08 19:48:52 +00:00
Sanjay Patel	0b59103a73	[InstCombine] canonicalize insert+splat to/from element 0 of vector We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue() and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts to that form for better matching. The backend generically turns these patterns into build_vector, so there should be no codegen difference. llvm-svn: 365342	2019-07-08 16:26:48 +00:00
Sanjay Patel	75b5edf6a1	[InstCombine] allow undef elements when forming splat from chain of insertelements We allow forming a splat (broadcast) shuffle, but we were conservatively limiting that to cases where all elements of the vector are specified. It should be safe from a codegen perspective to allow undefined lanes of the vector because the expansion of a splat shuffle would become the chain of inserts again. Forming splat shuffles can reduce IR and help enable further IR transforms. Motivating bugs: https://bugs.llvm.org/show_bug.cgi?id=42174 https://bugs.llvm.org/show_bug.cgi?id=16739 Differential Revision: https://reviews.llvm.org/D63848 llvm-svn: 365147	2019-07-04 16:45:34 +00:00
Roman Lebedev	9f0c83902d	[InstCombine] Y - ~X --> X + Y + 1 fold (PR42457) Summary: I think we'd want this new variant, because we obviously have better handling for `add` as compared to `sub`/`not`. https://rise4fun.com/Alive/WMn Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]] Reviewers: spatel, nikic, huihuiz, efriedma Reviewed By: spatel Subscribers: RKSimon, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63992 llvm-svn: 365011	2019-07-03 09:41:50 +00:00
Roman Lebedev	0bde7c6527	[InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484) I was actually wondering if there was some nicer way than m_Value()+cast, but apparently what i was really "subconsciously" thinking about was correctness issue. hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction, not for BinaryOperator, so let's just use m_Instruction(), thus both avoiding a cast, and a crash. Fixes https://bugs.llvm.org/show_bug.cgi?id=42484, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587 llvm-svn: 364915	2019-07-02 12:54:48 +00:00
Sanjay Patel	ddc1b40f26	[InstCombine] reduce more checks for power-of-2-or-zero using ctpop Extends the transform from: rL364341 ...to include another (more common?) pattern that tests whether a value is a power-of-2 (including or excluding zero). llvm-svn: 364856	2019-07-01 22:00:00 +00:00
Roman Lebedev	04d3d3bbff	[InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459) Summary: To be noted, this pattern is not unhandled by instcombine per-se, it is somehow does end up being folded when one runs opt -O3, but not if it's just -instcombine. Regardless, that fold is indirect, depends on some other folds, and is thus blind when there are extra uses. This does address the regression being exposed in D63992. https://godbolt.org/z/7DGltU https://rise4fun.com/Alive/EPO0 Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 \| PR42459 ]] Reviewers: spatel, nikic, huihuiz Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63993 llvm-svn: 364792	2019-07-01 15:55:24 +00:00
Roman Lebedev	72b8d41ce8	[InstCombine] Shift amount reassociation in bittest (PR42399) Summary: Given pattern: `icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0` we should move shifts to the same hand of 'and', i.e. rewrite as `icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)` It might be tempting to not restrict this to situations where we know we'd fold two shifts together, but i'm not sure what rules should there be to avoid endless combine loops. We pick the same shift that was originally used to shift the variable we picked to shift: https://rise4fun.com/Alive/6x1v Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 \| PR42399]]. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63829 llvm-svn: 364791	2019-07-01 15:55:15 +00:00
Roman Lebedev	f55818e3a7	[InstCombine] Omit 'urem' where possible This was added in D63390 / rL364286 to backend, but it makes sense to also handle it in middle-end. https://rise4fun.com/Alive/Zsln llvm-svn: 364738	2019-07-01 09:41:43 +00:00
Sanjay Patel	706b48251f	[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics This is the opposite direction of D62158 (we have to choose 1 form or the other). Now that we have FMF on the select, this becomes more palatable. And the benefits of having a single IR instruction for this operation (less chances of missing folds based on extra uses, etc) overcome my previous comments about the potential advantage of larger pattern matching/analysis. Differential Revision: https://reviews.llvm.org/D62414 llvm-svn: 364721	2019-06-30 13:40:31 +00:00
Roman Lebedev	e3a94ba4a9	[InstCombine] Shift amount reassociation (PR42391) Summary: Given pattern: `(x shiftopcode Q) shiftopcode K` we should rewrite it as `x shiftopcode (Q+K)` iff `(Q+K) u< bitwidth(x)` This is valid for any shift, but they must be identical. * https://rise4fun.com/Alive/9E2 * exact on both lshr => exact https://rise4fun.com/Alive/plHk * exact on both ashr => exact https://rise4fun.com/Alive/QDAA * nuw on both shl => nuw https://rise4fun.com/Alive/5Uk * nsw on both shl => nsw https://rise4fun.com/Alive/0plg Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 \| PR42391]]. Reviewers: spatel, nikic, RKSimon Reviewed By: nikic Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63812 llvm-svn: 364712	2019-06-29 11:51:50 +00:00
Sanjay Patel	71ad22707c	[InstCombine] simplify code for inserts -> splat; NFC llvm-svn: 364441	2019-06-26 15:52:59 +00:00
Huihui Zhang	b90cb57b63	[InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0. Simplify 'shl' inequality test into 'and' equality test. This pattern happens in the middle-end while simplifying bitfield access, Exposed in https://reviews.llvm.org/D63505 https://rise4fun.com/Alive/6uz Reviewers: lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: spatel, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63675 llvm-svn: 364348	2019-06-25 20:44:52 +00:00
Sanjay Patel	fcfa056ceb	[InstCombine] reduce checks for power-of-2-or-zero using ctpop This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero. This is matching the isPowerOf2() patterns used in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 But there's at least 1 instcombine follow-up needed to match the alternate form: (v & (v - 1)) == 0; We should have all of the backend expansions handled with: rL364319 (x86-specific changes still needed for optimal code based on subtarget) And the larger patterns to exclude zero as a power-of-2 are joining with this change after: rL364153 ( D63660 ) rL364246 Differential Revision: https://reviews.llvm.org/D63777 llvm-svn: 364341	2019-06-25 18:51:44 +00:00
Huihui Zhang	4626613ffe	[InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier. Summary: To generate simplified IR, make sure fold (X & ~C) ==/!= 0 --> X u</u>= C+1 is scheduled before fold ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. https://rise4fun.com/Alive/7ZN Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63505 llvm-svn: 364255	2019-06-25 00:09:10 +00:00
Sanjay Patel	2675b0c8ab	[InstCombine] squash is-not-power-of-2 using ctpop This is the Demorgan'd 'not' of the pattern handled in: D63660 / rL364153 This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is not a power-of-2 using ctpop(X) > 1, so combining that with an is-zero check of the input is the same as testing if not exactly 1 bit is set: (X == 0) \|\| (ctpop(X) u> 1) --> ctpop(X) != 1 llvm-svn: 364246	2019-06-24 22:35:26 +00:00
Matt Arsenault	8025842599	InstCombine: Preserve nuw when reassociating nuw ops [3/3] Alive says this is OK. llvm-svn: 364235	2019-06-24 21:37:03 +00:00
Matt Arsenault	5d82ecd5d9	InstCombine: Preserve nuw when reassociating nuw ops [2/3] Alive says this is OK. llvm-svn: 364234	2019-06-24 21:37:02 +00:00
Matt Arsenault	5a89ba7343	InstCombine: Preserve nuw when reassociating nuw ops [1/3] Alive says this is OK. llvm-svn: 364233	2019-06-24 21:36:59 +00:00
Sanjay Patel	89efefb170	[InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X Prefer the more exact intrinsic to remove a use of the input value and possibly make further transforms easier (we will still need to match patterns with funnel-shift of wider types as pieces of bswap, especially if we want to canonicalize to funnel-shift with constant shift amount). Discussed in D46760. llvm-svn: 364187	2019-06-24 15:20:49 +00:00
Simon Pilgrim	b617b0808d	[InstCombine] SliceUpIllegalIntegerPHI - bail on out of range shifts trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts. Fixes OSS Fuzz #15217 llvm-svn: 364181	2019-06-24 13:13:36 +00:00
Sanjay Patel	13a5ae58fc	[InstCombine] squash is-power-of-2 that uses ctpop This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is power-of-2-or-0 using ctpop(X) < 2, so combining that with a non-zero check of the input is the same as testing if exactly 1 bit is set: (X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1 Differential Revision: https://reviews.llvm.org/D63660 llvm-svn: 364153	2019-06-23 14:22:37 +00:00
David Bolvansky	dbcdad51ff	[InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1 Summary: ``` %a = sub i32 31, %x %r = shl i32 1, %a => %d = shl i32 1, 31 %r = lshr i32 %d, %x Done: 1 Optimization is correct! ``` https://rise4fun.com/Alive/btZm Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63652 llvm-svn: 364073	2019-06-21 16:25:32 +00:00
David Bolvansky	4b28478389	[InstCombine] cttz(abs(x)) -> cttz(x) Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D63546 llvm-svn: 364064	2019-06-21 15:26:22 +00:00
Sanjay Patel	273d97e6bf	[InstCombine] fix typo in comment; NFC llvm-svn: 363974	2019-06-20 20:23:32 +00:00
Sanjay Patel	63311bfb83	[InstCombine] canonicalize check for power-of-2 The form that compares against 0 is better because: 1. It removes a use of the input value. 2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2 3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS). This is a root cause for PR42314, but probably doesn't completely answer the codegen request: https://bugs.llvm.org/show_bug.cgi?id=42314 Alive proof: https://rise4fun.com/Alive/9kG Name: is power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp eq i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp eq i32 %a2, 0 Name: is not power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp ne i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp ne i32 %a2, 0 llvm-svn: 363956	2019-06-20 17:41:15 +00:00
David Bolvansky	01511192b2	[InstCombine] cttz(-x) -> cttz(x) Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63534 llvm-svn: 363951	2019-06-20 17:04:14 +00:00
Huihui Zhang	670778c762	[InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier Summary: To generate simplified IR, make sure fold ``` (X & signbit) ==/!= 0) -> X s>=/s< 0; ``` is scheduled before fold ``` ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. ``` https://rise4fun.com/Alive/fbdh Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63026 llvm-svn: 363845	2019-06-19 17:31:39 +00:00
Matt Arsenault	6d741f29ec	AMDGPU: Fold readlane/readfirstlane calls llvm-svn: 363587	2019-06-17 17:52:35 +00:00
Matt Arsenault	492d71cc99	AMDGPU: Fold readlane intrinsics of constants I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406	2019-06-14 14:51:26 +00:00
Stanislav Mekhanoshin	68a2fef9ae	[AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32 Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339	2019-06-13 23:47:36 +00:00
Sander de Smalen	7957fc6547	[IntrinsicEmitter] Extend argument overloading with forward references. Extend the mechanism to overload intrinsic arguments by using either backward or forward references to the overloadable arguments. In for example: def int_something : Intrinsic<[LLVMPointerToElt<0>], [llvm_anyvector_ty], []>; LLVMPointerToElt<0> is a forward reference to the overloadable operand of type 'llvm_anyvector_ty' and would allow intrinsics such as: declare i32* @llvm.something.v4i32(<4 x i32>); declare i64* @llvm.something.v2i64(<2 x i64>); where the result pointer type is deduced from the element type of the first argument. If the returned pointer is not a pointer to the element type, LLVM will give an error: Intrinsic has incorrect return type! i64* (<4 x i32>)* @llvm.something.v4i32 Reviewers: RKSimon, arsenm, rnk, greened Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62995 llvm-svn: 363233	2019-06-13 08:19:33 +00:00
Cameron McInally	08200d6d26	[InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082	2019-06-11 16:21:21 +00:00
Cameron McInally	796de11331	[InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080	2019-06-11 15:45:41 +00:00
Sanjay Patel	9650c95b7e	[InstCombine] allow unordered preds when canonicalizing to fabs() We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954	2019-06-10 15:39:00 +00:00
Sanjay Patel	85de9634e6	[InstCombine] fix bug in canonicalization to fabs() Forgot to translate the predicate clauses in rL362943. llvm-svn: 362945	2019-06-10 14:57:45 +00:00
Sanjay Patel	8b6d9f60ed	[InstCombine] change canonicalization to fabs() to use FMF on fsub Similar to rL362909: This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fsub because they have the same operand. llvm-svn: 362943	2019-06-10 14:46:36 +00:00
Sanjay Patel	8cd8c5784b	[InstCombine] allow unordered preds when canonicalizing to fabs() PR42179: https://bugs.llvm.org/show_bug.cgi?id=42179 llvm-svn: 362937	2019-06-10 14:14:51 +00:00
Roman Lebedev	d669758d84	[InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles A precondition 'x != 0' was forgotten by me: https://rise4fun.com/Alive/JFNP https://rise4fun.com/Alive/jHvL These 4 folds with non-constants could be re-enabled, but for now let's go for the simplest solution. https://bugs.llvm.org/show_bug.cgi?id=42198 llvm-svn: 362911	2019-06-09 16:30:42 +00:00
Sanjay Patel	87cd16a86e	[InstCombine] change canonicalization to fabs() to use FMF on fneg This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fneg (fsub) because they have the same operand. This works around the most glaring FMF logical inconsistency cited in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 llvm-svn: 362909	2019-06-09 16:22:01 +00:00
Sanjay Patel	ac111e526d	[InstCombine] simplify code for bitcast of insertelement; NFC llvm-svn: 362655	2019-06-05 21:26:52 +00:00
Tim Northover	8d7f118ab2	InstCombine: correctly change byval type attribute alongside call args. When the byval attribute has a type, it must match the pointee type of any parameter; but InstCombine was not updating the attribute when folding casts of various kinds away. llvm-svn: 362643	2019-06-05 20:38:17 +00:00
Roman Lebedev	39390d8317	[InstCombine] 'C-(C2-X) --> X+(C-C2)' constant-fold It looks this fold was already partially happening, indirectly via some other folds, but with one-use limitation. No other fold here has that restriction. https://rise4fun.com/Alive/ftR llvm-svn: 362217	2019-05-31 09:47:16 +00:00
Roman Lebedev	886c4ef35a	[InstCombine] 'add (sub C1, X), C2 --> sub (add C1, C2), X' constant-fold https://rise4fun.com/Alive/qJQ llvm-svn: 362216	2019-05-31 09:47:04 +00:00
Martin Storsjo	0fe645c086	[InstCombine] Avoid use after free in DenseMap, when built with GCC Previously, this used a statement like this: Map[A] = Map[B]; This is equivalent to the following: const auto &Src = Map[B]; auto &Dest = Map[A]; Dest = Src; The second statement, "auto &Dest = Map[A];" can insert a new element into the DenseMap, which can potentially grow and reallocate the DenseMap's internal storage, which will invalidate the existing reference to the source. When doing the actual assignment, the Src reference is dereferenced, accessing memory that was freed when the DenseMap grew. This issue hasn't shown up when LLVM was built with Clang, because the right hand side ended up dereferenced before evaulating the left hand side. (If the value type is a larger data type, Clang doesn't do this but behaves like GCC.) With GCC, a cast to Value* isn't enough to make it dereference the right hand side reference before invoking operator[] (while that is enough to make Clang/LLVM do the right thing for larger types), but storing it in an intermediate variable in a separate statement works. This fixes PR42065. Differential Revision: https://reviews.llvm.org/D62624 llvm-svn: 362150	2019-05-30 20:53:21 +00:00
Nikita Popov	5382803b04	[InstCombine] Optimize always overflowing signed saturating add/sub Based on the overflow direction information added in D62463, we can now fold always overflowing signed saturating add/sub to signed min/max. Differential Revision: https://reviews.llvm.org/D62544 llvm-svn: 362006	2019-05-29 18:37:13 +00:00
Nikita Popov	c51cdacab9	[InstCombine] Clean up saturing math overflow optimizations; NFC Reduce duplication and make it easier to handle signed always-overflows conditions in the future. llvm-svn: 361863	2019-05-28 18:59:21 +00:00
Nikita Popov	332c100562	[ValueTracking][ConstantRange] Distinguish low/high always overflow In order to fold an always overflowing signed saturating add/sub, we need to know in which direction the always overflow occurs. This patch splits up AlwaysOverflows into AlwaysOverflowsLow and AlwaysOverflowsHigh to pass through this information (but it is not used yet). Differential Revision: https://reviews.llvm.org/D62463 llvm-svn: 361858	2019-05-28 18:08:31 +00:00

1 2 3 4 5 ...

3436 Commits