llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	bbf3925879	[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw (REAPPLIED) If value tracking can confirm that a shift value is less than the type bitwidth then we can more confidently fold general or(shl(a,x),lshr(b,sub(bw,x))) patterns to a funnel/rotate intrinsic pattern without causing bad codegen regressions in the backend (see D89139). Reapplied after the shift canonicalization in rG02295e6d1a15 which removed the need to flip the shift values. Differential Revision: https://reviews.llvm.org/D88783	2020-10-12 16:06:41 +01:00
Simon Pilgrim	45d785e22b	Revert rGb97093e520036f8 - "[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw" This reverts commit `b97093e520`. Funnel shift argument commutation isn't working correctly	2020-10-12 11:38:52 +01:00
Roman Lebedev	544a6aa267	[InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load And another step towards transforms not introducing inttoptr and/or ptrtoint casts that weren't there already. As we've been establishing (see D88788/D88789), if there is a int<->ptr cast, it basically must stay as-is, we can't do much with it. I've looked, and the most source of new such casts being introduces, as far as i can tell, is this transform, which, ironically, tries to reduce count of casts.. On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in -33.58% less `IntToPtr`s (19014 -> 12629) and +76.20% more `PtrToInt`s (18589 -> 32753), which is an increase of +20.69% in total. However just on RawSpeed, where i know there are basically none `IntToPtr` in the original source code, this results in -99.27% less `IntToPtr`s (2724 -> 20) and +82.92% more `PtrToInt`s (4513 -> 8255). which is again an increase of 14.34% in total. To me this does seem like the step in the right direction, we end up with strictly less `IntToPtr`, but strictly more `PtrToInt`, which seems like a reasonable trade-off. See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995 for some more discussion on the subject. (Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair` should be taught about this, yes) Reviewed By: nlopes, nikic Differential Revision: https://reviews.llvm.org/D88979	2020-10-11 20:24:28 +03:00
Sanjay Patel	3f3356bdd9	[InstCombine] allow vector splats for add+xor --> shifts	2020-10-11 09:04:24 -04:00
Sanjay Patel	f81200ae99	[InstCombine] add one-use check to add+xor transform As shown in the affected test, we could increase instruction count without this limitation. There's another test with extra use that shows we still convert directly to a real "sext" if possible.	2020-10-11 09:04:24 -04:00
Sanjay Patel	85c7653d92	[InstCombine] add tests with extra uses for add+xor transform; NFC	2020-10-11 09:04:24 -04:00
Sanjay Patel	c5138e61e1	[InstCombine] add/adjust tests for add+xor -> shifts; NFC	2020-10-11 09:04:24 -04:00
Simon Pilgrim	b97093e520	[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw If value tracking can confirm that a shift value is less than the type bitwidth then we can more confidently fold general or(shl(a,x),lshr(b,sub(bw,x))) patterns to a funnel/rotate intrinsic pattern without causing bad codegen regressions in the backend (see D89139). Differential Revision: https://reviews.llvm.org/D88783	2020-10-11 10:37:20 +01:00
Simon Pilgrim	f68d174c16	Remove %tmp variables from test cases to appease update_test_checks.py	2020-10-10 19:13:16 +01:00
Simon Pilgrim	803b712330	[InstCombine] Add test case showing rotate intrinsic being split by SimplifyDemandedBits Noticed while triaging regression report on D88834	2020-10-10 18:19:42 +01:00
Simon Pilgrim	8a836daaa9	[InstCombine] Support lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) uniform vector tests FoldShiftByConstant is hardcoded for scalar/uniform outer shift amounts atm so that needs to be fixed first to support non-uniform cases	2020-10-09 16:54:46 +01:00
Simon Pilgrim	af1f016436	[InstCombine] Add lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) vector tests	2020-10-09 16:54:46 +01:00
Simon Pilgrim	1c040a3e56	[InstCombine] commonShiftTransforms - add support for pow2 nonuniform constant vectors in srem fold Note: we already fold srem to undef if any denominator vector element is undef.	2020-10-09 15:59:33 +01:00
Sanjay Patel	080e6bc205	[InstCombine] allow vector splats for add+and with high-mask There might be a better way to specify the pre-conditions, but this is hopefully clearer than the way it was written: https://rise4fun.com/Alive/Jhk3 Pre: C2 < 0 && isShiftedMask(C2) && (C1 == C1 & C2) %a = and %x, C2 %r = add %a, C1 => %a2 = add %x, C1 %r = and %a2, C2	2020-10-09 10:39:11 -04:00
Simon Pilgrim	ccf1260792	[InstCombine] Add tests for X shift (A srem B) -> X shift (A and B-1) pow2 nonuniform constant vectors	2020-10-09 15:33:06 +01:00
Simon Pilgrim	9e796d5e71	[InstCombine] foldShiftOfShiftedLogic - add support for nonuniform constant vectors	2020-10-09 14:25:12 +01:00
Simon Pilgrim	2efcb6438a	[InstCombine] Add nonuniform/undef vector tests for shift(binop(shift(x,c1),y),c2) patterns	2020-10-09 13:42:11 +01:00
Simon Pilgrim	d9f064dc0b	[InstCombine] visitTrunc - trunc(shl(X, C)) --> shl(trunc(X),trunc(C)) vector support Annoyingly vectors aren't supported by shouldChangeType(), but we have precedents for always performing this on vector types (e.g. narrowBinOp). Differential Revision: https://reviews.llvm.org/D89067	2020-10-08 22:07:51 +01:00
Simon Pilgrim	e1b5fcb942	[InstCombine] Add additional trunc(shl(x,c)) -> shl(trunc(x),trunc(c)) vector tests	2020-10-08 21:11:48 +01:00
Sanjay Patel	f688ae7a0e	[InstCombine] allow vector splats for add+xor with low-mask This can be allowed with undef elements too, but that can be another step: https://alive2.llvm.org/ce/z/hnC4Z-	2020-10-08 15:53:38 -04:00
Sanjay Patel	5ac89add1e	[InstCombine] remove unnecessary one-use check from add-xor transform Pre-conditions seem to be optimal, but we don't need a use check because we are only replacing an add with a sub. https://rise4fun.com/Alive/hzN Pre: (~C1 \| C2 == -1) && isPowerOf2(C2+1) %m = and i8 %x, C1 %f = xor i8 %m, C2 %r = add i8 %f, C3 => %r = sub i8 C2 + C3, %m	2020-10-08 15:08:51 -04:00
Sanjay Patel	a52159a1c3	[InstCombine] add tests for add-xor; NFC	2020-10-08 15:08:51 -04:00
Sanjay Patel	b57451b011	[InstCombine] allow vector splats for add+xor with signmask	2020-10-08 10:46:34 -04:00
Sanjay Patel	395963cbe6	[InstCombine] add vector splat tests for add of signmask; NFC	2020-10-08 10:46:33 -04:00
Simon Pilgrim	5415fef3ab	[InstCombine] matchFunnelShift - support non-uniform constant vector shift amounts (PR46895) Complete basic PR46895 fixes by refactoring D87452/D88402 to allow us to match non-uniform constant values. We still don't handle non-uniform vectors that contain undef elements, but that can wait until we have a decent generic mechanism for this. Differential Revision: https://reviews.llvm.org/D88420	2020-10-08 12:56:27 +01:00
Simon Pilgrim	e1d4ca0009	[InstCombine] matchRotate - add support for matching general funnel shifts with constant shift amounts (PR46896) First step towards extending the existing rotation support to full funnel shift handling now that the backend legalization support has improved. This enables us to match the shift by constant cases, which are pretty trivial to expand again if necessary. D88420 will add non-uniform support for funnel shifts as well once its been finalized. Differential Revision: https://reviews.llvm.org/D88834	2020-10-08 11:05:14 +01:00
Simon Pilgrim	fe0197e194	[InstCombine] Add checks for and(logicalshift(zext(x),undef),y) cases Prep work before some cleanup in narrowMaskedBinOp	2020-10-07 20:59:31 +01:00
Amara Emerson	322d0afd87	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
Roman Lebedev	bef27e50b9	[NFC][InstCombine] Autogenerate a few tests being affected by upcoming patch	2020-10-07 19:00:08 +03:00
Philip Reames	14d5ee63e3	[Tests] Precommit test showing gap around load forwarding of vectors in instcombine	2020-10-07 08:57:24 -07:00
Roman Lebedev	fed0f890e5	InstCombine: Negator: don't rely on complexity sorting already being performed (PR47752) In some cases, we can negate instruction if only one of it's operands negates. Previously, we assumed that constants would have been canonicalized to RHS already, but that isn't guaranteed to happen, because of InstCombine worklist visitation order, as the added test (previously-hanging) shows. So if we only need to negate a single operand, we should ensure ourselves that we try constant operand first. Do that by re-doing the complexity sorting ourselves, when we actually care about it. Fixes https://bugs.llvm.org/show_bug.cgi?id=47752	2020-10-07 15:09:50 +03:00
Simon Pilgrim	dce03e3059	[InstCombine] Tweak funnel by constant tests for better shl/lshr commutation coverage	2020-10-07 11:47:03 +01:00
Dávid Bolvanský	86429c4eaf	[SimplifyLibCalls] Optimize mempcpy_chk to mempcpy	2020-10-06 17:08:46 +02:00
Arthur Eubanks	8df17b4dc1	[test][InstCombine][NewPM] Fix InstCombine tests under NPM Some of these depended on analyses being present that aren't provided automatically in NPM. early_dce_clobbers_callgraph.ll was previously inlining a noinline function? cast-call-combine.ll relied on the legacy always-inline pass being a CGSCC pass and getting rerun. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D88187	2020-10-06 07:39:00 -07:00
Roman Lebedev	e00f189d39	[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592) (it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html) This canonicalization seems dubious. Most importantly, while it does not create `inttoptr` casts by itself, it may cause them to appear later, see e.g. D88788. I think it's pretty obvious that it is an undesirable outcome, by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts are not no-op, and are no longer eager to look past them. Which e.g. means that given ``` %a = load i32 %b = inttoptr %a %c = inttoptr %a ``` we likely won't be able to tell that `%b` and `%c` is the same thing. As we can see in D88789 / D88788 / D88806 / D75505, we can't really teach SCEV about this (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at least) And we can't recover the situation post-inlining in instcombine. So it really does look like this fold is actively breaking otherwise-good IR, in a way that is not recoverable. And that means, this fold isn't helpful in exposing the passes that are otherwise unaware of these patterns it produces. Thusly, i propose to simply not perform such a canonicalization. The original motivational RFC does not state what larger problem that canonicalization was trying to solve, so i'm not sure how this plays out in the larger picture. On vanilla llvm test-suite + RawSpeed, this results in increase of asm instructions and final object size by ~+0.05% decreases final count of bitcasts by -4.79% (-28990), ptrtoint casts by -15.41% (-3423), and of inttoptr casts by -25.59% (-6919, sic). Overall, there's -0.04% less IR blocks, -0.39% instructions. See https://bugs.llvm.org/show_bug.cgi?id=47592 Differential Revision: https://reviews.llvm.org/D88789	2020-10-06 00:00:30 +03:00
Dávid Bolvanský	a4bae56ab8	Revert "[SLC] Optimize mempcpy_chk to mempcpy" This reverts commit `3f1fd59de3`.	2020-10-05 22:27:14 +02:00
Dávid Bolvanský	3f1fd59de3	[SLC] Optimize mempcpy_chk to mempcpy As reported in PR46735: void* f(void d, const void s, size_t l) { return __builtin___mempcpy_chk(d, s, l, __builtin_object_size(d, 0)); } This can be optimized to `return mempcpy(d, s, l);`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86019	2020-10-05 22:18:36 +02:00
Nikita Popov	3641d375f6	[InstCombine] Handle GEP inbounds in select op replacement (PR47730) When retrying the "simplify with operand replaced" select optimization without poison flags, also handle inbounds on GEPs. Of course, this particular example would also be safe to transform while keeping inbounds, but the underlying machinery does not know this (yet).	2020-10-05 21:13:02 +02:00
Nikita Popov	0f8e4a5ed0	[InstCombine] Add test for PR47730	2020-10-05 21:09:53 +02:00
Simon Pilgrim	5ba084c42f	[InstCombine] Extend 'shift with constants' vector tests Added missing test coverage for shl(add(and(lshr(x,c1),c2),y),c1) -> add(and(x,c2<<c1),shl(y,c1)) combine Rename tests as 'foo' and 'bar' isn't very extensible Added vector tests with undefs and nonuniform constants	2020-10-05 17:22:15 +01:00
Simon Pilgrim	2efd9fd699	[InstCombine] Add or(shl(v,and(x,bw-1)),lshr(v,bw-and(x,bw-1))) funnel shift tests If we know the shift amount is less than the bitwidth we should be able to convert this to a funnel shift	2020-10-05 17:22:14 +01:00
Simon Pilgrim	2cd7b0e130	[ValueTracking] canCreateUndefOrPoison - use APInt to check bounds instead of getZExtValue(). Fixes OSS Fuzz #26135	2020-10-05 13:45:27 +01:00
Roman Lebedev	cd20c26622	[NFC][InstCombine] Autogenerate a few tests being affected by an upcoming patch	2020-10-03 22:49:58 +03:00
Simon Pilgrim	53fc426088	[InstCombine] Add tests for or(shl(x,c1),lshr(y,c2)) patterns that could fold to funnel shifts Some initial test coverage toward fixing PR46896 - these are just copied from rotate.ll	2020-10-03 18:32:47 +01:00
Simon Pilgrim	b82a7486d1	[InstCombine] Add or(shl(v,and(x,bw-1)),lshr(v,bw-and(x,bw-1))) rotate tests If we know the shift amount is less than the bitwidth we should be able to convert this to a rotate/funnel shift	2020-10-03 17:17:42 +01:00
Simon Pilgrim	aacfe2be53	[InstCombine] recognizeBSwapOrBitReverseIdiom - add vector support Add basic vector handling to recognizeBSwapOrBitReverseIdiom/collectBitParts - this works at the element level, all vector element operations must match (splat constants etc.) and there is no cross-element support (insert/extract/shuffle etc.).	2020-10-03 16:26:46 +01:00
Simon Pilgrim	3aa93f690b	[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied) If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern. Reapplied with early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type. Differential Revision: https://reviews.llvm.org/D88578	2020-10-03 14:52:42 +01:00
Simon Pilgrim	0364721e3e	Revert rG3d14a1e982ad27 - "[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)" This reverts commit `3d14a1e982`. This is breaking on some 2stage clang buildbots	2020-10-02 18:17:14 +01:00
Simon Pilgrim	d0dd7cadbd	[InstCombine] Add trunc(bswap(trunc/zext(x))) vector tests	2020-10-02 18:05:16 +01:00
Simon Pilgrim	3d14a1e982	[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern. Differential Revision: https://reviews.llvm.org/D88578	2020-10-02 17:25:12 +01:00
Simon Pilgrim	53fb9d062b	[InstCombine] Add partial bswap vector test from D88578	2020-10-02 13:19:19 +01:00
Simon Pilgrim	ec07ae2a83	[InstCombine] Add some basic vector bswap tests We get the vNi16 cases already via matching as a rotate followed by the fshl -> bswap combines	2020-10-02 11:08:12 +01:00
Simon Pilgrim	670e60c023	[InstCombine] Add partial bswap test from D88578	2020-10-02 10:34:30 +01:00
Nikita Popov	9d1c8c0ba9	[InstCombine] Fix select operand simplification with undef (PR47696) When replacing X == Y ? f(X) : Z with X == Y ? f(Y) : Z, make sure that Y cannot be undef. If it may be undef, we might end up picking a different value for undef in the comparison and the select operand.	2020-10-01 21:15:48 +02:00
Sanjay Patel	114e964dce	[InstCombine] auto-generate complete test checks; NFC	2020-10-01 13:44:31 -04:00
Simon Pilgrim	f425418fc4	[InstCombine] Add tests for 'partial' bswap patterns As mentioned on PR47191, if we're bswap'ing some bytes and the zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.	2020-09-30 16:09:09 +01:00
Simon Pilgrim	323d08e50a	[InstCombine] Fix bswap(trunc(bswap(x))) -> trunc(lshr(x, c)) vector support Use getScalarSizeInBits not getPrimitiveSizeInBits to determine the shift value at the element level.	2020-09-30 16:01:08 +01:00
Simon Pilgrim	b85de2c69c	[InstCombine] Add bswap(trunc(bswap(x))) -> trunc(lshr(x, c)) vector tests Add tests showing failure to correctly fold vector bswap(trunc(bswap(x))) intrinsic patterns	2020-09-30 16:01:08 +01:00
Simon Pilgrim	2ef73025af	[InstCombine] Remove %tmp variable names from bswap-fold tests Appease update_test_checks script that was complaining about potential %TMP clashes	2020-09-30 15:38:04 +01:00
Simon Pilgrim	7fcad5583a	[InstCombine] Remove %tmp variable names from bswap tests Appease update_test_checks script that was complaining about potential %TMP clashes	2020-09-30 14:44:15 +01:00
Simon Pilgrim	08c5720405	[InstCombine] Add PR47191 bswap tests	2020-09-30 14:19:18 +01:00
Simon Pilgrim	af47d40b9c	[InstCombine] recognizeBSwapOrBitReverseIdiom - recognise zext(bswap(trunc(x))) patterns (PR39793) PR39793 demonstrated an issue where we fail to recognize 'partial' bswap patterns of the lower bytes of an integer source. In fact, most of this is already in place collectBitParts suitably tags zero bits, so we just need to correctly handle this case by finding the zero'd upper bits and reducing the bswap pattern just to the active demanded bits. Differential Revision: https://reviews.llvm.org/D88316	2020-09-30 12:07:19 +01:00
Sanjay Patel	0527c8749b	[InstCombine] ease alignment restriction for converting masked load to normal load I think we initially made this fold conservative to be safer, but we do not need the alignment attribute/metadata limitation because the masked load intrinsic itself specifies the alignment. A normal vector load is better for IR transforms and should be no worse in codegen than the masked alternative. If it is worse for some target, the backend can reverse this transform. Differential Revision: https://reviews.llvm.org/D88505	2020-09-29 15:26:22 -04:00
Sanjay Patel	5409e4831f	[InstCombine] adjust duplicate test for masked load; NFC The test after the changed test was checking exactly the same dereferenceable bytes.	2020-09-29 13:31:10 -04:00
Simon Pilgrim	0cf48a7065	[InstCombine] visitTrunc - trunc (shr (trunc A), C) --> trunc(shr A, C) Attempt to fold trunc (shr (trunc A), C) --> trunc(shr A, C) iff the shift amount if small enough that all zero/sign bits created by the shift are removed by the last trunc. Helps fix the regressions encountered in D88316. I've tweaked a couple of shift values as suggested by @lebedev.ri to ensure we have coverage of shift values close (above/below) to the max limit. Differential Revision: https://reviews.llvm.org/D88429	2020-09-29 18:27:42 +01:00
Sanjay Patel	388b068956	[InstCombine] fix weird formatting in test file; NFC It apparently didn't cause trouble for the parser or FileCheck, but it was confusing to see a function def split by asserts.	2020-09-29 13:22:25 -04:00
Simon Pilgrim	e5f047f27e	[InstCombine] Fix the outofrange tests and add exact shift tests for D88429	2020-09-29 17:15:16 +01:00
Sanjay Patel	ee34d9b210	[InstCombine] use redirect of input file in regression tests; NFC This is a repeat of `1880092722` from 2009. We should have less risk of hitting bugs at this point because we auto-generate positive CHECK lines only, but this makes things consistent. Copying the original commit msg: "Change tests from "opt %s" to "opt < %s" so that opt doesn't see the input filename so that opt doesn't print the input filename in the output so that grep lines in the tests don't unintentionally match strings in the input filename."	2020-09-29 11:06:25 -04:00
Simon Pilgrim	7a55989dc4	[InstCombine] Add some basic trunc(lshr(zext(x),c)) tests Copied from the sext equivalents	2020-09-29 15:49:57 +01:00
Simon Pilgrim	89a8a0c910	[InstCombine] Inherit exact flags on extended shifts in trunc (lshr (sext A), C) --> (ashr A, C) This was missed in D88475	2020-09-29 15:32:09 +01:00
Simon Pilgrim	042f22bda5	[InstCombine] Add exact shift tests missed in D88475 I missed the post-LGTM comment from @lebedev.ri	2020-09-29 15:24:59 +01:00
Simon Pilgrim	14ff38e235	[InstCombine] visitTrunc - trunc (lshr (sext A), C) --> (ashr A, C) non-uniform support This came from @lebedev.ri's suggestion to use m_SpecificInt_ICMP for D88429 - since I was going to change the m_APInt to m_Constant for that patch I thought I would do it for the only other user of the APInt first. I've added a ConstantExpr::getUMin helper - its trivial to add UMAX/SMIN/SMAX but thought I'd wait until we have use cases. Differential Revision: https://reviews.llvm.org/D88475	2020-09-29 15:01:16 +01:00
Simon Pilgrim	324df2661b	[InstCombine] Add trunc(lshr(sext(x),c)) non-uniform vector tests	2020-09-29 10:56:15 +01:00
Serguei Katkov	297ec61130	[IsKnownNonZero] Handle the case with non-constant phi nodes Handle the case when all inputs of phi are proven to be non zero. Constants are checked in beginning of this method before check for depth of recursion, so it is a partial case of non-constant phi. Recursion depth is already handled by the function. Reviewers: aqjune, nikic, efriedma Reviewed By: nikic Subscribers: dantrushin, hiraditya, jdoerfert, llvm-commits Differential Revision: https://reviews.llvm.org/D88276	2020-09-29 15:22:10 +07:00
Simon Pilgrim	2f768a68a1	[InstCombine] Regenerate cast tests. NFC.	2020-09-28 21:32:12 +01:00
Simon Pilgrim	d047bb1cf6	[InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests	2020-09-28 18:53:38 +01:00
Simon Pilgrim	ad4f11a9d3	[InstCombine] Add basic trunc(shr(trunc(x),c)) tests Helps improve the minor regressions noticed on D88316	2020-09-28 18:00:28 +01:00
Simon Pilgrim	63ee42a06b	[InstCombine] matchRotate - force splat of uniform constant rotation amounts (PR46895) Fixes minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.	2020-09-28 15:12:41 +01:00
Simon Pilgrim	dabb14cadd	[InstCombine] matchRotate - allow undef in uniform constant rotation amounts (PR46895) An extension to D87452, we can safely permit undefs in the uniform/splat detection https://alive2.llvm.org/ce/z/nT-ptN Differential Revision: https://reviews.llvm.org/D88402	2020-09-28 13:36:13 +01:00
Simon Pilgrim	0c671bfe00	[InstCombine] Add tests for vector rotate by constants with undefs.	2020-09-28 09:55:43 +01:00
Simon Pilgrim	98c5eebcf7	[InstCombine] Add basic vector test coverage for icmp_eq/ne zero combines	2020-09-26 20:09:11 +01:00
Simon Pilgrim	9ff9c1d8ee	[InstCombine] matchRotate - support (uniform) constant rotation amounts (PR46895) This patch adds handling of rotation patterns with constant shift amounts - the next bit will be how we want to support non-uniform constant vectors. Differential Revision: https://reviews.llvm.org/D87452	2020-09-25 22:03:10 +01:00
Simon Pilgrim	994ef4e7bb	[InstCombine] Fix test name to match type. NFCI. We're testing a <2 x i36> not <2 x i16>	2020-09-25 22:00:36 +01:00
Simon Pilgrim	2a0ca17f66	[InstCombine] collectBitParts - add fshl/fshr handling Pulled from D87452, this is a fixed version of the collectBitParts fshl/fshr handling which as @nikic noticed wasn't checking for different providers or had correct bit ordering (which was hid by only testing shift amounts of bitwidth/2). Differential Revision: https://reviews.llvm.org/D88292	2020-09-25 20:34:59 +01:00
Simon Pilgrim	132f29ce06	[InstCombine] Add some extra bswap tests from PR39793 Also test for cases where recognizeBSwapOrBitReverseIdiom checks for a truncated bswap pattern.	2020-09-25 15:46:19 +01:00
Simon Pilgrim	8d90d92f0d	[InstCombine] Add 'partial' bswap tests from PR39793 Tests for basic zext(bswap(trunc(x))) patterns shown on PR39793	2020-09-25 15:28:21 +01:00
Simon Pilgrim	852447650c	[InstCombine] Add bswap tests from funnel shift intrinsics Based on (WIP) patch in D87452 - I'm intending to add the intrinsics handling to collectBitParts as a separate patch to make the changes clearer.	2020-09-25 12:40:23 +01:00
Matt Arsenault	dc08185ca7	IR: Have byref imply dereferenceable The langref already states it does, but this wasn't implemented. Also covers inalloca and preallocated. Also helps fix a dependence on pointer element types.	2020-09-24 09:57:28 -04:00
Sanjay Patel	8e712807e4	[InstCombine] regenerate test checks; NFC	2020-09-24 09:34:17 -04:00
David Sherwood	59c4d5aad0	[SVE] Fix InstCombinerImpl::PromoteCastOfAllocation for scalable vectors In this patch I've fixed some warnings that arose from the implicit cast of TypeSize -> uint64_t. I tried writing a variety of different cases to show how this optimisation might work for scalable vectors and found: 1. The optimisation does not work for cases where the cast type is scalable and the allocated type is not. This because we need to know how many times the cast type fits into the allocated type. 2. If we pass all the various checks for the case when the allocated type is scalable and the cast type is not, then when creating the new alloca we have to take vscale into account. This leads to sub-optimal IR that is worse than the original IR. 3. For the remaining case when both the alloca and cast types are scalable it is hard to find examples where the optimisation would kick in, except for simple bitcasts, because we typically fail the ABI alignment checks. For now I've changed the code to bail out if only one of the alloca and cast types is scalable. This means we continue to support the existing cases where both types are fixed, and also the specific case when both types are scalable with the same size and alignment, for example a simple bitcast of an alloca to another type. I've added tests that show we don't attempt to promote the alloca, except for simple bitcasts: Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll Differential revision: https://reviews.llvm.org/D87378	2020-09-23 08:43:05 +01:00
Arthur Eubanks	61ac58e10a	[NewPM] Pin tests with -debug-pass to legacy PM -debug-pass is a legacy PM only option. Some tests checks that the pass returned that it made a change, which is not relevant to the NPM, since passes return PreservedAnalyses. Some tests check that passes are freed at the proper time, which is also not relevant to the NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87945	2020-09-22 17:54:25 -07:00
Hubert Tong	a60852e9d6	[InstCombine][NFC][tests] Add ninf base value case to pow-sqrt.ll	2020-09-22 18:58:05 -04:00
Hubert Tong	32c9991dab	[InstCombine] Fix errno bug in pow expansion to sqrt A conversion from `pow` to `sqrt` shall not call an `errno`-setting `sqrt` with -//infinity//: the `sqrt` will set `EDOM` where the `pow` call need not. This patch avoids the erroneous (pun not intended) transformation by applying the restrictions discussed in the thread for https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html. The existing tests are updated (depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage): - to add `ninf`, retaining the intended library call, - to use the intrinsic, retaining the use of `select`, or - to expect the replacement to not occur. The following is tested: - The pow intrinsic folds to a `select` instruction to handle -//infinity//. - The pow library call folds, with `ninf`, to `sqrt` without the `select` instruction associated with handling -//infinity//. - The pow library call does not fold to `sqrt` without `ninf`. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D87877	2020-09-22 18:58:05 -04:00
Hubert Tong	6801950192	[InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case The current code for handling pow(x, y) where y is an integer plus 0.5 is not explicitly guarded against attempting to transform the case where abs(y) is exactly 0.5. The latter case is meant to be handled by `replacePowWithSqrt`. Indeed, if the pow(x, integer+0.5) case proceeds past a certain point, it will hit an assertion by attempting to form pow(x, 0) using `getPow`. This patch adds an explicit check to prevent attempting the pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during the review of D87877. This has the effect of retaining the shrinking of `pow` to `powf` when the `sqrt` libcall cannot be formed. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88066	2020-09-22 14:23:32 -04:00
Arthur Eubanks	44b1643d17	[NewPM] Support -disable-simplify-libcall/-disable-builtin in NPM opt Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87932	2020-09-21 16:38:37 -07:00
Arthur Eubanks	f4f7df037e	[DIE] Remove DeadInstEliminationPass This pass is like DeadCodeEliminationPass, but only does one pass through a function instead of iterating on users of eliminated instructions. DeadCodeEliminationPass should be used in all cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87933	2020-09-21 12:12:25 -07:00
Sanjay Patel	7903ae4720	[InstCombine] factorize left shifts of add/sub We do similar factorization folds in SimplifyUsingDistributiveLaws, but that drops no-wrap properties. Propagating those optimally may help solve: https://llvm.org/PR47430 The propagation is all-or-nothing for these patterns: when all 3 incoming ops have nsw or nuw, the 2 new ops should have the same no-wrap property: https://alive2.llvm.org/ce/z/Dv8wsU This also solves: https://llvm.org/PR47584	2020-09-20 12:55:24 -04:00
Sanjay Patel	cf75e83275	[InstCombine] replace zombie unreachable values with 'undef' before erasing The test (currently crashing) is reduced from the example provided in the post-commit discussion in D87149. Differential Revision: https://reviews.llvm.org/D87965	2020-09-20 12:25:08 -04:00
Nikita Popov	a2f9098f7a	[InstCombine] Regenerate test checks (NFC)	2020-09-19 21:07:54 +02:00
Sanjay Patel	534e9132af	[InstCombine] auto-generate test checks; NFC	2020-09-19 11:06:47 -04:00

1 2 3 4 5 ...

5421 Commits