llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthias Braun	4f82406c46	SelectionDAG: Reuse bigger sized constants in memset expansion. When implementing memset's today we often see this pattern: $x0 = MOV 0xXYXYXYXYXYXYXYXY store $x0, ... $w1 = MOV 0xXYXYXYXY store $w1, ... We first create a 64bit constant in a 64bit register with all bytes the same and then create a 32bit constant with all bytes the same in a 32bit register. In many targets we could just access the lower byte of the 64bit register instead. - Ideally this would be handled by the ConstantHoist pass but it runs too early when memset isn't expanded yet. - The memset expansion code already had this optimization implemented, however SelectionDAG constantfolding would constantfold the "trunc(bigconstnat)" pattern to "smallconstant". - This patch makes the memset expansion mark the constant as Opaque and stop DAGCombiner from constant folding in this situation. (Similar to how ConstantHoisting marks things as Opaque to avoid folding ADD/SUB/etc.) Differential Revision: https://reviews.llvm.org/D53181 llvm-svn: 345102	2018-10-23 23:19:23 +00:00
Craig Topper	c8e183f9ee	Recommit r344877 "[X86] Stop promoting integer loads to vXi64" I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile. Original commit message: Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to rem I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping. I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the lo I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D53306 llvm-svn: 344965	2018-10-22 22:14:05 +00:00
Matt Arsenault	687ec75d10	DAG: Change behavior of fminnum/fmaxnum nodes Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914	2018-10-22 16:27:27 +00:00
Sanjay Patel	e439cc2745	[DAGCombiner] reduce insert+bitcast+extract vector ops to truncate (PR39016) This is a late backend subset of the IR transform added with: D52439 We can confirm that the conversion to a 'trunc' is correct by running: $ opt -instcombine -data-layout="e" (assuming the IR transforms are correct; change "e" to "E" for big-endian) As discussed in PR39016: https://bugs.llvm.org/show_bug.cgi?id=39016 ...the pattern may emerge during legalization, so that's we are waiting for an insertelement to become a scalar_to_vector in the pattern matching here. The DAG allows for fun variations that are not possible in IR. Result types for extracts and scalar_to_vector don't necessarily match input types, so that means we have to be a bit more careful in the transform (see code comments). The tests show that we don't handle cases that require a shift (as we did in the IR version). I've left that as a potential follow-up because I'm not sure if that's a real concern at this late stage. Differential Revision: https://reviews.llvm.org/D53201 llvm-svn: 344872	2018-10-21 20:13:29 +00:00
Sanjay Patel	8bd74785f0	[DAGCombiner] allow undef elts in vector fmul matching llvm-svn: 344534	2018-10-15 16:54:07 +00:00
Sanjay Patel	89e2197c33	[DAGCombiner] refactor folds for fadd (fmul X, -2.0), Y; NFCI The transform doesn't work if the vector constant has undef elements. llvm-svn: 344532	2018-10-15 16:47:01 +00:00
Sanjay Patel	9e7e0fd828	[DAGCombiner] allow undef elts in vector fma matching llvm-svn: 344528	2018-10-15 15:56:39 +00:00
Sanjay Patel	4e970ff022	[DAGCombiner] allow undef elts in vector fma matching llvm-svn: 344525	2018-10-15 15:38:38 +00:00
Sanjay Patel	56b6660d2e	[DAGCombiner] rearrange extract_element+bitcast fold; NFC I want to add another pattern here that includes scalar_to_vector, so this makes that patch smaller. I was hoping to remove the hasOneUse() check because it shouldn't be necessary for common codegen, but an AMDGPU test has a comment suggesting that the extra check makes things better on one of those targets. llvm-svn: 344320	2018-10-11 23:56:56 +00:00
Nirav Dave	f1f2a2a31a	[DAG] Fix Big Endian in Load-Store forwarding Summary: Correct offset calculation in load-store forwarding for big-endian targets. Reviewers: rnk, RKSimon, waltl Subscribers: sdardis, nemanjai, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53147 llvm-svn: 344272	2018-10-11 18:28:59 +00:00
Sanjay Patel	4875662e57	[DAGCombiner] move comment closer to the corresponding code; NFC llvm-svn: 344255	2018-10-11 16:07:25 +00:00
Nirav Dave	07acc992dc	[DAGCombine] Improve Load-Store Forwarding Summary: Extend analysis forwarding loads from preceeding stores to work with extended loads and truncated stores to the same address so long as the load is fully subsumed by the store. Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are deleted as they've no longer seem to be relevant. Reviewers: RKSimon, rnk, kparzysz, javed.absar Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D49200 llvm-svn: 344142	2018-10-10 14:15:52 +00:00
Nemanja Ivanovic	72d4866e57	[DAGCombiner] Expand combining of FP logical ops to sign-setting FP ops We already do the following combines: (bitcast int (and (bitcast fp X to int), 0x7fff...) to fp) -> fabs X (bitcast int (xor (bitcast fp X to int), 0x8000...) to fp) -> fneg X When the target has "bit preserving fp logic". This patch just extends it to also combine: (bitcast int (or (bitcast fp X to int), 0x8000...) to fp) -> fneg (fabs X) As some targets have fnabs and even those that don't can efficiently lower both the fabs and the fneg. Differential revision: https://reviews.llvm.org/D44548 llvm-svn: 344093	2018-10-09 23:20:11 +00:00
Sanjay Patel	b64c0d7b53	[DAGCombiner] simplify code for fmul with constant fold; NFCI llvm-svn: 343997	2018-10-08 21:17:20 +00:00
Sanjay Patel	ecc8af61e7	[DAGCombiner] allow undef elts in vector fadd matching llvm-svn: 343945	2018-10-07 16:30:42 +00:00
Sanjay Patel	ef76e27985	[DAGCombiner] allow undefs when matching vector splats for fmul folds llvm-svn: 343942	2018-10-07 16:05:37 +00:00
Sanjay Patel	0b74c840dd	[DAGCombiner] allow undef elts in vector fabs/fneg matching This change is proposed as a part of D44548, but we need this independently to avoid regressions from improved undef propagation in SimplifyDemandedVectorElts(). llvm-svn: 343940	2018-10-07 15:32:06 +00:00
Sanjay Patel	46a9dc2e3e	[DAGCombiner] shorten code for bitcast+fabs fold; NFC llvm-svn: 343939	2018-10-07 15:18:30 +00:00
Sanjay Patel	f6a160a102	[SelectionDAG] allow undefs when matching splat constants And use that to transform fsub with zero constant operands. The integer part isn't used yet, but it is proposed for use in D44548, so adding both enhancements here makes that patch simpler. llvm-svn: 343865	2018-10-05 17:42:19 +00:00
Matthias Braun	004fe6bf83	DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types This fixes a case of bad index calculation when merging mismatching vector types. This changes the existing code to just use the existing extract_{subvector\|element} and a bitcast (instead of bitcast first and then newly created extract_xxx) so we don't need to adjust any indices in the first place. rdar://44584718 Differential Revision: https://reviews.llvm.org/D52681 llvm-svn: 343493	2018-10-01 16:25:50 +00:00
Simon Pilgrim	818cfc40ff	[DAG] Don't perform SINT_TO_FP<->UINT_TO_FP custom conversion after legalization The SINT_TO_FP<->UINT_TO_FP combines for non-negative integers should only occur for legal ops once LegalOperations = true No test case to hand, noticed when investigating PR38226 + PR38970 llvm-svn: 343405	2018-09-30 12:46:42 +00:00
David Bolvansky	8e90bad63d	[DAGCombiner] [NFC] Improve X div/rem 1 fold Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52661 llvm-svn: 343349	2018-09-28 18:40:30 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Craig Topper	b2a00acb24	[DAGCombiner] Remove unnecessary check for visitSDIVLike/visitUDIVLike returning a UDIVREM or SDIVREM node. This shouldn't be possible and is a leftover from when we used to recursively call combine here. llvm-svn: 343049	2018-09-25 23:52:07 +00:00
Nirav Dave	f445a67be4	[DAGCombine] Improve Predecessor check in SimplifySelectOps. NFCI. Reuse search space bookkeeping across multiple predecessor checks qdone to avoid redundancy. This should cut search cost by ~4x. llvm-svn: 342984	2018-09-25 15:29:30 +00:00
Nirav Dave	7373d5e646	[DAGCombine] Share predecessor bookkeeping in CombineToPostIndexedLoadStore. NFCI. llvm-svn: 342983	2018-09-25 15:29:04 +00:00
Nirav Dave	46ab89a0d0	[DAGCombine] Don't fold dependent loads across SELECT_CC. DAGCombine will try to fold two loads that feed a SELECT or SELECT_CC after the select, resulting in a select of an address and a single load after. If either of the loads depend on the other, this is not legal as it could introduce cycles. However, it only checked this if the opcode was a SELECT, and not for a SELECT_CC. Unfortunately, the only reproducer I have for this is for our downstream target. I've tried getting it to trigger on an upstream one but haven't been successful. Patch thanks to Bevin Hansson. llvm-svn: 342980	2018-09-25 14:43:05 +00:00
Sanjay Patel	2c901742ca	[DAGCombiner] use UADDO to optimize saturated unsigned add This is a preliminary step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select. As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant: https://rise4fun.com/Alive/V1Q But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities. This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86 which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so this patch will fire on those tests. Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is similar justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place. Differential Revision: https://reviews.llvm.org/D51929 llvm-svn: 342886	2018-09-24 14:47:15 +00:00
Hans Wennborg	83d15dfe2d	Remove debug printf leftover from r342397 llvm-svn: 342863	2018-09-24 08:18:47 +00:00
Craig Topper	5bef27e808	[DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTOR This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015. llvm-svn: 342856	2018-09-24 02:03:11 +00:00
Craig Topper	b3b94a8e8b	[DAGCombiner] Clarify a comment. NFC This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed. llvm-svn: 342851	2018-09-23 21:17:56 +00:00
Sanjay Patel	0027946915	[DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation This is an alternative to https://reviews.llvm.org/D37896. We can't decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some existing code that overlaps with this transform. This extends D52195 and may resolve PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 (still an open question about transforming legal vector multiplies, but we could open another bug report for those) llvm-svn: 342844	2018-09-23 18:41:38 +00:00
Craig Topper	81f67f7afb	[DAGCombiner] Simplify some code in visitBITCAST. NFCI llvm-svn: 342826	2018-09-22 23:12:34 +00:00
Craig Topper	e79a588cac	[DAGCombiner] Rewrite r331896 in a different way to address a FIXME. NFCI llvm-svn: 342809	2018-09-22 18:03:14 +00:00
Sanjay Patel	8a1227ccc8	[SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCI x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant. Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. No functional change intended. I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the helpers in the ISD namespace (eg, ISD::isConstantSplatVector()). But if there's no good reason for those to be there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps. Differential Revision: https://reviews.llvm.org/D52285 llvm-svn: 342669	2018-09-20 17:34:08 +00:00
Sanjay Patel	fdc0de19cb	[SelectionDAG] allow vector types with isBitwiseNot() The test diff in not-and-simplify.ll is from a use in SimplifyDemandedBits, and the test diff in add.ll is from a DAGCombiner transform. llvm-svn: 342594	2018-09-19 21:48:30 +00:00
Sanjay Patel	4fd2e2a498	[DAGCombiner][x86] add transform/hook to decompose integer multiply into shift/add This is an alternative to D37896. I don't see a way to decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some duplicate code that overlaps with this transform. As a first step, we're only getting the most clear wins on the vector examples requested in PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 As noted in the code comment, it's likely that the x86 constraints are tighter than necessary, but it may not always be a win to replace a pmullw/pmulld. Differential Revision: https://reviews.llvm.org/D52195 llvm-svn: 342554	2018-09-19 15:57:40 +00:00
Amara Emerson	91c2913522	Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index."" Fixed the assertion failure. llvm-svn: 342397	2018-09-17 14:40:13 +00:00
Sanjay Patel	3eaf500a6d	[DAGCombiner] try to convert pow(x, 1/3) to cbrt(x) This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040. Copying the motivational statement by @evandro from that patch: "This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64." I'm proposing to add only the minimum support for a DAG node here. Since we don't have an LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I don't think we need to worry about DAG builder, legalization, a strict variant, etc. We should be able to expand as needed when adding more functionality/transforms. For reference, these are transform suggestions currently listed in SimplifyLibCalls.cpp: // * cbrt(expN(X)) -> expN(x/3) // * cbrt(sqrt(x)) -> pow(x,1/6) // * cbrt(cbrt(x)) -> pow(x,1/9) Also, given that we bail out on long double for now, there should not be any logical differences between platforms (unless there's some platform out there that has pow() but not cbrt()). Differential Revision: https://reviews.llvm.org/D51753 llvm-svn: 342348	2018-09-16 16:50:26 +00:00
Reid Kleckner	4d1b75c6b7	Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index." Causes 'isVector() && "Invalid vector type!"' assertion when building Skia in Chrome. llvm-svn: 342265	2018-09-14 19:39:40 +00:00
Amara Emerson	ef600cbd86	[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index. Differential Revision: https://reviews.llvm.org/D51831 llvm-svn: 342183	2018-09-13 21:28:58 +00:00
Sanjay Patel	8a478b79dc	[DAGCombiner] improve formatting for select+setcc code; NFC llvm-svn: 342095	2018-09-12 23:03:50 +00:00
Simon Pilgrim	96d6b9c2e2	[DAGCombiner] foldBitcastedFPLogic - Add basic vector support Add support for bitcasts from float type to an integer type of the same element bitwidth. There maybe cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet. llvm-svn: 341652	2018-09-07 12:13:45 +00:00
Sanjay Patel	dbf52837fe	[DAGCombiner] try to convert pow(x, 0.25) to sqrt(sqrt(x)) This was proposed as an IR transform in D49306, but it was not clearly justifiable as a canonicalization. Here, we only do the transform when the target tells us that sqrt can be lowered with inline code. This is the basic case. Some potential enhancements are in the TODO comments: 1. Generalize the transform for other exponents (allow more than 2 sqrt calcs if that's really cheaper). 2. If we have less fast-math-flags, generate code to avoid -0.0 and/or INF. 3. Allow the transform when optimizing/minimizing size (might require a target hook to get that right). Note that by default, x86 converts single-precision sqrt calcs into sqrt reciprocal estimate with refinement. That codegen is controlled by CPU attributes and can be manually overridden. We have plenty of test coverage for that already, so I didn't bother to include extra testing for that here. AArch uses its full-precision ops in all cases (not sure if that's the intended behavior or not, but that should also be covered by existing tests). Differential Revision: https://reviews.llvm.org/D51630 llvm-svn: 341481	2018-09-05 17:01:56 +00:00
Craig Topper	6666861158	[DAGCombiner] Fix bad identation. NFC llvm-svn: 341103	2018-08-30 19:35:40 +00:00
Simon Pilgrim	b49d5f3b53	[DAGCombiner] Add X / X -> 1 & X % X -> 0 folds Adds more divrem folds to try and get in sync with InstructionSimplify Differential Revision: https://reviews.llvm.org/D50636 llvm-svn: 340919	2018-08-29 11:30:16 +00:00
Nirav Dave	11e39fb6fb	[DAGCombine] Rework MERGE_VALUES to inline in single pass. NFCI. Avoid hyperlinear cost of inlining MERGE_VALUE node by constructing temporary vector and doing a single replacement. llvm-svn: 340853	2018-08-28 18:13:26 +00:00
Craig Topper	c7506b28c1	[DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target. Summary: I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there... I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))). Reviewers: efriedma, atanasyan, arsenm Reviewed By: efriedma Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50491 llvm-svn: 340797	2018-08-28 03:47:20 +00:00
Matt Arsenault	cea7c6969d	DAG: Check transformed type for forming fminnum/fmaxnum from vselect Follow up to r340655 to fix vector types which are split. llvm-svn: 340766	2018-08-27 18:11:31 +00:00
Sanjay Patel	f645927875	[SelectionDAG] add helper query for binops; NFC We will also use this in a planned enhancement for vector insertelement. llvm-svn: 340741	2018-08-27 14:20:15 +00:00
Sanjay Patel	113cac3b15	[SelectionDAG][x86] turn insertelement into undef with variable index into splat I noticed this along with the patterns in D51125, but when the index is variable, we don't convert insertelement into a build_vector. For x86, that means these get expanded at legalization time into the loading/spilling code that we see in the tests. I think it's always better to avoid going to memory on these, and we get the optimal 'broadcast' if it's available. I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have regression tests that would be affected (although I did not check what would happen in those cases). In the most basic cases shown here, AArch64 would probably do much better with a splat. Differential Revision: https://reviews.llvm.org/D51186 llvm-svn: 340705	2018-08-26 18:20:41 +00:00
Matt Arsenault	5b9ef39bdd	DAG: Allow matching fminnum/fmaxnum from vselect llvm-svn: 340655	2018-08-24 21:24:18 +00:00
Craig Topper	d8e91c3e8d	[DAGCombiner][Mips] Don't combine bitcast+store after LegalOperations when the store is volatile, if the resulting store isn't Legal Previously we allowed the store to be Custom. But without knowing for sure that the Custom handling won't split the store, we shouldn't convert a volatile store. We also probably shouldn't be creating a store the requires custom handling after LegalizeOps. This could lead to an infinite loop if the custom handling was to insert a bitcast. Though I guess isStoreBitCastBeneficial could be used to block such a loop. The test changes here are due to the volatile part of this. The stores in the test are all volatile and i32 stores are marked custom, So we are no longer converting them This is related to D50491 where I was trying to allow some bitcasting of volatile loads Differential Revision: https://reviews.llvm.org/D50578 llvm-svn: 340626	2018-08-24 17:48:25 +00:00
Sam Parker	597811e7a7	[DAGCombiner] Reduce load widths of shifted masks During combining, ReduceLoadWdith is used to combine AND nodes that mask loads into narrow loads. This patch allows the mask to be a shifted constant. This results in a narrow load which is then left shifted to compensate for the new offset. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 340261	2018-08-21 10:26:59 +00:00
Craig Topper	cc5dbbf759	[DAGCombiner] Allow divide by constant optimization on opaque constants. Summary: I believe this restores the behavior we had before r339147. Fixes PR38622. Reviewers: RKSimon, chandlerc, spatel Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50936 llvm-svn: 340120	2018-08-18 05:52:42 +00:00
Simon Pilgrim	03e57521c0	[DAGCombiner] extractShiftForRotate - fix out of range shift issue Don't just check for negative shift amounts. Fixes OSS Fuzz #9935 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9935 llvm-svn: 340015	2018-08-17 12:25:18 +00:00
Simon Pilgrim	5113b48798	[DAGCombine] Improve (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) folding Add support for cases where only some c1+c2 results exceed the max bitshift, clamping accordingly. Differential Revision: https://reviews.llvm.org/D35722 llvm-svn: 340010	2018-08-17 10:52:49 +00:00
Craig Topper	883ff69c93	[DAGCombiner] Don't reassociate operations that have the vector reduction flag set. When nodes are reassociated the vector-reduction flag gets lost. The test case is here is what would happen if you had a sum of absolute differences loop that started with a non-zero but contant sum and that loop was unrolled. The vectorizer will generate a constant vector for the initial value. And DAGCombiner reassociate tries to move it down the addition tree erasing the vector-reduction flag. Interestingly this moves constants the opposite direction of the reassociate IR pass. I've chosen to just punt on the reassociate, but I suppose we could maybe preserve the flag if both nodes have it set. Differential Revision: https://reviews.llvm.org/D50827 llvm-svn: 339946	2018-08-16 21:54:05 +00:00
Simon Pilgrim	e8a906ba47	[DagCombiner] Don't bother adding to the work list if TLI.BuildSDIVPow2 failed. NFCI. Matches the code in BuildSDIV/BuildUDIV llvm-svn: 339757	2018-08-15 10:02:54 +00:00
Eli Friedman	0d12e90bf5	[ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly. Intentionally excluding nodes from the DAGCombine worklist is likely to lead to weird optimizations and infinite loops, so it's generally a bad idea. To avoid the infinite loops, fix DAGCombine to use the isDesirableToCommuteWithShift target hook before performing the transforms in question, and implement the target hook in the ARM backend disable the transforms in question. Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a reduced testcase for that bug. But we should have sufficient test coverage for PerformSHLSimplify given that we're not playing weird tricks with the worklist. I can try to bugpoint it if necessary, though.) Differential Revision: https://reviews.llvm.org/D50667 llvm-svn: 339734	2018-08-14 22:10:25 +00:00
Nirav Dave	fbfe2ad9e0	[DAG] Avoid redundant chain transversal in store merge cycle check. NFCI. Patch by Henric Karlsson. llvm-svn: 339688	2018-08-14 16:20:43 +00:00
Simon Pilgrim	26e3d3f1c8	[DAGCombiner] simplifyDivRem - add comment describing divide by undef/zero combine. NFC. llvm-svn: 339561	2018-08-13 13:12:25 +00:00
Matt Arsenault	1201301b94	DAG: Check no-signed-zeros instead of unsafe-fp-math Addresses fixme, although this should still be checking individual operand flags. llvm-svn: 339525	2018-08-12 19:09:12 +00:00
Michael Berg	ca38254601	extend folding fsub/fadd to fneg for FMF Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it the flag that most closely represents the translation Reviewers: spatel, wristow, arsenm Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D50195 llvm-svn: 339357	2018-08-09 17:00:03 +00:00
Sanjay Patel	e47dc1a405	[DAGCombiner] loosen constraints for fsub+fadd fold isNegatibleForFree() should not matter here (as the test diffs show) because it's always a win to replace an fsub+fadd with fneg. The problem in D50195 persists because either (1) we are doing these folds in the wrong order or (2) we're missing another fold for fadd. llvm-svn: 339299	2018-08-08 23:04:43 +00:00
Sanjay Patel	e327266d45	[DAGCombiner] move fadd simplification ahead of other folds I don't know if it's possible to expose this diff in a test, but we should always try simplifications (no new nodes created) before more complicated transforms for efficiency (similar to what we do in IR). llvm-svn: 339298	2018-08-08 22:46:30 +00:00
Simon Pilgrim	4d4220fa2a	[DAG] DAGCombiner::visitSDIVLike - remove unnecessary isConstOrConstSplat call. NFCI. The isConstOrConstSplat result is only used in a ISD::matchUnaryPredicate call which can perform the equivalent iteration just as quickly. llvm-svn: 339262	2018-08-08 15:37:52 +00:00
Craig Topper	49ed49fcb1	[SelectionDAG] When splitting scatter nodes during DAGCombine, create a serial chain dependency. Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code. Differential Revision: https://reviews.llvm.org/D50374 llvm-svn: 339157	2018-08-07 17:35:02 +00:00
Simon Pilgrim	1bfadb0499	[DAG] Allow non-uniform constant vectors to call BuildSDIV This was missed in D50185. NFC until we add actual non-uniform support to BuildSDIV (similar BuildUDIV support in D49248) - for now it just early outs. llvm-svn: 339147	2018-08-07 14:50:39 +00:00
Simon Pilgrim	7e18938793	[TargetLowering] Add support for non-uniform vectors to BuildUDIV This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators. It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV. Differential Revision: https://reviews.llvm.org/D49248 llvm-svn: 339121	2018-08-07 09:51:34 +00:00
Craig Topper	9de1797c50	[SelectionDAG][X86] Rename MaskedLoadSDNode::getSrc0 to getPassThru. Src0 doesn't really convey any meaning to what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from. llvm-svn: 339101	2018-08-07 06:52:49 +00:00
Craig Topper	17989208a9	[SelectionDAG][X86] Rename getValue to getPassThru for gather SDNodes. getValue is more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather. llvm-svn: 339096	2018-08-07 06:13:40 +00:00
Simon Pilgrim	94112ebc75	[TargetLowering] Generalise BuildSDIV function First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where its still used. Differential Revision: https://reviews.llvm.org/D50185 llvm-svn: 338838	2018-08-03 10:00:54 +00:00
Craig Topper	2f60ef2c78	[DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329	2018-07-30 23:22:00 +00:00
Sanjay Patel	9f807f44b1	[DAGCombiner] transform sub-of-shifted-signbit to add This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317	2018-07-30 22:21:37 +00:00
Craig Topper	a568a27dfa	[DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2. llvm-svn: 338303	2018-07-30 21:04:34 +00:00
Craig Topper	b94d5f853b	Revert r338222 "[DAGCombiner] Remove unnecessary calls to AddToWorklist." Thinking about it more it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead. So its probably safer to leave this alone. llvm-svn: 338298	2018-07-30 20:27:10 +00:00
Fangrui Song	f78650a8de	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
David Bolvansky	2fa7fb14ea	[DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a rotate can be formed Summary: Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op. Patch by: sameconrad (Sam Conrad) Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar Reviewed By: lebedev.ri Subscribers: efriedma, kparzysz, llvm-commits Differential Revision: https://reviews.llvm.org/D47681 llvm-svn: 338270	2018-07-30 16:50:00 +00:00
Craig Topper	e978d2ee4a	[DAGCombiner] Remove unnecessary calls to AddToWorklist. The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found Through this operand check. These means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes. I've removed the most obvious cases here. There are probably more than can be removed. llvm-svn: 338222	2018-07-29 18:39:26 +00:00
Craig Topper	9db3573d3a	[SelectionDAG] Pass std::vector by reference instead of by pointer to BuildSDIV/BuildUDIV. This removes the need for an assert to ensure the pointer isn't null. Years ago we had ifs the checked the pointer was non-null before very access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here. llvm-svn: 338205	2018-07-28 19:44:20 +00:00
Craig Topper	50b1d4303d	[DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B) This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181	2018-07-28 00:27:25 +00:00
Sanjay Patel	c7abb416dc	[DAGCombiner] fold 'not' with signbit math This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132	2018-07-27 16:42:55 +00:00
Craig Topper	8b5a2f7aac	[DAGCombiner] Remove some calls to AddToWorklist that should be unnecessary. The DAGCombiner has a system for ensuring all nodes are visited. It doesn't require an AddToWorkList for every node that is created by a combine. llvm-svn: 338079	2018-07-26 22:40:22 +00:00
Nirav Dave	25802ac9fd	[DAG] Avoid Node Update assertion due to AND simplification Check for construction-time folding for incomplete AND nodes in BackwardsPropagateMask. Fixes PR38185. Reviewers: RKSimon, samparker Reviewed By: samparker Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D49444 llvm-svn: 337563	2018-07-20 15:27:24 +00:00
Nirav Dave	5a4e11ad9c	[DAG] Fix Memory ordering check in ReduceLoadOpStore. When merging through a TokenFactor we need to check that the load may be ordered such that no other aliasing memory operations may happen. It is not sufficient to just check that the load is a member of the chain token factor as it there may be a indirect chain. Require the load's chain has only one use. This fixes PR37826. Reviewers: spatel, davide, efriedma, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D49388 llvm-svn: 337560	2018-07-20 15:20:50 +00:00
Craig Topper	d8734450a2	[DAGCombiner] Fold X - (-Y Z) -> X + (Y Z) llvm-svn: 337518	2018-07-20 01:40:03 +00:00
Craig Topper	c12c5d421f	[DAGCombiner] Teach DAGCombiner that A-(-B) is A+B. We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub. llvm-svn: 337502	2018-07-19 22:24:43 +00:00
Simon Pilgrim	e4d12bb2d6	[DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops. Differential Revision: https://reviews.llvm.org/D49262 llvm-svn: 337258	2018-07-17 09:45:35 +00:00
Fangrui Song	cb0bab86b3	[CodeGen] Fix inconsistent declaration parameter name llvm-svn: 337200	2018-07-16 18:51:40 +00:00
Sanjay Patel	79a423cfa2	[DAGCombiner] fix typo in comment; NFC llvm-svn: 337132	2018-07-15 17:09:35 +00:00
Sanjay Patel	a41c886c55	[DAGCombiner] extend(ifpositive(X)) -> shift-right (not X) This is almost the same as an existing IR canonicalization in instcombine, so I'm assuming this is a good early generic DAG combine too. The motivation comes from reduced bit-hacking for select-of-constants in IR after rL331486. We want to restore that functionality in the DAG as noted in the commit comments for that change and the llvm-dev discussion here: http://lists.llvm.org/pipermail/llvm-dev/2018-July/124433.html The PPC and AArch tests show that those targets are already doing something similar. x86 will be neutral in the minimal case and generally better when this pattern is extended with other ops as shown in the signbit-shift.ll tests. Note the asymmetry: we don't include the (extend (ifneg X)) transform because it already exists in SimplifySelectCC(), and that is verified in the later unchanged tests in the signbit-shift.ll files. Without the 'not' op, the general transform to use a shift is always a win because that's a single instruction. Alive proofs: https://rise4fun.com/Alive/ysli Name: if pos, get -1 %c = icmp sgt i16 %x, -1 %r = sext i1 %c to i16 => %n = xor i16 %x, -1 %r = ashr i16 %n, 15 Name: if pos, get 1 %c = icmp sgt i16 %x, -1 %r = zext i1 %c to i16 => %n = xor i16 %x, -1 %r = lshr i16 %n, 15 Differential Revision: https://reviews.llvm.org/D48970 llvm-svn: 337130	2018-07-15 16:27:07 +00:00
Diogo N. Sampaio	b0d85ef975	[NFC][InstCombine] Converts isLegalNarrowLoad into isLegalNarrowLdSt Reuse this function as to test correctness and profitability of reducing width of either load or store operations. Reviewsers: samparker Differential Revision: https://reviews.llvm.org/D48624 llvm-svn: 336800	2018-07-11 12:59:42 +00:00
Simon Pilgrim	075b04a55f	[SelectionDAG] Add constant buildvector support to isKnownNeverZero This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants). llvm-svn: 336779	2018-07-11 09:56:41 +00:00
Simon Pilgrim	df9d59771b	[DAGCombiner] Support non-uniform X%C -> X-(X/C)*C folds First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic. We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit. Differential Revision: https://reviews.llvm.org/D48975 llvm-svn: 336774	2018-07-11 09:22:42 +00:00
Simon Pilgrim	97cf111689	[DAGCombiner] Add (urem X, -1) -> select(X == -1, 0, x) fold llvm-svn: 336773	2018-07-11 09:14:37 +00:00
Simon Pilgrim	4cb4609392	[DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1 udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts. llvm-svn: 336701	2018-07-10 16:33:07 +00:00
Simon Pilgrim	641097d561	[DAGCombiner] visitREM - call visitSDIVLike/visitUDIVLike directly to avoid recursive combining. As suggested by @efriedma on D48975 use the visitSDIVLike/visitUDIVLike functions introduced at rL336656. llvm-svn: 336664	2018-07-10 13:18:16 +00:00
Simon Pilgrim	ce5c19b623	[DAGCombiner] Split SDIV/UDIV optimization expansions from the rest of the combines. NFCI. As suggested by @efriedma on D48975, this patch separates the BuildDiv/Pow2 style optimizations from the rest of the visitSDIV/visitUDIV to make it easier to reuse the combines and will allow us to avoid some rather nasty node recursive combining in visitREM. llvm-svn: 336656	2018-07-10 11:38:00 +00:00
Roman Lebedev	5ccae1750b	[X86][TLI] DAGCombine: Unfold variable bit-clearing mask to two shifts. Summary: This adds a reverse transform for the instcombine canonicalizations that were added in D47980, D47981. As discussed later, that was worse at least for the code size, and potentially for the performance, too. https://rise4fun.com/Alive/Zmpl Reviewers: craig.topper, RKSimon, spatel Reviewed By: spatel Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D48768 llvm-svn: 336585	2018-07-09 19:06:42 +00:00
Simon Pilgrim	c1d1944053	[DAGCombiner] Add EXTRACT_SUBVECTOR to SimplifyDemandedVectorElts As discussed on PR37989, this patch adds EXTRACT_SUBVECTOR handling to TargetLowering::SimplifyDemandedVectorElts and calls it from DAGCombiner::visitEXTRACT_SUBVECTOR. Differential Revision: https://reviews.llvm.org/D48825 llvm-svn: 336490	2018-07-07 17:30:06 +00:00
Nico Weber	038dbf3c24	Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084. llvm-svn: 336453	2018-07-06 17:37:24 +00:00
Diogo N. Sampaio	17be994942	Added missing semicolon llvm-svn: 336428	2018-07-06 10:09:04 +00:00
Diogo N. Sampaio	742bf1a255	[SelectionDAG] https://reviews.llvm.org/D48278 D48278 Allow to reduce redundant shift masks. For example: x1 = x & 0xAB00 x2 = (x >> 8) & 0xAB can be reduced to: x1 = x & 0xAB00 x2 = x1 >> 8 It only allows folding when the masks and shift values are constants. llvm-svn: 336426	2018-07-06 09:42:25 +00:00
Diogo N. Sampaio	734cfd11fe	Testing commit permision llvm-svn: 336384	2018-07-05 18:49:32 +00:00
Simon Pilgrim	74cc4cfa94	[DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegen Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code llvm-svn: 336198	2018-07-03 14:11:32 +00:00
Simon Pilgrim	fae337704e	[DAGCombiner] Handle correctly non-splat power of 2 -1 divisor (PR37119) The combine added in commit 329525 overlooked the case where one, but not all, of the divisor elements is -1, -1 is the only power of two value for which the sdiv expansion recipe breaks. Thanks to @zvi for the original patch. Differential Revision: https://reviews.llvm.org/D45806 llvm-svn: 336048	2018-06-30 12:22:55 +00:00
Simon Pilgrim	9c70d48cb2	[DAGCombiner] Ensure we use the correct CC result type in visitSDIV (REAPPLIED) We could get away with it for constant folded cases, but not for rL335719. Thanks to Krzysztof Parzyszek for noticing. Reapply original commit rL335821 which was reverted at rL335871 due to a WebAssembly bug that was fixed at rL335884. llvm-svn: 335886	2018-06-28 17:33:41 +00:00
Haojian Wu	2103990e63	Revert "[DAGCombiner] Ensure we use the correct CC result type in visitSDIV" This reverts commit r335821. This crashes the webassembly test, run "ninja check-llvm-codegen-webassembly" to reproduce. llvm-svn: 335871	2018-06-28 16:25:57 +00:00
Simon Pilgrim	abebe4c746	[DAGCombiner] Ensure we use the correct CC result type in visitSDIV We could get away with it for constant folded cases, but not for rL335719. Thanks to Krzysztof Parzyszek for noticing. llvm-svn: 335821	2018-06-28 09:54:28 +00:00
Simon Pilgrim	49cb65bb7b	[DAGCombiner] Remove unused variable. NFCI. Noticed in D45806 review. llvm-svn: 335817	2018-06-28 09:29:08 +00:00
Nirav Dave	7c57ae57a8	[DAGCombine] Disable TokenFactor simplifications when optnone. llvm-svn: 335773	2018-06-27 19:41:25 +00:00
Sanjay Patel	d052de856d	[DAGCombiner] restrict (float)((int) f) --> ftrunc with no-signed-zeros As noted in the D44909 review, the transform from (fptosi+sitofp) to ftrunc can produce -0.0 where the original code does not: #include <stdio.h> int main(int argc) { float x; x = -0.8 * argc; printf("%f\n", (float)((int)x)); return 0; } $ clang -O0 -mavx fp.c ; ./a.out 0.000000 $ clang -O1 -mavx fp.c ; ./a.out -0.000000 Ideally, we'd use IR/node flags to predicate the transform, but the IR parser doesn't currently allow fast-math-flags on the cast instructions. So for now, just use the function attribute that corresponds to clang's "-fno-signed-zeros" option. Differential Revision: https://reviews.llvm.org/D48085 llvm-svn: 335761	2018-06-27 18:16:40 +00:00
Simon Pilgrim	d3e583a52d	[DAGCombiner] visitSDIV - add special case handling for (sdiv X, 1) -> X in pow2 expansion For divisor = 1, perform a select of X - reduces scalarisation of simple SDIVs llvm-svn: 335727	2018-06-27 12:45:31 +00:00
Simon Pilgrim	e835f662fa	[DAGCombiner] visitSDIV - simplify pow2 handling. NFCI. Use the builtin constant folding of getNode() etc. instead of doing it manually. llvm-svn: 335720	2018-06-27 10:51:55 +00:00
Simon Pilgrim	dfbcc66adc	[DAGCombiner] Fold SDIV(%X, MIN_SIGNED) -> SELECT(%X == MIN_SIGNED, 1, 0) Fixes PR37569. llvm-svn: 335719	2018-06-27 10:21:06 +00:00
Simon Pilgrim	0a566bc0ae	[DAGCombiner] Don't accept signbit sdiv divisors in sdiv-by-pow2 vector expansion (PR37569) llvm-svn: 335717	2018-06-27 09:41:22 +00:00
Sanjay Patel	fb9c440ba5	[DAGCombiner] use isBitwiseNot to simplify code; NFC llvm-svn: 335652	2018-06-26 19:46:56 +00:00
Simon Pilgrim	7f55af37f4	[DAGCombiner] Don't accept -1 sdiv divisors in sdiv-by-pow2 vector expansion (PR37119) Temporary fix until I've managed to get D45806 updated - both +1 and -1 special cases need to be properly supported. llvm-svn: 335637	2018-06-26 17:46:51 +00:00
Simon Pilgrim	133b1cdf08	[DAGCombiner] Pull out VT bitwidth in visitSDIV. NFCI. llvm-svn: 335617	2018-06-26 15:39:16 +00:00
Simon Pilgrim	5b6b500687	Fix -Wparentheses gcc warning. NFCI. llvm-svn: 335451	2018-06-25 11:19:05 +00:00
Sanjay Patel	962ee178fa	[DAGCombiner] eliminate setcc bool math when input is low-bit of some value This patch has the same motivating example as D48466: define void @foo(i64 %x, i32 %c.0282.in, i32 %d.0280, i32* %ptr0, i32* %ptr1) { %c.0282 = and i32 %c.0282.in, 268435455 %a16 = lshr i64 32508, %x %a17 = and i64 %a16, 1 %tobool = icmp eq i64 %a17, 0 %. = select i1 %tobool, i32 1, i32 2 %.286 = select i1 %tobool, i32 27, i32 26 %shr97 = lshr i32 %c.0282, %. %shl98 = shl i32 %c.0282.in, %.286 %or99 = or i32 %shr97, %shl98 %shr100 = lshr i32 %d.0280, %. %shl101 = shl i32 %d.0280, %.286 %or102 = or i32 %shr100, %shl101 store i32 %or99, i32* %ptr0 store i32 %or102, i32* %ptr1 ret void } ...but I'm trying to kill the setcc bool math sooner rather than later. By matching a larger pattern that includes both the low-bit mask and the trailing add/sub, we can create a universally good fold because we always eliminate the condition code intermediate value. Here are Alive proofs for these (currently instcombine folds the 'add' variants, but misses the 'sub' patterns): https://rise4fun.com/Alive/Gsyp Name: sub of zext cmp mask %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %z = zext i1 %c to i32 %r = sub i32 C1, %z => %optional_cast = zext i8 %a to i32 %r = add i32 %optional_cast, C1-1 Name: add of zext cmp mask %a = and i32 %x, 1 %c = icmp eq i32 %a, 0 %z = zext i1 %c to i8 %r = add i8 %z, C1 => %optional_cast = trunc i32 %a to i8 %r = sub i8 C1+1, %optional_cast All of the tests look like improvements or neutral to me. But it is possible that x86 test+set+bitop is better than what we now show here. I suspect we could do better by adding another fold for the 'sub' variants. We start with select-of-constant in IR in the larger motivating test, so that's why I included tests with selects. Proofs for those variants: https://rise4fun.com/Alive/Bx1 Name: true const is bigger Pre: C2 == (C1 + 1) %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %r = select i1 %c, i64 C2, i64 C1 => %z = zext i8 %a to i64 %r = sub i64 C2, %z Name: false const is bigger Pre: C2 == (C1 + 1) %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %r = select i1 %c, i64 C1, i64 C2 => %z = zext i8 %a to i64 %r = add i64 C1, %z Differential Revision: https://reviews.llvm.org/D48466 llvm-svn: 335433	2018-06-24 14:37:30 +00:00
Stanislav Mekhanoshin	22ee191c3e	DAG combine "and\|or (select c, -1, 0), x" -> "select c, x, 0\|-1" Allowed folding for "and/or" binops with non-constant operand if arguments of select are 0/-1 values. Normally this code with "and" opcode does not get to a DAG combiner and simplified yet in the InstCombine. However AMDGPU produces it during lowering and InstCombine has no chance to optimize it out. In turn the same pattern with "or" opcode can reach DAG. Differential Revision: https://reviews.llvm.org/D48301 llvm-svn: 335250	2018-06-21 16:02:05 +00:00
David Green	a465188500	[DAGCombine] Fix alignment for offset loads/stores The alignment parameter to getExtLoad is treated as a base alignment, not the alignment of the load (base + offset). When we infer a better alignment for a Ptr we need to ensure that it applies to the base to prevent the alignment on the load from being wrong. This fixes a bug where the alignment could then be used to incorrectly prove noalias between a load and a store, leading to a miscompile. Differential Revision: https://reviews.llvm.org/D48029 llvm-svn: 335210	2018-06-21 08:30:07 +00:00
Stanislav Mekhanoshin	20279dc025	Allow binop C1, (select cc, CF, CT) -> select folding Previously this folding was done only if select is a first operand. However, for non-commutative operations constant may go before select. Differential Revision: https://reviews.llvm.org/D48223 llvm-svn: 335167	2018-06-20 20:24:20 +00:00
Nirav Dave	cd558887d3	[DAG] Fix and-mask folding when narrowing loads. Summary: Check that and masks are strictly smaller than implicit mask from narrowed load. Fixes PR37820. Reviewers: samparker, RKSimon, nemanjai Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48335 llvm-svn: 335137	2018-06-20 15:36:29 +00:00
Craig Topper	ddd88a559f	[DAGCombiner] Add some comments to some true/false arguments to make it obvious what they are. NFC llvm-svn: 335095	2018-06-20 04:32:07 +00:00
Michael Berg	7b993d762f	Utilize new SDNode flag functionality to expand current support for fadd Summary: This patch originated from D46562 and is a proper subset, with some issues addressed. Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar Reviewed By: spatel Subscribers: wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D47909 llvm-svn: 334996	2018-06-18 23:44:59 +00:00
Michael Berg	932ba20af8	refactor of visitFADD for AllowNewConst cases Summary: Refactoring for all constant cases which require AllowNewConst and some staging for future fmf usage. Reviewers: spatel, hfinkel, wristow Reviewed By: spatel Subscribers: nhaehnle Differential Revision: https://reviews.llvm.org/D48289 llvm-svn: 334984	2018-06-18 21:12:21 +00:00
Michael Berg	8e570c3390	Utilize new SDNode flag functionality to expand current support for fma Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions. Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai Reviewed By: rampitec, nhaehnle Subscribers: tpr, nemanjai, wdng Differential Revision: https://reviews.llvm.org/D47918 llvm-svn: 334876	2018-06-16 00:03:06 +00:00
Michael Berg	02d1c6c0cf	Utilize new SDNode flag functionality to expand current support for fdiv Summary: This patch originated from D46562 and is a proper subset, with some issues addressed. Reviewers: spatel, hfinkel, wristow, arsenm Reviewed By: spatel Subscribers: wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D47954 llvm-svn: 334862	2018-06-15 20:44:55 +00:00
Matt Arsenault	df2f4ef29d	DAG: Fix creating concat_vectors with illegal type Test passes as is, but fails with future patch to make v4i16/v4f16 legal. llvm-svn: 334823	2018-06-15 12:09:15 +00:00
Michael Berg	0c20447a02	easing the constraint for isNegatibleForFree and GetNegatedExpression Summary: Here we relax the old constraint which utilized unsafe with the TargetOption flag HonorSignDependentRoundingFPMathOption, with the assertion that unsafe is no longer needed or never was required for correctness on FDIV/FMUL. Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar Reviewed By: spatel Subscribers: efriedma, wdng, tpr Differential Revision: https://reviews.llvm.org/D48057 llvm-svn: 334769	2018-06-14 20:54:13 +00:00
Michael Berg	4663ceb63f	updating isNegatibleForFree and GetNegatedExpression with fmf for fadd Summary: A FMF constraint is added to FADD with unsafe still available as the fallback Reviewers: spatel, wristow, arsenm, hfinkel Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D48180 llvm-svn: 334753	2018-06-14 18:48:31 +00:00
Sanjay Patel	7d4929611c	[DAGCombiner] remove hasOneUse() check from fadd constants transform We're constant folding here, so we shouldn't check uses. This matches the IR optimizer behavior. The x86 test shows the expected win. The AArch64 test shows something else. This only seems to happen if the "generic" AArch64 CPU model is used by MachineCombiner, so I'll file a bug report to follow-up. llvm-svn: 334608	2018-06-13 15:22:48 +00:00
Krzysztof Parzyszek	82d284c1d2	[DAGCombiner] Recognize more patterns for ABS Differential Revision: https://reviews.llvm.org/D47831 llvm-svn: 334553	2018-06-12 21:51:49 +00:00
Michael Berg	5d49f66570	Utilize new SDNode flag functionality to expand current support for fmul Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fmul. Reviewers: spatel, hfinkel, wristow, arsenm Reviewed By: spatel Subscribers: nhaehnle, wdng Differential Revision: https://reviews.llvm.org/D47911 llvm-svn: 334514	2018-06-12 16:13:11 +00:00
Krzysztof Parzyszek	3d671248ab	[SelectionDAG] Provide default expansion for rotates Implement default legalization of rotates: either in terms of the rotation in the opposite direction (if legal), or in terms of shifts and ors. Implement generating of rotate instructions for Hexagon. Hexagon only supports rotates by an immediate value, so implement custom lowering of ROTL/ROTR on Hexagon. If a rotate is not legal, use the default expansion. Differential Revision: https://reviews.llvm.org/D47725 llvm-svn: 334497	2018-06-12 12:49:36 +00:00
Matt Arsenault	5615fa0a87	DAG: Fix extract_subvector combine for a single element This would fail before because 1x vectors aren't legal, so instead just use the scalar type. Avoids regressions in a future AMDGPU commit to add v4i16/v4f16 as legal types. Test update is just the one test that this triggers on in tree now. It wasn't checking anything before. The result is completely changed since the selects are eliminated. Not sure if it's considered better or not. llvm-svn: 334440	2018-06-11 21:27:41 +00:00
Sanjay Patel	3e5c70cc1d	[DAGCombiner] match vector compare and select sizes with extload operand (PR37427) This patch started off much more general and ambitious, but it's been a nightmare seeing all the ways x86 vector codegen can go wrong. So the code is still structured to allow extending easily, but it's currently limited in several ways: 1. Only handle cases with an extending load. 2. Only handle cases with a zero constant compare. 3. Ignore setcc with vector bitmask (SetCCWidth != 1) - so AVX512 should be unaffected. The motivating case from PR37427: https://bugs.llvm.org/show_bug.cgi?id=37427 ...is the 1st test, and that shows the expected win - we eliminated the unnecessary intermediate cast. There's a clear regression in the last test (sgt_zero_fp_select) because we longer recognize a 'SHRUNKBLEND' opportunity. I think that general problem is also present in sgt_zero, so I'll try to fix that in a follow-up. We need to match a sign-bit setcc from a sign-extended operand and remove it. Differential Revision: https://reviews.llvm.org/D47330 llvm-svn: 334378	2018-06-10 23:09:50 +00:00
Craig Topper	61998289f9	Use SmallPtrSet instead of SmallSet in places where we iterate over the set. SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work. These places were found by hiding the begin/end methods in the SmallSet forwarding llvm-svn: 334343	2018-06-09 05:04:20 +00:00
Sanjay Patel	498564e6fb	[DAGCombiner] clean up comments; NFC llvm-svn: 334312	2018-06-08 18:00:46 +00:00
Michael Berg	bf90d1f263	Utilize new SDNode flag functionality to expand current support for fsub Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fsub. Reviewers: spatel, hfinkel, wristow, arsenm Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D47910 llvm-svn: 334306	2018-06-08 17:39:50 +00:00
Sam Parker	16f963ba0d	[DAGCombine] Fix for PR37667 While trying to propagate AND masks back to loads, we currently allow one non-load node to be included as a leaf in chain. This fix now limits that node to produce only a single data value. Differential Revision: https://reviews.llvm.org/D47878 llvm-svn: 334268	2018-06-08 07:49:04 +00:00
Michael Berg	77b5be7ec6	propagate fast math flags via IR on fma and sub expressions Summary: This change uses fmf subflags to guard fma optimizations as well as unsafe. These changes originated from D46483 and have been simplified via getNode. Reviewers: spatel, arsenm, hfinkel, javed.absar Reviewed By: spatel Subscribers: nemanjai, wdng Differential Revision: https://reviews.llvm.org/D47388 llvm-svn: 334242	2018-06-07 22:49:09 +00:00
Matt Arsenault	e8eb567e17	DAG: Avoid bitcast/ext/build_vector combine This avoids regressions in a future AMDGPU change to make v4i16/v4f16 legal. For these types, build_vector is implemented as bitcasted operations on v2i32. This combine was creating v4i16s out of what would have been already been a v2i32 build_vector, creating a mess of nodes that never get cleaned up. I'm not sure this is the right condition to check. I initially tried just checking for the legality of the new build_vector. This works for my case, but breaks dozens of x86 tests. A Mips test seems to show some improvement or at least a neutral change. I don't want to think about how long it would take to analyze the set of different x86 vector operations impacted. Test included in future commit. llvm-svn: 334218	2018-06-07 19:42:27 +00:00
Michael Berg	cc1c4b6912	guard fsqrt with fmf sub flags Summary: This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483. It contains only context for fsqrt. Reviewers: spatel, hfinkel, arsenm Reviewed By: spatel Subscribers: hfinkel, wdng, andrew.w.kaylor, wristow, efriedma, nemanjai Differential Revision: https://reviews.llvm.org/D47749 llvm-svn: 334113	2018-06-06 18:47:55 +00:00
Reid Kleckner	adcaddb6da	Fix -Wcovered-switch-default warning and clang-format it llvm-svn: 333967	2018-06-04 23:47:29 +00:00
Amaury Sechet	da661e9236	[DAGcombine] Teach the combiner about -a = ~a + 1 Summary: This include variant for add, uaddo and addcarry. usubo and subcarry require the carry to be flipped to preserve semantic, but we chose to do the transform anyway in that case as to push the transform down the carry chain. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46505 llvm-svn: 333943	2018-06-04 19:23:22 +00:00
Amaury Sechet	93a7d2aa3c	Get rid of SETCCE Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in tree backend. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47685 llvm-svn: 333939	2018-06-04 18:36:22 +00:00

1 2 3 4 5 ...

2401 Commits