llvm-project

Commit Graph

Author	SHA1	Message	Date
Amy Huang	153f20057c	Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" and and partial fix. Causes windows buildbot errors. This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and `53da7ca943`. llvm-svn: 367496	2019-07-31 23:59:31 +00:00
Michael Berg	005d705d43	Migrate some more fadd and fsub cases away from UnsafeFPMath control to utilize NoSignedZerosFPMath options control Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context. Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper Reviewed By: spatel Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji Differential Revision: https://reviews.llvm.org/D65170 llvm-svn: 367486	2019-07-31 21:57:28 +00:00
Amy Huang	27a73dd02c	Fix to r367374 "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" after windows buildbot failure. Added a check that the MachineInstr exists and is a call before trying to add symbols around it. llvm-svn: 367483	2019-07-31 21:03:38 +00:00
Peter Collingbourne	33773d5cfc	SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474	2019-07-31 20:14:09 +00:00
Wei Mi	f49c107f06	[DAGCombine] Limit the number of times for the same store and root nodes to bail out in store merging dependence check. We run into a case where dependence check in store merging bail out many times for the same store and root nodes in a huge basicblock. That increases compile time by almost 100x. The patch add a map to track how many times the bailing out happen for the same store and root, and if it is over a limit, stop considering the store with the same root as a merging candidate. Differential Revision: https://reviews.llvm.org/D65174 llvm-svn: 367472	2019-07-31 19:59:24 +00:00
Amy Huang	53da7ca943	[MS] Emit S_HEAPALLOCSITE debug info in SelectionDAG Summary: This emits labels around heapallocsite calls in SelectionDAG. Reviewers: rnk Subscribers: MatzeB, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61105 llvm-svn: 367374	2019-07-31 00:16:13 +00:00
Wei Mi	888efda280	[DAGCombiner] Add an option to control whether or not to enable store merging. Add an option to control whether or not to enable store merging in dag combiner so we can workaround some bugs more easily. Differential Revision: https://reviews.llvm.org/D65482 llvm-svn: 367365	2019-07-30 23:14:56 +00:00
Simon Pilgrim	f8a7e9de06	[DAGCombine] narrowInsertExtractVectorBinOp - early out for binops that change value type. NFCI. This is implicit in the value type checks in getSubVectorSrc - this just makes it upfront and obvious. llvm-svn: 367220	2019-07-29 11:34:45 +00:00
Simon Pilgrim	76f2f04d9d	[DAGCombine] narrowInsertExtractVectorBinOp - early out for illegal op. NFCI. If the subvector binop is illegal then early-out and avoid the subvector searches. llvm-svn: 367181	2019-07-27 19:42:58 +00:00
Simon Pilgrim	603f94aa2a	[TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through support (Reapplied) This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back. This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value. Fixes PR42777 llvm-svn: 367174	2019-07-27 14:11:59 +00:00
Simon Pilgrim	8a52671782	[SelectionDAG] Check for any recursion depth greater than or equal to limit instead of just equal the limit. If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit. This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it. This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others....) - I'll see what I can do in future patches. llvm-svn: 367171	2019-07-27 12:48:46 +00:00
Simon Pilgrim	3ff6126487	[TargetLowering] Add depth limit to SimplifyMultipleUseDemandedBits We're getting reports of massive compile time increases because SimplifyMultipleUseDemandedBits was losing track of the depth and not earlying-out. No repro yet, but consider this a pre-emptive commit. llvm-svn: 367169	2019-07-27 12:23:36 +00:00
Nico Weber	13f337c4cb	Revert r367091, it caused PR42777. llvm-svn: 367118	2019-07-26 14:58:42 +00:00
Simon Pilgrim	a424a1f351	[SelectionDAG] GetDemandedBits - update SIGN_EXTEND_INREG op to just call SimplifyMultipleUseDemandedBits. llvm-svn: 367098	2019-07-26 10:03:07 +00:00
Simon Pilgrim	9758407bf1	[TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG support. llvm-svn: 367096	2019-07-26 09:41:08 +00:00
Simon Pilgrim	d0164fc525	[SelectionDAG] GetDemandedBits - update OR/XOR ops to just call SimplifyMultipleUseDemandedBits. Eventually all of these will be moved over, but we create nodes in GetDemandedBits recursion at the moment which causes regressions when we try to remove them all. llvm-svn: 367092	2019-07-26 09:13:29 +00:00
Simon Pilgrim	b32ceb79b0	[TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through support. This allows us to peek through BITCASTs and attempt simplify the source operand, and then bitcast back. llvm-svn: 367091	2019-07-26 08:38:39 +00:00
Roman Lebedev	017e272c3a	[Codegen] (X & (C l>>/<< Y)) ==/!= 0 --> ((X <</l>> Y) & C) ==/!= 0 fold Summary: This was originally reported in D62818. https://rise4fun.com/Alive/oPH InstCombine does the opposite fold, in hope that `C l>>/<< Y` expression will be hoisted out of a loop if `Y` is invariant and `X` is not. But as it is seen from the diffs here, if it didn't get hoisted, the produced assembly is almost universally worse. Much like with my recent "hoist add/sub by/from const" patches, we should get almost universal win if we hoist constant, there is almost always an "and/test by imm" instruction, but "shift of imm" not so much, so we may avoid having to materialize the immediate, and thus need one less register. And since we now shift not by constant, but by something else, the live-range of that something else may reduce. Special care needs to be applied not to disturb x86 `BT` / hexagon `tstbit` instruction pattern. And to not get into endless combine loop. Reviewers: RKSimon, efriedma, t.p.northover, craig.topper, spatel, arsenm Reviewed By: spatel Subscribers: hiraditya, MaskRay, wuzish, xbolva00, nikic, nemanjai, jvesely, wdng, nhaehnle, javed.absar, tpr, kristof.beyls, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62871 llvm-svn: 366955	2019-07-24 22:57:22 +00:00
Simon Pilgrim	2bf871be4c	Fix signed/unsigned comparison warning. NFCI. llvm-svn: 366935	2019-07-24 17:44:22 +00:00
Simon Pilgrim	7d318b2bb1	[DAGCombine] matchBinOpReduction - add partial reduction matching This patch adds support for recognizing cases where a larger vector type is being used to reduce just the elements in the lower subvector: e.g. <8 x i32> reduction pattern in a <16 x i32> vector: <4,5,6,7,u,u,u,u,u,u,u,u,u,u,u,u> <2,3,u,u,u,u,u,u,u,u,u,u,u,u,u,u> <1,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u> matchBinOpReduction returns the lower extracted subvector in such cases, assuming isExtractSubvectorCheap accepts the extraction. I've only enabled it for X86 reduction sums so far. I intend to enable it for the bitop/minmax cases in future patches, and eventually I think its worth turning it on all the time. This is mainly just a case of ensuring calls to matchBinOpReduction don't make assumptions on the vector width based on the original vector extraction. Fixes the x86 partial reduction sum cases in PR33758 and PR42023. Differential Revision: https://reviews.llvm.org/D65047 llvm-svn: 366933	2019-07-24 17:29:56 +00:00
Simon Pilgrim	3f01c7197f	[SelectionDAG] makeEquivalentMemoryOrdering - early out for equal chains (PR42727) If we are already using the same chain for the old/new memory ops then just return. Fixes PR42727 which had getLoad() reusing an existing node. llvm-svn: 366922	2019-07-24 16:53:14 +00:00
Sanjay Patel	10dad95a75	[SDAG] convert (sub x, 1) to (add x, -1) in ctpop expansion; NFC We canonicalize to the add form, so create that directly for efficiency. llvm-svn: 366914	2019-07-24 15:43:50 +00:00
Simon Pilgrim	0e8359aec1	[TargetLowering] SimplifyMultipleUseDemandedBits - add VECTOR_SHUFFLE support. If all the demanded elts are from one operand and are inline, then we can use the operand directly. The changes are mainly from SSE41 targets which has blendvpd but not cmpgtq, allowing the v2i64 comparison to be simplified as we only need the signbit from alternate v4i32 elements. llvm-svn: 366817	2019-07-23 15:35:55 +00:00
Simon Pilgrim	743d45ee25	[TargetLowering] Add SimplifyMultipleUseDemandedBits This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demanded all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured. The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use. We do see a couple of regressions that need to be addressed: AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better). X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG. The code owners have confirmed its ok for these cases to fixed up in future patches. Differential Revision: https://reviews.llvm.org/D63281 llvm-svn: 366799	2019-07-23 12:39:08 +00:00
Craig Topper	a658cb0b12	[DAGCombiner] Make ShrinkLoadReplaceStoreWithStore return an SDValue instead of an SDNode*. NFCI The function was calling getNode() on an SDValue to return and the caller turned the result back into a SDValue. So just return the original SDValue to avoid this. llvm-svn: 366779	2019-07-23 05:13:39 +00:00
Craig Topper	f5247244f2	[DAGCombiner] Use SDNode::isOperandOf to simplify some code. NFCI llvm-svn: 366778	2019-07-23 05:13:35 +00:00
Richard Trieu	81a5045cd6	Move variable out from debug only section. MFI is no longer just needed for an assert. Move it out of the debug only section to allow non-assert builds to be able to find it. llvm-svn: 366773	2019-07-23 02:59:15 +00:00
Philip Reames	2f5543aa72	[Statepoints] Fix a bug in statepoint lowering for functions w/no-realign-stack We were silently using the ABI alignment for all of the stores generated for deopt and gc values. We'd gotten the alignment of the stack slot itself properly reduced (via MachineFrameInfo's clamping), but having the MMO on the store incorrect was enough for us to generate an aligned store to a unaligned location. The simplest fix would have been to just pass the alignment to the helper function, but once we do that, the helper function doesn't really help. So, inline it and directly call the MMO version of DAG.getStore with a properly constructed MMO. Note that there's a separate performance possibility here. Even if we can realign stacks, we probably don't want to if all of the stores are in slowpaths. But that's a later patch, if at all. :) llvm-svn: 366765	2019-07-22 23:33:18 +00:00
Matt Arsenault	542720b2bc	TableGen: Support physical register inputs > 255 This was truncating register value that didn't fit in unsigned char. Switch AMDGPU sendmsg intrinsics to using a tablegen pattern. llvm-svn: 366695	2019-07-22 15:02:34 +00:00
Christudasan Devadasan	006cf8c03d	Added address-space mangling for stack related intrinsics Modified the following 3 intrinsics: int_addressofreturnaddress, int_frameaddress & int_sponentry. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64561 llvm-svn: 366679	2019-07-22 12:42:48 +00:00
Oliver Stannard	6771a89fa0	[IPRA][ARM] Make use of the "returned" parameter attribute ARM has code to recognise uses of the "returned" function parameter attribute which guarantee that the value passed to the function in r0 will be returned in r0 unmodified. IPRA replaces the regmask on call instructions, so needs to be told about this to avoid reverting the optimisation. Differential revision: https://reviews.llvm.org/D64986 llvm-svn: 366669	2019-07-22 08:44:36 +00:00
Roman Lebedev	cd9b19484b	[Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements Summary: Four things here: 1. Generalize the fold to handle non-splat divisors. Reasonably trivial. 2. Unban power-of-two divisors. I don't see any reason why they should be illegal. * There is no ban in Hacker's Delight * I think the ban came from the same bug that caused the miscompile in the base patch - in `floor((2^W - 1) / D)` we were dividing by `D0` instead of `D`, and we were ensuring that `D0` is not `1`, which made sense. 3. Unban `1` divisors. I no longer believe Hacker's Delight actually says that the fold is invalid for `D = 0`. Further considerations: * We know that * `(X u% 1) == 0` can be constant-folded to `1`, * `(X u% 1) != 0` can be constant-folded to `0`, * Also, we know that * `X u<= -1` can be constant-folded to `1`, * `X u> -1` can be constant-folded to `0`, * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p * We know will end up with the following: `(setule/setugt (rotr (mul N, P), K), Q)` * Therefore, for given new DAG nodes and comparison predicates (`ule`/`ugt`), we will still produce the correct answer if: `Q` is a all-ones constant; and both `P` and `K` are anything other than `undef`. * The fold will indeed produce `Q = all-ones`. 4. Try to re-splat the `P` and `K` vectors - we don't care about their values for the lanes where divisor was `1`. Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00 Reviewed By: RKSimon Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63963 llvm-svn: 366637	2019-07-20 16:33:15 +00:00
Matt Arsenault	5905aae169	DAG: Handle dbg_value for arguments split into multiple subregs This was handled previously for arguments split due to not fitting in an MVT. This was dropping the register for argument registers split due to TLI::getRegisterTypeForCallingConv. llvm-svn: 366574	2019-07-19 13:36:46 +00:00
Amy Huang	f332fe642c	[COFF] Change a variable type to be const in the HeapAllocSite map. llvm-svn: 366479	2019-07-18 18:22:52 +00:00
Simon Pilgrim	8b525e357f	[DAGCombine] Pull getSubVectorSrc helper out of narrowInsertExtractVectorBinOp. NFCI. NFC step towards reusing this in other EXTRACT_SUBVECTOR combines. llvm-svn: 366435	2019-07-18 13:45:53 +00:00
Evgeniy Stepanov	d752f5e953	Basic codegen for MTE stack tagging. Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360	2019-07-17 19:24:02 +00:00
Amaury Sechet	f34a69c2e2	[DAGCombiner] fold (addcarry (xor a, -1), b, c) -> (subcarry b, a, !c) and flip carry. Summary: As per title. DAGCombiner only mathes the special case where b = 0, this patches extends the pattern to match any value of b. Depends on D57302 Reviewers: hfinkel, RKSimon, craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59208 llvm-svn: 366214	2019-07-16 15:17:00 +00:00
Rui Ueyama	49a3ad21d6	Fix parameter name comments using clang-tidy. NFC. This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch: $ git clone https://github.com/llvm/llvm-project.git $ cd llvm-project $ mkdir build $ cd build $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \ -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \ -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm $ ninja $ parallel clang-tidy -checks='-,bugprone-argument-comment' \ -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \ ::: ../llvm/lib//.{cpp,h} ../clang/lib/*/.{cpp,h} ../lld/*/.{cpp,h} llvm-svn: 366177	2019-07-16 04:46:31 +00:00
Simon Pilgrim	701e2c0d71	[DAGCombine] narrowExtractedVectorBinOp - wrap subvector extraction in helper. NFCI. First step towards supporting 'free' subvector extractions other than concat_vectors. llvm-svn: 365896	2019-07-12 13:00:35 +00:00
Simon Pilgrim	d0307f93a7	[DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS support We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y). This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well. In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demandedelts/combine calls occur too late to split large vector ops. The fast-isel-store.ll load folding regression is annoying but I don't think is that critical. Differential Revision: https://reviews.llvm.org/D63653 llvm-svn: 365785	2019-07-11 14:45:03 +00:00
Tim Northover	f2d6597653	OpaquePtr: use byval accessor instead of inspecting pointer type. NFC. The accessor can deal with both "byval(ty)" and "ty* byval" forms seamlessly. llvm-svn: 365769	2019-07-11 13:12:38 +00:00
Sanjay Patel	138328e45c	[SDAG] commute setcc operands to match a subtract If we have: R = sub X, Y P = cmp Y, X ...then flipping the operands in the compare instruction can allow using a subtract that sets compare flags. Motivated by diffs in D58875 - not sure if this changes anything there, but this seems like a good thing independent of that. There's a more involved version of this transform already in IR (in instcombine although that seems misplaced to me) - see "swapMayExposeCSEOpportunities()". Differential Revision: https://reviews.llvm.org/D63958 llvm-svn: 365711	2019-07-10 23:23:54 +00:00
Michael Berg	f4572249d7	Move three folds for FADD, FSUB and FMUL in the DAG combiner away from Unsafe to more aligned checks that reflect context Summary: Unsafe does not map well alone for each of these three cases as it is missing NoNan context when accessed directly with clang. I have migrated the fold guards to reflect the expectations of handing nan and zero contexts directly (NoNan, NSZ) and some tests with it. Unsafe does include NSZ, however there is already precedent for using the target option directly to reflect that context. Reviewers: spatel, wristow, hfinkel, craig.topper, arsenm Reviewed By: arsenm Subscribers: michele.scandale, wdng, javed.absar Differential Revision: https://reviews.llvm.org/D64450 llvm-svn: 365679	2019-07-10 18:23:26 +00:00
Nick Desaulniers	8728e45706	[TargetLowering] support BlockAddress as "i" inline asm constraint Summary: This allows passing address of labels to inline assembly "i" input constraints. Fixes pr/42502. Reviewers: ostannard Reviewed By: ostannard Subscribers: void, echristo, nathanchance, ostannard, javed.absar, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D64167 llvm-svn: 365664	2019-07-10 17:08:25 +00:00
Simon Pilgrim	94c84aca5d	[DAGCombine] visitINSERT_SUBVECTOR - use uint64_t subvector index. NFCI. Keep the uint64_t type from getZExtValue() to stop truncation/extension overflow warnings in MSVC in subvector index math. llvm-svn: 365621	2019-07-10 12:21:35 +00:00
Simon Pilgrim	bb1167a3a1	Fix const/non-const lambda return type warning. NFCI. llvm-svn: 365613	2019-07-10 10:45:09 +00:00
Craig Topper	84a1f07363	[X86][AMDGPU][DAGCombiner] Move call to allowsMemoryAccess into isLoadBitCastBeneficial/isStoreBitCastBeneficial to allow X86 to bypass it Basically the problem is that X86 doesn't set the Fast flag from allowsMemoryAccess on certain CPUs due to slow unaligned memory subtarget features. This prevents bitcasts from being folded into loads and stores. But all vector loads and stores of the same width are the same cost on X86. This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it. Differential Revision: https://reviews.llvm.org/D64295 llvm-svn: 365549	2019-07-09 19:55:28 +00:00
Simon Pilgrim	57603cbde8	[DAGCombine] LoadedSlice - keep getOffsetFromBase() uint64_t offset. NFCI. Keep the uint64_t type from getOffsetFromBase() to stop truncation/extension overflow warnings in MSVC in alignment math. llvm-svn: 365504	2019-07-09 15:28:57 +00:00
Tim Northover	60afa49abe	OpaquePtr: add Type parameter to Loads analysis API. This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job. Most callers had an obvious memory operation handy to provide this type, but a SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering. llvm-svn: 365468	2019-07-09 11:35:35 +00:00
Bjorn Pettersson	051a6a1c33	[SelectionDAG] Simplify some calls to getSetCCResultType. NFC DAGTypeLegalizer and SelectionDAGLegalize has helper functions wrapping the call to TLI.getSetCCResultType(...). Use those helpers in more places. llvm-svn: 365456	2019-07-09 10:27:51 +00:00

1 2 3 4 5 ...

10005 Commits