llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	952e7106b3	[NFC][InstCombine] Negator: tests for extractelement negation	2020-05-20 21:44:30 +03:00
Arthur Eubanks	8a88755610	Reland [X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer to the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). 
Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 11:25:44 -07:00
Arthur Eubanks	b8cbff51d3	Revert "[X86] Codegen for preallocated" This reverts commit `810567dc69`. Some tests are unexpectedly passing	2020-05-20 10:04:55 -07:00
Sanjay Patel	ad953a1ae1	[InstCombine] add tests for reassociative fsub/fadd expressions; NFC	2020-05-20 12:45:27 -04:00
Arthur Eubanks	810567dc69	[X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer to the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). 
Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 09:20:38 -07:00
Jay Foad	9bc989a48d	[InstCombine] Remove hasNoInfs check for pow(C,y) -> exp2(log2(C)*y) We already check hasNoNaNs and that x is finite and strictly positive. That only leaves the following special cases (taken from the Linux man page for pow): If x is +1, the result is 1.0 (even if y is a NaN). If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity. If the absolute value of x is greater than 1, and y is negative infinity, the result is +0. If the absolute value of x is less than 1, and y is positive infinity, the result is +0. If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity. The first case is handled elsewhere, and this transformation preserves all the others, so there is no need to limit it to hasNoInfs. Differential Revision: https://reviews.llvm.org/D79409	2020-05-19 17:06:05 +01:00
Vedant Kumar	623b254244	[Local] Do not ignore zexts in salvageDebugInfo, PR45923 Summary: When salvaging a dead zext instruction, append a convert operation to the DIExpressions of the debug uses of the instruction, to prevent the salvaged value from being sign-extended. I confirmed that lldb prints out the correct unsigned result for "f" in the example from PR45923 with this change applied. rdar://63246143 Reviewers: aprantl, jmorse, chrisjackson, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80034	2020-05-18 09:52:02 -07:00
Max Kazantsev	a2a4e5aae8	[Test] Opportunity for sinking to unreachable in InstCombine	2020-05-18 16:27:16 +07:00
Roman Lebedev	fde8eb00e1	[InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants (PR45955) We can't leave undef vector element constants as-is, it is a miscompile, so we need to sanitize them. We have two vectors (C and ~C): * We can't replace undef with 0 in both of them * We can't replace undef with 0 in only one of them * We could replace undef with -1 in both of them * We could replace undef with -1 in only one(!) of them * We could replace undef with -1 in one and 0 in another one of them. Therefore, it seems best to go with the last option, since otherwise we'd lose knowledge that C and ~C have no common bits set, which seems more important than preserving partial undef knowledge. Fixes https://bugs.llvm.org/show_bug.cgi?id=45955	2020-05-17 22:53:03 +03:00
Sanjay Patel	130a2356ae	[InstCombine] add tests for FP cast of cast; NFC A fold of casts is proposed as a backend transform in D79187, but we can also do that in IR (and that may obsolete the need for a backend transform).	2020-05-17 11:42:07 -04:00
Sanjay Patel	bfd512160f	[InstCombine] improve analysis of FP->int->FP to eliminate fpextend This was originally in D79116. Converting from a narrow-enough FP source value to integer and back to FP guarantees that the conversion to FP is exact because of UB/poison-on-overflow. This was suggested in PR36617: https://bugs.llvm.org/show_bug.cgi?id=36617#c19	2020-05-17 09:06:57 -04:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Nikita Popov	f89f7da999	[IR] Convert null-pointer-is-valid into an enum attribute The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute. In the future, this attribute may be replaced with data layout properties. Differential Revision: https://reviews.llvm.org/D78862	2020-05-15 19:41:07 +02:00
Simon Pilgrim	33d96bf7b9	[InstCombine] Add vector tests for the or(shl(zext(x),32)|zext(y)) concat combines	2020-05-13 18:48:02 +01:00
Sanjay Patel	856cc60bc1	[InstCombine] canonicalize bitcast after insertelement into undef We have a transform in the opposite direction only for the x86 MMX type, Other types are not handled either way before this patch. The motivating case from PR45748: https://bugs.llvm.org/show_bug.cgi?id=45748 ...is the last test diff. In that example, we are triggering an existing bitcast transform, so we reduce the number of casts, and that should give us the ideal x86 codegen. Differential Revision: https://reviews.llvm.org/D79171	2020-05-10 11:37:47 -04:00
Simon Pilgrim	bab44a698e	[InstCombine] matchOrConcat - match BITREVERSE Fold or(zext(bitreverse(x)),shl(zext(bitreverse(y)),bw/2) -> bitreverse(or(zext(x),shl(zext(y),bw/2)) Practically this is the same as the BSWAP pattern so we might as well handle it.	2020-05-10 16:00:29 +01:00
Sanjay Patel	a62533c29f	[InstCombine] fold fpext into exact integer-to-FP cast We can combine a floating-point extension cast with a conversion from integer if we know the earlier cast is exact. This is an optimization suggested in PR36617: https://bugs.llvm.org/show_bug.cgi?id=36617#c19 However, this patch does not change the example suggested there. This patch only uses the existing analysis to handle cases where the integer source value magnitude is narrower than the intermediate FP mantissa (guarantees that the conversion to FP is exact). Follow-up patches to the analysis function can enable more cases. Differential Revision: https://reviews.llvm.org/D79116	2020-05-10 07:04:54 -04:00
Matt Arsenault	16295d521e	InstCombine: Broaden copy-constant-to-alloca optimization Consider any constant memory type, not just global constants. AMDGPU kernel parameters are effectively global constants, but appear as either reads from an intrinsic derived pointer or function argument.	2020-05-09 16:00:27 -04:00
zoecarver	f65f566aeb	Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-08 12:24:10 -07:00
Sanjay Patel	1aa8cef97a	[InstCombine] add/adjust tests for fpext of casted value; NFC	2020-05-08 15:22:36 -04:00
Sanjay Patel	df5c9fdaac	[InstCombine] add tests for known bits before FP casts; NFC	2020-05-08 13:44:32 -04:00
Huihui Zhang	e8ea1eb4c1	[NFC] Adjust test check lines for D78267. This wasn't identified through buildbot before.	2020-05-07 13:20:15 -07:00
Huihui Zhang	1ec0cc0f02	[InstCombine][SVE] Fix visitExtractElementInst for scalable type. Summary: This patch fixes the following issues with visitExtractElementInst: 1. Restrict VectorUtils::findScalarElement to fixed-length vector. For scalable type, the number of elements in shuffle mask is unknown at compile-time. 2. Fix out-of-range calculation for fixed-length vector. 3. Skip scalable type when analysis relies on fixed number of elements. 4. Add unit tests to check functionality of extractelement for scalable type. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78267	2020-05-07 13:03:52 -07:00
Huihui Zhang	08c9c13749	[InstCombine][SVE] Fix visitInsertElementInst for scalable type. Summary: This patch fixes the following issues in visitInsertElementInst: 1. Bail out for scalable type when analysis requires fixed size number of vector elements. 2. Use cast<FixedVectorType> to get vector number of elements. This ensure assertion on scalable vector type. 3. For scalable type, avoid folding a chain of insertelement into splat: insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ... -> shufflevector(insertelt(X, %k, 0), undef, zero) The length of scalable vector is unknown at compile-time, therefore we don't know if given insertelement sequence is valid for splat. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: sdesmalen, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78895	2020-05-07 12:44:52 -07:00
zoecarver	1998e796e9	Revert "Mark values as trivially dead when their only use is a start or end lifetime intrinsic." This reverts commit `95aa28cc8f`.	2020-05-06 11:07:22 -07:00
zoecarver	95aa28cc8f	Mark values as trivially dead when their only use is a start or end lifetime intrinsic. Summary: If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well. Currently, this only works for allocas, globals, and arguments. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79355	2020-05-06 10:58:08 -07:00
Sanjay Patel	2058c98715	[InstCombine] limit bitcast+insertelement transform to x86 MMX type This is unusual for the general case because we are replacing 1 instruction with 2. Splitting from a potential conflicting transform in D79171	2020-05-06 13:12:36 -04:00
Christopher Tetreault	855e02e799	[SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem Summary: getLogBase2 tries to iterate over the number of vector elements. Since the number of elements of a scalable vector is unknown at compile time, we must return null if the input type is scalable. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, fpetrogalli, kmclaughlin, spatel Reviewed By: efriedma, fpetrogalli Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79197	2020-05-05 15:19:01 -07:00
Jay Foad	22829ab5fa	[InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y) We check that C is finite and strictly positive, but there's no need to check that it's normal too. exp2 should be just as accurate on denormals as pow is. Differential Revision: https://reviews.llvm.org/D79413	2020-05-05 16:25:48 +01:00
Jay Foad	47f5066553	Precommit new test cases for D79413 [InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y)	2020-05-05 16:08:09 +01:00
Jay Foad	fa2783d79a	[InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x) I don't think there's any good reason not to do this transformation when the pow has multiple uses. Differential Revision: https://reviews.llvm.org/D79407	2020-05-05 14:46:08 +01:00
Simon Pilgrim	5c91aa6603	[InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2)) This adds a general combine that can be used to fold: or(zext(OP(x)), shl(zext(OP(y)),bw/2)) --> OP(or(zext(x), shl(zext(y),bw/2))) Allowing us to widen 'concat-able' style or+zext patterns - I've just set this up for BSWAP but we could use this for other similar ops (BITREVERSE for instance). We already do something similar for bitop(bswap(x),bswap(y)) --> bswap(bitop(x,y)) Fixes PR45715 Reviewed By: @lebedev.ri Differential Revision: https://reviews.llvm.org/D79041	2020-05-05 12:30:10 +01:00
Simon Pilgrim	940061438e	[InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476) This patch adds support for discarding integer absolutes (abs + nabs variants) from self-multiplications. ABS Alive2: http://volta.cs.utah.edu:8080/z/rwcc8W NABS Alive2: http://volta.cs.utah.edu:8080/z/jZXUwQ This is an InstCombine version of D79304 - I'm not sure yet if we'll need that after this. Reviewed By: @lebedev.ri and @xbolva00 Differential Revision: https://reviews.llvm.org/D79319	2020-05-04 15:21:52 +01:00
Jay Foad	e737847b8f	[SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func optimizePow does not create any new calls to pow, so it should work regardless of whether the pow library function is available. This allows it to optimize the llvm.pow intrinsic on targets with no math library. Based on a patch by Tim Renouf. Differential Revision: https://reviews.llvm.org/D68231	2020-05-04 10:54:07 +01:00
Simon Pilgrim	8e9a8dc185	[InstCombine] Add tests showing failure to fold mul(abs(x),abs(x)) -> mul(x,x) (PR39476) Includes abs() and nabs() variants	2020-05-04 10:24:18 +01:00
Jay Foad	6c42814a26	Precommit test updates for D68231.	2020-05-04 09:55:59 +01:00
Sanjay Patel	682f0b366b	[InstCombine] use select-of-constants with set/clear bit mask patterns Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C) Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0) The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem. This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581 define i8 @src(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %or = or i8 %x, %C %cond = select i1 %b, i8 %or, i8 %and ret i8 %cond } define i8 @tgt(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %mul = select i1 %b, i8 %C, i8 0 %or = or i8 %mul, %and ret i8 %or } http://volta.cs.utah.edu:8080/z/Vt2WVm Differential Revision: https://reviews.llvm.org/D78880	2020-05-03 09:44:43 -04:00
Nikita Popov	7c649b58f0	[InstCombine] Duplicate some InstSimplify tests (NFC) Duplicate some tests in preparation for D79294.	2020-05-03 12:49:36 +02:00
Sanjay Patel	7fa150203f	[InstCombine] fix miscompile from multi-use cttz/ctlz transform PR45762: https://bugs.llvm.org/show_bug.cgi?id=45762	2020-05-01 13:52:24 -04:00
Sanjay Patel	43b0e446fb	[InstCombine] add test for faulty cttz fold (PR45762); NFC	2020-05-01 13:52:23 -04:00
Simon Pilgrim	4548e62ca4	[InstCombine] Additional 'concat of ORs' BSWAP/BITREVERSE tests for D79041	2020-05-01 18:05:24 +01:00
Sanjay Patel	5486e00dc3	[InstSimplify] remove poison-unsafe insertelement of undef value PR45481: https://bugs.llvm.org/show_bug.cgi?id=45481 SDAG has an identical transform to this, so there's little chance of any real-world impact. OTOH, that means we are effectively sweeping the bug out of sight because poison exists in codegen too.	2020-05-01 09:22:05 -04:00
Sanjay Patel	5013a788f8	[InstCombine] adjust tests for pow(); NFC D68231 would change this, but the existing test doesn't cover what was probably intended (a libcall test).	2020-05-01 08:42:51 -04:00
Sanjay Patel	4a065a72ef	[InstCombine] add tests for bitcast+inselt; NFC	2020-04-30 09:11:29 -04:00
Sanjay Patel	35fe2814cf	[InstCombine] update auto-generated test checks; NFC	2020-04-30 08:39:02 -04:00
Sanjay Patel	2cfeaf3b2d	[InstCombine] add tests for FP->int->FP->FP casting; NFC	2020-04-30 07:41:28 -04:00
Simon Pilgrim	751a554f25	[InstCombine] Add PR45715 test case	2020-04-28 21:53:59 +01:00
Roman Lebedev	a0004358a8	[InstCombine] Negator: 'or' with no common bits set is just 'add' In `InstCombiner::visitAdd()`, we have ``` // A+B --> A|B iff A and B have no bits set in common. if (haveNoCommonBitsSet(LHS, RHS, DL, &AC, &I, &DT)) return BinaryOperator::CreateOr(LHS, RHS); ``` so we should handle such `or`'s here, too.	2020-04-28 19:16:32 +03:00
Roman Lebedev	a5f22f2b0e	[NFC][InstCombine] Tests for negation of 'or' with no common bits set	2020-04-28 19:16:31 +03:00
Sanjay Patel	54fe6c9599	[InstCombine] add tests for set/clear masked bits; NFC	2020-04-27 15:55:45 -04:00
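
Note on commit 682f0b366b (set/clear bit mask patterns): the two folds quoted in that commit message can be sanity-checked by brute force over all 8-bit inputs. The following is only an illustrative sketch in Python, not code from the LLVM tree; the function names and the unsigned 8-bit model of `i8` are assumptions made here, and the model ignores undef/poison, which the Alive2 link in the commit message covers.

```python
# Illustrative sketch (not LLVM code): exhaustive 8-bit check of the two
# select folds quoted in commit 682f0b366b. All names are hypothetical.
MASK = 0xFF  # model i8 values as unsigned 8-bit integers


def src_clear_then_set(cond, x, c):
    # Cond ? (X & ~C) : (X | C)
    return (x & ~c & MASK) if cond else (x | c)


def tgt_clear_then_set(cond, x, c):
    # (X & ~C) | (Cond ? 0 : C)
    return (x & ~c & MASK) | (0 if cond else c)


def src_set_then_clear(cond, x, c):
    # Cond ? (X | C) : (X & ~C)
    return (x | c) if cond else (x & ~c & MASK)


def tgt_set_then_clear(cond, x, c):
    # (X & ~C) | (Cond ? C : 0)
    return (x & ~c & MASK) | (c if cond else 0)


def folds_hold():
    """Return True if source and target agree for every 8-bit X, C and both Cond values."""
    for cond in (False, True):
        for x in range(256):
            for c in range(256):
                if src_clear_then_set(cond, x, c) != tgt_clear_then_set(cond, x, c):
                    return False
                if src_set_then_clear(cond, x, c) != tgt_set_then_clear(cond, x, c):
                    return False
    return True
```

Calling `folds_hold()` walks all 2 x 256 x 256 input combinations for both patterns. Both folds rest on the identity `(X & ~C) | C == X | C`, so the select only has to choose between `C` and `0`.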

4946 Commits