llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	0c5b6e2ea5	Recommit "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This patch should fix the cause of the stage2 failures and PR45185. This reverts the revert commit `c52f839e72`.	2020-03-13 17:03:22 +00:00
Simon Pilgrim	a2db388dce	[CostModel][X86] Improve ISD::CTTZ costs accounting for BSF/TZCNT implementations	2020-03-13 16:51:13 +00:00
Tyker	2543567c41	[AssumeBundles] filter usefull attriutes to preserve Summary: This patch will filter attributes to only preserve those that are usefull. In the case of NoAlias it is filtered out not because it isn't usefull but because it is incorrect to preserve it as it is only valdi for the duration of the function. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75828	2020-03-13 17:35:47 +01:00
Tyker	69375fd0a3	[AssumeBundles] Preserve Information in the inliner Summary: during inling Create and insert an llvm.assume with attributes to preserve them. to prevent any changes for now generation of llvm.assume is under a flag disabled by default. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75825	2020-03-13 17:35:47 +01:00
omarahmed1111	b285b333dc	[Attributor] Detect possibly unbounded cycles in functions This patch add mayContainUnboundedCycle helper function which checks whether a function has any cycle which we don't know if it is bounded or not. Loops with maximum trip count are considered bounded, any other cycle not. It also contains some fixed tests and some added tests contain bounded and unbounded loops and non-loop cycles. Reviewed By: jdoerfert, uenoku, baziotis Differential Revision: https://reviews.llvm.org/D74691	2020-03-13 11:17:33 -05:00
Pankaj Gode	bf990530ae	[Attributor] Improve noalias preservation using reachability Resolution for below fixme: (ii) Check whether the value is captured in the scope using AANoCapture. FIXME: This is conservative though, it is better to look at CFG and check only uses possibly executed before this callsite. Propagates caller argument's noalias attribute to callee. Reviewed by: jdoerfert, uenoku Reviewers: jdoerfert, sstefan1, uenoku Subscribers: uenoku, sstefan1, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D71617	2020-03-13 21:09:08 +05:30
Nico Weber	86eb2c3991	Revert "[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't" This reverts commit `1f5b471b8b`. Causes asserts when building code with arc. See https://bugs.chromium.org/p/chromium/issues/detail?id=1061289#c2 for a full repro. Will post a creduced repro once creduce is done running.	2020-03-13 10:16:02 -04:00
Juneyoung Lee	c39cb1c0dd	[CodeGenPrepare] Expand freeze conversion to support fcmp and icmp with null Summary: This is a simple patch that expands https://reviews.llvm.org/D75859 to pointer comparison and fcmp Checked with Alive2 Reviewers: reames, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76048	2020-03-13 17:21:33 +09:00
Juneyoung Lee	48b901b0e1	Add tests to Transforms/CodeGenPrepare/X86/freeze-cmp.ll before commiting D76048	2020-03-13 17:18:42 +09:00
Johannes Doerfert	a198adb490	[Attributor] IPO across definition boundary of a function marked alwaysinline Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D75590	2020-03-13 01:06:12 -05:00
Johannes Doerfert	40815a4957	Revert "[Attributor] Enable test with update check lines" This reverts commit `13def55b3f`. This broke a buildbot, will investigate.	2020-03-13 00:59:47 -05:00
Johannes Doerfert	1c9c23d60e	[OpenMP][Opt][NFC] Add test case for known runtime function attributes This test somehow did not make it in before.	2020-03-13 00:28:14 -05:00
Johannes Doerfert	13def55b3f	[Attributor] Enable test with update check lines The test disabled in `528a6a1d4c` is enabled again with the check lines for `9708279c72`.	2020-03-12 23:24:15 -05:00
Huihui Zhang	118abf2017	[SVE] Update API ConstantVector::getSplat() to use ElementCount. Summary: Support ConstantInt::get() and Constant::getAllOnesValue() for scalable vector type, this requires ConstantVector::getSplat() to take in 'ElementCount', instead of 'unsigned' number of element count. This change is needed for D73753. Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74386	2020-03-12 13:22:41 -07:00
Florian Hahn	c52f839e72	Revert "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This commit is likely causing clang-with-lto-ubuntu to fail http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/16052 Also causes PR45185. This reverts commit `f1ac5d2263`.	2020-03-12 18:49:11 +00:00
Sanjay Patel	a66dc755db	[InstSimplify] simplify FP ops harder with FMF (part 2) This is part of the IR sibling for: D75576 Related transform committed with: rG8ec71585719d	2020-03-12 09:53:20 -04:00
Sanjay Patel	8ec7158571	[InstSimplify] simplify FP ops harder with FMF This is part of the IR sibling for: D75576 (I'm splitting part of the transform as a separate commit to reduce risk. I don't know of any bugs that might be exposed by this improved folding, but it's hard to see those in advance...)	2020-03-12 09:13:28 -04:00
Sanjay Patel	d748e759d5	[InstSimplify] add tests for FP poison; NFC Adapted from codegen tests seen in D75576.	2020-03-12 08:32:04 -04:00
Florian Hahn	f1ac5d2263	[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI) This patch switches SCCP to use ValueLatticeElement for lattice values, instead of the local LatticeVal, as first step to enable integer range support. This patch does not make use of constant ranges for additional operations and the only difference for now is that integer constants are represented by single element ranges. To preserve the existing behavior, the following helpers are used * isConstant(LV): returns true when LV is either a constant or a constant range with a single element. This should return true in the same cases where LV.isConstant() returned true previously. * getConstant(LV): returns a constant if LV is either a constant or a constant range with a single element. This should return a constant in the same cases as LV.getConstant() previously. * getConstantInt(LV): same as getConstant, but additionally casted to ConstantInt. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60582	2020-03-12 12:03:06 +00:00
Max Kazantsev	3dc6e53c97	[LoopPeel] Turn incorrect assert into a check Summary: This patch replaces incorrectt assert with a check. Previously it asserts that if SCEV cannot prove `isKnownPredicate(A != B)`, then it should be able to prove `isKnownPredicate(A == B)`. Both these fact may be not provable. It is shown in the provided test: Could not prove: `{-294,+,-2}<%bb1> != 0` Asserting: `{-294,+,-2}<%bb1> == 0` Obviously, this SCEV is not equal to zero, but 0 is in its range so we cannot also prove that it is not zero. Instead of assert, we should be checking the required conditions explicitly. Reviewers: lebedev.ri, fhahn, sanjoy, fedor.sergeev Reviewed By: lebedev.ri Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76050	2020-03-12 17:23:07 +07:00
Juneyoung Lee	629cf3c1c5	Apply update_test_check.py to CodeGenPrepare/X86/freeze-icmp.ll test	2020-03-12 16:37:16 +09:00
Huihui Zhang	8f52573962	[InstSimplify][SVE] Fix SimplifyInsert/ExtractElementInst for scalable vector. Summary: For scalable vector, index out-of-bound can not be determined at compile-time. The same apply for VectorUtil findScalarElement(). Add test cases to check the functionality of SimplifyInsert/ExtractElementInst for scalable vector. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: cameron.mcinally, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75782	2020-03-11 15:09:56 -07:00
Sanjay Patel	fae900921b	[InstCombine] reduce demand-limited bool math to logic The cmp math test is inspired by memcmp() patterns seen in D75840. I know there's at least 1 related fold we can do here if both values are sext'd, but I'm not seeing a way to generalize further. We have some other bool math patterns that we want to reduce, but that might require fixing the bogus transforms noted in D72396. Alive proof translations of the regression tests: https://rise4fun.com/Alive/zGWi Name: demand add 1 %xz = zext i1 %x to i32 %ys = sext i1 %y to i32 %sub = add i32 %xz, %ys %r = lshr i32 %sub, 31 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = zext i1 %and to i32 Name: demand add 2 %xz = zext i1 %x to i5 %ys = sext i1 %y to i5 %sub = add i5 %xz, %ys %r = and i5 %sub, 16 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = select i1 %and, i5 -16, i5 0 Name: demand add 3 %xz = zext i1 %x to i8 %ys = sext i1 %y to i8 %a = add i8 %ys, %xz %r = ashr i8 %a, 7 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = sext i1 %and to i8 Name: cmp math %gt = icmp ugt i32 %x, %y %lt = icmp ult i32 %x, %y %xz = zext i1 %gt to i32 %yz = zext i1 %lt to i32 %s = sub i32 %xz, %yz %r = lshr i32 %s, 31 => %r = zext i1 %lt to i32 Differential Revision: https://reviews.llvm.org/D75961	2020-03-11 15:45:58 -04:00
Sanjay Patel	fa8c4c7ffa	[InstCombine] add tests for bool math; NFC	2020-03-11 15:45:58 -04:00
Juneyoung Lee	8eb2f865c3	[CodeGenPrepare] Fold br(freeze(icmp x, const)) to br(icmp(freeze x, const)) Summary: This patch helps CodeGenPrepare move freeze into the icmp when it is used by branch. It reenables generation of efficient conditional jumps. This is only done when at least one of icmp's operands is constant to prevent the transformation from increasing # of freeze instructions. Performance degradation of MultiSource/Benchmarks/Ptrdist/yacr2/yacr2.test is resolved with this patch. Checked with Alive2 Reviewers: reames, fhahn, nlopes Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75859	2020-03-12 03:16:15 +09:00
Florian Hahn	bc6c8c4bbb	[Matrix] Add remark propagation along the inlined-at chain. This patch adds support for propagating matrix expressions along the inlined-at chain and emitting remarks at the traversed function scopes. To motivate this new behavior, consider the example below. Without the remark 'up-leveling', we would only get remarks in load.h and store.h, but we cannot generate a remark describing the full expression in toplevel.cpp, which is the place where the user has the best chance of spotting/fixing potential problems. With this patch, we generate a remark for the load in load.h, one for the store in store.h and one for the complete expression in toplevel.cpp. For a bigger example, please see remarks-inlining.ll. load.h: template <typename Ty, unsigned R, unsigned C> Matrix<Ty, R, C> load(Ty Ptr) { Matrix<Ty, R, C> Result; Result.value = reinterpret_cast <typename Matrix<Ty, R, C>::matrix_t >(Ptr); return Result; } store.h: template <typename Ty, unsigned R, unsigned C> void store(Matrix<Ty, R, C> M1, Ty Ptr) { reinterpret_cast<typename decltype(M1)::matrix_t >(Ptr) = M1.value; } toplevel.cpp void test(double A, double B, double *C) { store(add(load<double, 3, 5>(A), load<double, 3, 5>(B)), C); } For a given function, we traverse the inlined-at chain for each matrix instruction (= instructions with shape information). We collect the matrix instructions in each DISubprogram we visit. This produces a mapping of DISubprogram -> (List of matrix instructions visible in the subpogram). We then generate remarks using the list of instructions for each subprogram in the inlined-at chain. Note that the list of instructions for a subprogram includes the instructions from its own subprograms recursively. For example using the example above, for the subprogram 'test' this includes inline functions 'load' and 'store'. This allows surfacing the remarks at a level useful to users. Please note that the current approach may create a lot of extra remarks. Additional heuristics to cut-off the traversal can be implemented in the future. For example, it might make sense to stop 'up-leveling' once all matrix instructions are at the same debug location. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D73600	2020-03-11 17:40:08 +00:00
Fangrui Song	a0c0389ffb	[SimplifyLibcalls] Don't replace locked IO (fgetc/fgets/fputc/fputs/fread/fwrite) with unlocked IO (_unlocked) This essentially reverts some of the SimplifyLibcalls part changes of D45736 [SimplifyLibcalls] Replace locked IO with unlocked IO. C11 7.21.5.2 The fflush function > If stream is a null pointer, the fflush function performs this flushing action on all streams for which the behavior is defined above. i.e. fopen'ed FILE is inherently captured. POSIX.1-2017 getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked - stdio with explicit client locking > These functions can safely be used in a multi-threaded program if and only if they are called while the invoking thread owns the ( FILE ) object, as is the case after a successful call to the flockfile() or ftrylockfile() functions. After a thread fopen'ed a FILE, when it is calling foobar() which is now replaced by foobar_unlocked(), if another thread is concurrently calling fflush(0), the behavior is undefined. C11 7.22.4.4 The exit function > Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed. The replacement is only feasible if the program is single threaded, or exit or fflush(0) is never called. See also http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556615.html for how the replacement makes libc interceptors difficult to implement. dalias: in a worst case, it's unbounded data corruption because of concurrent access to pointers without synchronization. f->wpos or rpos could get outside of the buffer, thread A could do f->wpos += j after knowing j is in bounds, while thread B also changes it concurrently. This can produce exploitable conditions depending on libc internals. Revert the SimplifyLibcalls part change because the cons obviously overweigh the pros. Even when the replacement is feasible, the benefit is indemonstrable, more so in an application instead of an artificial glibc benchmark. Theoretically the replacement could be beneficial when calling getc_unlocked/putc_unlocked in a loop, but then it is better using a blocked IO operation and the user is likely aware of that. The function attribute inference is still useful and thus kept. Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D75933	2020-03-10 11:11:58 -07:00
Tyker	a4cde9ad7b	Fixed [AssumeBundles] Move to IR so it can be used by Analysis This is a recommit of `57c964aaa7` after fixing modules build.	2020-03-10 18:02:39 +01:00
Simon Moll	d871ef4e6a	[instcombine] remove fsub to fneg hacks; only emit fneg Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp negation. This also extends the scalarization cost in instcombine for unary operators to result in the same IR rewrites for fneg as for the idiom. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75467	2020-03-10 16:57:02 +01:00
Florian Hahn	c8c14d979a	[InstCombine] Support vectors in SimplifyAddWithRemainder. SimplifyAddWithRemainder currently also matches for vector types, but tries to create an integer constant, which causes a crash. By using Constant::getIntegerValue() we can support both the scalar and vector cases. The 2 added test cases crash without the fix. Reviewers: spatel, lebedev.ri Reviewed By: spatel, lebedev.ri Differential Revision: https://reviews.llvm.org/D75906	2020-03-10 14:29:40 +00:00
Jonas Paulsson	c2dafe12dc	[SimplifyCFG] Skip merging return blocks if it would break a CallBr. SimplifyCFG should not merge empty return blocks and leave a CallBr behind with a duplicated destination since the verifier will then trigger an assert. This patch checks for this case and avoids the transformation. CodeGenPrepare has a similar check which also has a FIXME comment about why this is needed. It seems perhaps better if these two passes would eventually instead update the CallBr instruction instead of just checking and avoiding. This fixes https://bugs.llvm.org/show_bug.cgi?id=45062. Review: Craig Topper Differential Revision: https://reviews.llvm.org/D75620	2020-03-10 14:59:13 +01:00
Sanjay Patel	6e60e1025f	[InstCombine] regenerate test checks; NFC tmp -> t because 'tmp' tends to cause problems for the auto-generation script.	2020-03-10 09:57:41 -04:00
Sanjay Patel	467eec0910	[InstCombine] fold gep-of-select-of-constants (PR45084) As shown in: https://bugs.llvm.org/show_bug.cgi?id=45084 ...we failed to combine a gep with constant indexes with a pointer operand that is a select of constants. Differential Revision: https://reviews.llvm.org/D75807	2020-03-10 09:25:13 -04:00
Sanjay Patel	5b465ad290	[InstCombine] add/adjust tests for select-gep; NFC Goes with D75807	2020-03-10 09:25:13 -04:00
Florian Hahn	2d6ecf4648	[SLP] Support vectorizing functions provided by vector libs. It seems like the SLPVectorizer is currently not aware of vector versions of functions provided by libraries like Accelerate [1]. This patch updates SLPVectorizer to use the same infrastructure the LoopVectorizer uses to detect vectorizable library functions. For calls, it computes the cost of an intrinsic call (existing behavior) and the cost of a vector function library call, if available. Like LoopVectorizer, it assumes the cost of the vector function is simply the cost of a call to a vector function. [1] https://developer.apple.com/documentation/accelerate Reviewers: ABataev, RKSimon, spatel Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D75878	2020-03-10 13:10:50 +00:00
Simon Pilgrim	5cbddf7cbc	[X86][SSE] Add more accurate costs for fmaxnum/fminnum codegen Based off llvm-mca reports on codegen in llvm\test\CodeGen\X86\fmaxnum.ll + llvm\test\CodeGen\X86\fminnum.ll	2020-03-10 11:59:40 +00:00
Simon Pilgrim	9b05596eff	[SLPVectorizer][X86] Add fmaxnum/fminnum tests	2020-03-10 11:18:28 +00:00
Florian Hahn	b53907bfed	[SLP] Precommit vector library test for D75878.	2020-03-10 10:17:34 +00:00
ahatanak	1f5b471b8b	[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't a tail call Previosly ARC optimizer removed the autoreleaseRV/retainRV pair in the following code, which caused the object returned by @something to be placed in the autorelease pool because the call to @something isn't a tail call: ``` %call = call i8* @something(...) %2 = call i8* @objc_retainAutoreleasedReturnValue(i8* %call) %3 = call i8* @objc_autoreleaseReturnValue(i8* %2) ret i8* %3 ``` Fix the bug by checking whether @something is a tail call. rdar://problem/59275894	2020-03-09 13:21:38 -07:00
JF Bastien	8fc9eea43a	Test that volatile load type isn't changed Summary: As discussed in D75505, it's not particularly useful to change the type of a load to/from floating-point/integer because it's followed by a bitcast, and it might lead to surprising code generation. Check that this doesn't generally happen. Reviewers: lebedev.ri Subscribers: jkorous, dexonsmith, ributzka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75644	2020-03-09 11:19:23 -07:00
Nikita Popov	45555c3819	[InstSimplify] Simplify calls with "returned" attribute If a call argument has the "returned" attribute, we can simplify the call to the value of that argument. The "-inst-simplify" pass already handled this for the constant integer argument case via known bits, which is invoked in SimplifyInstruction. However, non-constant (or non-int) arguments are not handled at all right now. This addresses one of the regressions from D75801. Differential Revision: https://reviews.llvm.org/D75815	2020-03-09 18:53:47 +01:00
Nikita Popov	c3ca6876ed	[InstCombine] Don't simplify calls without uses When simplifying a call without uses, replaceInstUsesWith() is going to do nothing, but we'll skip all following folds. We can only run into this problem with calls that both simplify and are not trivially dead if unused, which currently seems to happen only with calls to undef, as the test diff shows. When extending SimplifyCall() to handle "returned" attributes, this becomes a much bigger problem, so I'm fixing this first. Differential Revision: https://reviews.llvm.org/D75814	2020-03-09 18:47:46 +01:00
Nikita Popov	829d377a98	[InstSimplify] Don't simplify musttail calls As pointed out by jdoerfert on D75815, we must be careful when simplifying musttail calls: We can only replace the return value if we can eliminate the call entirely. As we can't make this guarantee for all consumers of InstSimplify, this patch disables simplification of musttail calls. Without this patch, musttail simplification currently results in module verification errors. Differential Revision: https://reviews.llvm.org/D75824	2020-03-09 18:46:56 +01:00
Jonas Devlieghere	882f589e20	Revert "[AssumeBundles] Move to IR so it can be used by Analysis" This breaks the modules build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/ http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/ This reverts commit `57c964aaa7`.	2020-03-09 09:02:47 -07:00
evgeny	6d2032e259	[WPD] Provide a way to prevent functions from being devirtualized Differential revision: https://reviews.llvm.org/D75617	2020-03-09 14:05:15 +03:00
Clement Courbet	6518b72f93	[ExpandMemCmp] Properly constant-fold all compares. Summary: This gets rid of duplicated code and diverging behaviour w.r.t. constants. Fixes PR45086. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75519	2020-03-09 10:40:52 +01:00
Clement Courbet	f7e6f5f8e3	[ExpandMemCmp] Properly constant-fold all compares. Summary: This gets rid of duplicated code and diverging behaviour w.r.t. constants. Fixes PR45086. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75519	2020-03-09 09:10:34 +01:00
Hideto Ueno	bdcbdb4848	[Attributor] Deduction based on path exploration This patch introduces the propagation of known information based on path exploration. For example, ``` int u(int c, int p){ if(c) { return p; } else { return *p + 1; } } ``` An argument `p` is dereferenced whatever c's value is. For an instruction `CtxI`, we accumulate branch instructions in the must-be-executed-context of `CtxI` and then, we take the conjunction of the successors' known state. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65593	2020-03-09 14:29:26 +09:00
Sanjay Patel	a69158c12a	[VectorCombine] fold extract-extract-op with different extraction indexes opcode (extelt V0, Ext0), (ext V1, Ext1) --> extelt (opcode (splat V0, Ext0), V1), Ext1 The first part of this patch generalizes the cost calculation to accept different extraction indexes. The second part creates a shuffle+extract before feeding into the existing code to create a vector op+extract. The patch conservatively uses "TargetTransformInfo::SK_PermuteSingleSrc" rather than "TargetTransformInfo::SK_Broadcast" (splat specifically from element 0) because we do not have a more general "SK_Splat" currently. That does not affect any of the current regression tests, but we might be able to find some cost model target specialization where that comes into play. I suspect that we can expose some missing x86 horizontal op codegen with this transform, so I'm speculatively adding a debug flag to disable the binop variant of this transform to allow easier testing. The test changes show that we're sensitive to cost model diffs (as we should be), so that means that patches like D74976 should have better coverage. Differential Revision: https://reviews.llvm.org/D75689	2020-03-08 09:57:55 -04:00
Sanjay Patel	b827a95b87	[VectorCombine] add tests for wider vectors; NFC	2020-03-08 09:33:07 -04:00
Tyker	57c964aaa7	[AssumeBundles] Move to IR so it can be used by Analysis Summary: Assume bundles need to be usable by Analysis and Transforms/Utils isn't. so this commit moves utilities to deal with asusme bundles to IR. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75618	2020-03-08 12:21:50 +01:00
Nikita Popov	51a466a61f	[InstCombine] Fix known bits handling in SimplifyDemandedUseBits Fixes a regression from D75801. SimplifyDemandedUseBits() is also supposed to compute the known bits (of the demanded subset) of the instruction. For unknown instructions it does so by directly calling computeKnownBits(). For known instructions it will compute known bits itself. However, for instructions where only some cases are handled directly (e.g. a constant shift amount) the known bits invocation for the unhandled case is sometimes missing. This patch adds the missing calls and thus removes the main discrepancy with ExpensiveCombines mode. Differential Revision: https://reviews.llvm.org/D75804	2020-03-07 18:16:41 +01:00
Nikita Popov	f2419adc48	[InstCombine] Regenerate test checks; NFC	2020-03-07 17:58:33 +01:00
Nikita Popov	d2dab92f01	[InstSimplify] Add tests for "returned" attribute; NFC	2020-03-07 17:17:21 +01:00
Nikita Popov	2904a332fe	[InstCombine] Add additional known bits folding tests; NFC	2020-03-07 17:17:03 +01:00
Nikita Popov	4cfb4afb70	[InstCombine] Highlight tests using expensive combines; NFC	2020-03-07 17:16:47 +01:00
Sanjay Patel	89fdee87f7	[InstCombine] regenerate complete test checks; NFC	2020-03-07 10:20:38 -05:00
Sanjay Patel	564f5eed1a	[InstCombine] add test for gep (select),... (PR45084); NFC	2020-03-07 10:00:31 -05:00
Stefanos Baziotis	01c48d7d11	[Attributor] Fold terminators before changing instructions to unreachable It is possible that an instruction to be changed to unreachable is in the same block with a terminator that can be constant-folded. In this case, as of now, the instruction will be changed to unreachable before the terminator is folded. But, then the whole BB becomes invalidated and so when we go ahead to fold the terminator, we trap. Change the order of these two. Differential Revision: https://reviews.llvm.org/D75780	2020-03-07 12:38:44 +02:00
Anna Thomas	59029b9eef	[RS4GC] Handle uses of extractelement for conversion from vector to scalar base As mentioned in the comments, extractelement is special since we actually want a scalar base for that element we extracted from the vector (i.e. not a vector base). This same logic should apply to uses of the extractelement such as phis and selects which have the same BDV as the extractelement. Howeber, for these uses we conservatively mark the BDV state as conflict, since setting the EE's new base BDV does not always dominate these uses. Added testcase showcases the problem where the BDV identification chokes on the incorrect cast from vector to scalar for the phi use of extractelement. Tests-Run: make check, internal fuzzer testing Reviewers: reames, skatkov, dantrushin Reviewed-By: dantrushin Differential Revision: https://reviews.llvm.org/D75704	2020-03-06 16:28:49 -05:00
Roman Lebedev	1badf7c33a	[InstComine] Forego of one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant Summary: This is potentially more friendly for further optimizations, analysies, e.g.: https://godbolt.org/z/G24anE This resolves phase-ordering bug that was introduced in D75145 for https://godbolt.org/z/2gBwF2 https://godbolt.org/z/XvgSua Reviewers: spatel, nikic, dmgreen, xbolva00 Reviewed By: nikic, xbolva00 Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75757	2020-03-06 21:39:07 +03:00
Roman Lebedev	69ec84f8e7	[NFC][InstCombine] Add 'x - (x & y)' tests with multi-use 'and' If %y is constant, we could still perform the fold	2020-03-06 19:41:19 +03:00
Fangrui Song	952ee0df9e	ThinLTOBitcodeWriter: drop dso_local when a GlobalVariable is converted to a declaration If we infer the dso_local flag for -fpic, dso_local should be dropped when we convert a GlobalVariable a declaration. dso_local causes the generation of direct access (e.g. R_X86_64_PC32). Such relocations referencing STB_GLOBAL STV_DEFAULT objects are not allowed in a -shared link. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74749	2020-03-05 18:09:33 -08:00
Zhongduo Lin	eae228a292	[IndVarSimplify] Extend previous special case for load use instruction to any narrow type loop variant to avoid extra trunc instruction Summary: The widenIVUse avoids generating trunc by evaluating the use as AddRec, this will not work when: 1) SCEV traces back to an instruction inside the loop that SCEV can not expand, eg. add %indvar, (load %addr) 2) SCEV finds a loop variant, eg. add %indvar, %loopvariant While SCEV fails to avoid trunc, we can still try to use instruction combining approach to prove trunc is not required. This can be further extended with other instruction combining checks, but for now we handle the following case (sub can be "add" and "mul", "nsw + sext" can be "nus + zext") ``` Src: %c = sub nsw %b, %indvar %d = sext %c to i64 Dst: %indvar.ext1 = sext %indvar to i64 %m = sext %b to i64 %d = sub nsw i64 %m, %indvar.ext1 ``` Therefore, as long as the result of add/sub/mul is extended to wide type with right extension and overflow wrap combination, no trunc is required regardless of how %b is generated. This pattern is common when calculating address in 64 bit architecture. Note that this patch reuse almost all the code from D49151 by @az: https://reviews.llvm.org/D49151 It extends it by providing proof of why trunc is unnecessary in more general case, it should also resolve some of the concerns from the following discussion with @reames. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180910/585945.html Reviewers: sanjoy, efriedma, sebpop, reames, az, javed.absar, amehsan Reviewed By: az, amehsan Subscribers: hiraditya, llvm-commits, amehsan, reames, az Tags: #llvm Differential Revision: https://reviews.llvm.org/D73059	2020-03-05 16:27:59 -05:00
Juneyoung Lee	d7267ee194	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators Summary: ``` br i1 c, BB1, BB2: BB1: use1(c) BB2: use2(c) ``` In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB. This is a resubmission of `952ad47` with crash fix of llvm/test/Transforms/LoopRotate/freeze-crash.ll. Checked with Alive2 Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75401	2020-03-06 01:08:35 +09:00
Sanjay Patel	85ae5aa6ff	[VectorCombine] add tests for different extract indexes; NFC	2020-03-05 10:33:21 -05:00
Sanjay Patel	59196f8452	[VectorCombine] add x86 AVX run to test for better coverage; NFC	2020-03-05 07:54:31 -05:00
Daniil Suchkov	f35a898f5f	[Test] Add a regression test for failure introduced by `952ad4701c`	2020-03-05 16:32:37 +07:00
Daniil Suchkov	3db48f9324	Revert "[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators" That commit causes SIGSEGV on some simple tests. This reverts commit `952ad4701c`.	2020-03-05 16:32:36 +07:00
Jun Ma	b10deb9487	[Coroutines] Optimized coroutine elision based on reachability Differential Revision: https://reviews.llvm.org/D75440	2020-03-05 14:43:50 +08:00
Sameer Sahasrabuddhe	42febbab91	StructurizeCFG: simplify phi nodes when possible After structurization, some phi nodes can have a single incoming edge and can be simplified away. This change runs a simplify query on all phis that are either modified or added by the structurizer. This also moves some phis closer to their use as a side benefit. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D75500	2020-03-05 10:33:15 +05:30
Nikita Popov	c6ff3c9bad	[InstSimplify] Constant fold icmp of gep InstSimplify can fold icmps of gep where the base pointers are the same and the offsets are constant. It does so by constructing a constant expression icmp and assumes that it gets folded -- but this doesn't actually happen, because GEP expressions can usually only be folded by the target-dependent constant folding layer. As such, we need to explicitly invoke it here. Differential Revision: https://reviews.llvm.org/D75407	2020-03-04 23:16:52 +01:00
Nikita Popov	9b5de84e27	[InstCombine] Use IRBuilder to create bitcast This makes sure that the constant expression bitcast goes through target-dependent constant folding, and thus avoids an additional iteration of InstCombine.	2020-03-04 18:28:38 +01:00
Nikita Popov	17be8e4a6f	[ConstProp] Add test for bitcast to gep fold; NFC	2020-03-04 18:27:20 +01:00
Nikita Popov	a99b97b818	[InstSimplify] Add additional icmp of gep folding test; NFC	2020-03-04 18:27:01 +01:00
Nikita Popov	0940c32385	[InstSimplify] Regenerate compare.ll checks; NFC	2020-03-04 18:26:42 +01:00
Sanjay Patel	71a316883d	[PassManager] adjust VectorCombine placement The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs: https://bugs.llvm.org/show_bug.cgi?id=45015 https://bugs.llvm.org/show_bug.cgi?id=42022 This patch contains a few independent changes: 1. Move the pass up in the pipeline, so it happens just after loop-vectorization. This is only to keep vectorization passes together in the pipeline at the moment. I don't have evidence of interaction between these yet. 2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was partly proposed as far back as rL219644 (which is why it's effectively being moved in the old PM code). This is important because the subsequent -instcombine doesn't work as well without EarlyCSE. With the CSE, -instcombine is able to squash shuffles together in 1 of the tests (because those are simple "select" shuffles). 3. Remove the -vector-combine pass that was running after SLP. We may want to do that eventually, but I don't have a test case to support it yet. Differential Revision: https://reviews.llvm.org/D75145	2020-03-04 11:10:49 -05:00
David Green	587feec07e	[ARM] Change all tests from "thumbv8.1-m.main" to "thumbv8.1m.main". NFC	2020-03-04 13:47:35 +00:00
Juneyoung Lee	952ad4701c	[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators Summary: ``` br i1 c, BB1, BB2: BB1: use1(c) BB2: use2(c) ``` In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB. Checked with Alive2 Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75401	2020-03-04 11:43:31 +09:00
Brian Gesiak	aa85b437a9	[Coroutines] Use dbg.declare for frame variables Summary: https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f is an example of a small C++ program that uses C++20 coroutines that is difficult to debug, due to the loss of debug info for variables that "spill" across coroutine suspension boundaries. This patch addresses that issue by inserting 'llvm.dbg.declare' intrinsics that point the debugger to the variables' location at an offset to the coroutine frame. With this patch, I confirmed that running the 'frame variable' commands in https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f at the specified breakpoints results in the correct values being printed for coroutine frame variables 'i' and 'j' when using an lldb built from trunk, as well as with gdb 8.3 (lldb 9.0.1, however, could not print the values). The added test case also verifies this improved behavior. The existing coro-debug.ll test case is also modified to reflect the locations at which Clang actually places calls to 'dbg.declare', and additional checks are added to ensure this patch works as intended in that example as well. Reviewers: vsk, jmorse, GorNishanov, lewissbaker, wenlei Subscribers: EricWF, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75338	2020-03-03 17:13:46 -05:00
David Green	ec7e4a9a80	[LoopVectorizer] Add reduction tests for inloop reductions. NFC Also adds a force-reduction-intrinsics option for testing, for forcing the generation of reduction intrinsics even when the backend is not requesting them.	2020-03-03 10:54:00 +00:00
Roman Lebedev	9e1443e6f6	[NFC][InstCombine] Add test with non-CSE'd casts of load in @t0 we can still change type of load and get rid of casts.	2020-03-03 11:27:27 +03:00
Alok Kumar Sharma	6f029dadf6	[DebugInfo] Avoid generating duplicate llvm.dbg.value Summary: This is to avoid generating duplicate llvm.dbg.value instrinsic if it already exists after the Instruction. Before inserting llvm.dbg.value instruction, LLVM checks if the same instruction is already present before the instruction to avoid duplicates. Currently it misses to check if it already exists after the instruction. flang generates IR like this. %4 = load i32, i32* %i1_311, align 4, !dbg !42 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 When this IR is processed in llvm, it ends up inserting duplicates. %4 = load i32, i32* %i1_311, align 4, !dbg !42 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 We have now updated LdStHasDebugValue to include the cases when instruction is already followed by same dbg.value instruction we intend to insert. Now, Definition and usage of function LdStHasDebugValue are deleted. RemoveRedundantDbgInstrs is called for the cleanup of duplicate dbg.value's Testing: Added unit test for validation check-llvm check-debuginfo (the debug info integration tests) Reviewers: aprantl, probinson, dblaikie, jmorse, jini.susan.george SouraVX, awpandey, dstenb, vsk Reviewed By: aprantl, jmorse, dstenb, vsk Differential Revision: https://reviews.llvm.org/D74030	2020-03-03 09:56:45 +05:30
Juneyoung Lee	9f1f244d3c	[LICM] Allow freeze to hoist/sink out of a loop Summary: This patch allows LICM to hoist/sink freeze instructions out of a loop. Reviewers: reames, fhahn, efriedma Reviewed By: reames Subscribers: jfb, lebedev.ri, hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75400	2020-03-03 12:29:39 +09:00
Teresa Johnson	80bf137fa1	Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"" This reverts commit `80d0a137a5`, and the follow on fix in `873c0d0786`. It is causing test failures after a multi-stage clang bootstrap. See discussion on D73242 and D75201.	2020-03-02 14:02:13 -08:00
Arkady Shlykov	3dcaf296ae	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: RKSimon, xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-03-02 08:37:11 -08:00
Reid Kleckner	1adbe86d87	[WinEH] Fix inttoptr+phi optimization in presence of catchswitch getFirstInsertionPt's return value must be checked for validity before casting it to Instruction*. Don't attempt to insert casts after a phi in a catchswitch block. Fixes PR45033, introduced in D37832. Reviewed By: davidxl, hfinkel Differential Revision: https://reviews.llvm.org/D75381	2020-03-01 07:49:28 -08:00
Jun Ma	624dbfcc1b	[Coroutines][New pass manager] Move CoroElide pass to right position Differential Revision: https://reviews.llvm.org/D75345	2020-03-01 21:48:24 +08:00
Jun Ma	44d83671c5	Revert "[Coroutines][new pass manager] Move CoroElide pass to right position" This reverts commit `4c0a133a41`.	2020-03-01 21:37:41 +08:00
Jun Ma	4c0a133a41	[Coroutines][new pass manager] Move CoroElide pass to right position Differential Revision: https://reviews.llvm.org/D75345	2020-03-01 20:55:38 +08:00
Juneyoung Lee	5cbb265694	[GVN] Fold equivalent freeze instructions Summary: This patch defines two freeze instructions to have the same value number if they are equivalent. This is allowed because GVN replaces all uses of a duplicated instruction with another. If it partially rewrites use, it is not allowed. e.g) ``` a = freeze(x) b = freeze(x) use(a) use(a) use(b) => use(a) use(b) // This is not allowed! use(b) ``` Reviewers: fhahn, reames, spatel, efriedma Reviewed By: fhahn Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75398	2020-03-01 07:32:05 +09:00
Simon Pilgrim	7e9747b50b	[X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554) This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default. Differential Revision: https://reviews.llvm.org/D75162	2020-02-29 18:57:35 +00:00
Austin Kerbow	4fa63fd452	[VectorCombine] Fix assert on compare extract index Extract index could be a differnet integral type. Differential Revision: https://reviews.llvm.org/D75327	2020-02-28 10:37:08 -08:00
Alexey Bataev	afa45d23e9	[SLP]Update test checks, NFC.	2020-02-28 13:25:44 -05:00
Hiroshi Yamauchi	f16d2bec40	Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration. This aims to fix a missed inlining case. If there's a virtual call in the callee on an alloca (stack allocated object) in the caller, and the callee is inlined into the caller, the post-inline cleanup would devirtualize the virtual call, but if the next iteration of DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is based on a heuristic to determine whether to reiterate, we may miss inlining the devirtualized call. This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp. This is a second commit after a revert https://reviews.llvm.org/rG4569b3a86f8a4b1b8ad28fe2321f936f9d7ffd43 and a fix https://reviews.llvm.org/rG41e06ae7ba91. Differential Revision: https://reviews.llvm.org/D69591	2020-02-28 09:43:32 -08:00
Teresa Johnson	f9ca75f19b	[Inliner] Inlining should honor nobuiltin attributes Summary: Final patch in series to fix inlining between functions with different nobuiltin attributes/options, which was specifically an issue in LTO. See discussion on D61634 for background. The prior patch in this series (D67923) enabled per-Function TLI construction that identified the nobuiltin attributes. Here I have allowed inlining to proceed if the callee's nobuiltins are a subset of the caller's nobuiltins, but not in the reverse case, which should be conservatively correct. This is controlled by a new option, -inline-caller-superset-nobuiltin, which is enabled by default. Reviewers: hfinkel, gchatelet, chandlerc, davidxl Subscribers: arsenm, jvesely, nhaehnle, mehdi_amini, eraman, hiraditya, haicheng, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74162	2020-02-28 07:34:14 -08:00
Pierre-vh	2809abbd98	[Transform][MemCpyOpt] Add missing DebugLoc to %tmpbitcast Fix for https://bugs.llvm.org/show_bug.cgi?id=37967 Differential Revision: https://reviews.llvm.org/D75173	2020-02-28 15:20:51 +00:00
Juneyoung Lee	cc28a75467	Let EarlyCSE fold equivalent freeze instructions Summary: This patch makes EarlyCSE fold equivalent freeze instructions. Another optimization that I think will be useful is to remove freeze if its operand is used as a branch condition or at llvm.assume: ``` %c = ... br i1 %c, label %A, .. A: %d = freeze %c ; %d can be optimized to %c because %c cannot be poison or undef (or 'br %c' would be UB otherwise) ``` If it make sense for EarlyCSE to support this as well, I will make a patch for this. Reviewers: spatel, reames, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75334	2020-02-28 20:35:20 +09:00
Hans Wennborg	d48c981697	SROA: Don't drop atomic load/store alignments (PR45010) SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions are set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266	2020-02-28 10:38:40 +01:00
Jun Ma	43c8307c6c	[Coroutines] CoroElide enhancement Fix regression of CoreElide pass when current function is coroutine. Differential Revision: https://reviews.llvm.org/D71663	2020-02-28 10:41:59 +08:00
Juneyoung Lee	2b5a897651	Revert "[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison" .. due to performance regression. This patch is reverted until infrastructore for CSE/LICM support for freeze is added. This reverts commit `181628b`	2020-02-28 11:10:46 +09:00
Eli Friedman	b299926453	[IndVars] Fix sort comparator. std::sort will compare an element to itself in some cases. We should not crash if this happens. Differential Revision: https://reviews.llvm.org/D75000	2020-02-27 17:25:18 -08:00
Artur Pilipenko	02e3d5c3a2	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2): a = calloc(n+1) for (int i = 0; i < n; i++) { store 1, a[i+1] // (1) store 0, a[i] // (2) } The fix is to do PHI transaltion while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
Nikita Popov	4ef272ec9c	[InstCombine] DCE instructions earlier When InstCombine initially populates the worklist, it already performs constant folding and DCE. However, as the instructions are initially visited in program order, this DCE can pick up only the last instruction of a dead chain, the rest would only get picked up in the main InstCombine run. To avoid this, we instead perform the DCE in separate pass over the collected instructions in reverse order, which will allow us to pick up full dead instruction chains. We already need to do this reverse iteration anyway to populate the worklist, so this shouldn't add extra cost. This by itself only fixes a small part of the problem though: The same basic issue also applies during the main InstCombine loop. We generally always want DCE to occur as early as possible, because it will allow one-use folds to happen. Address this by also performing DCE while adding deferred instructions to the main worklist. This drops the number of tests that perform more than 2 InstCombine iterations from ~80 to ~40. There's some spurious test changes due to operand order / icmp toggling. Differential Revision: https://reviews.llvm.org/D75008	2020-02-27 18:45:59 +01:00
Simon Moll	ddd11273d9	Remove BinaryOperator::CreateFNeg Use UnaryOperator::CreateFNeg instead. Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75130	2020-02-27 09:06:03 -08:00
Simon Pilgrim	080890a9f3	[InstCombine] Add PR14365 test cases + vector equivalents.	2020-02-27 15:54:14 +00:00
Simon Pilgrim	168a44a70e	[CostModel][X86] Improve extract/insert element costs (PR43605) This tries to improve the accuracy of extract/insert element costs by accounting for subvector extraction/insertion for >128-bit vectors and the shuffling of elements to/from the 0'th index. It also adds INSERTPS for f32 types and PINSR/PEXTR costs for integer types (at the moment we assume the same cost as MOVD/MOVQ - which isn't always true). Differential Revision: https://reviews.llvm.org/D74976	2020-02-27 15:54:13 +00:00
Bardia Mahjour	1b811ff8a9	[DA] Delinearization of fixed-size multi-dimensional arrays Summary: Currently the dependence analysis in LLVM is unable to compute accurate dependence vectors for multi-dimensional fixed size arrays. This is mainly because the delinearization algorithm in scalar evolution relies on parametric terms to be present in the access functions. In the case of fixed size arrays such parametric terms are not present, but we can use the indexes from GEP instructions to recover the subscripts for each dimension of the arrays. This patch adds this ability under the existing option `-da-disable-delinearization-checks`. Authored By: bmahjour Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu Reviewed By: Meinersbur Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72178	2020-02-27 10:29:01 -05:00
Kirill Bobyrev	4569b3a86f	Revert "Devirtualize a call on alloca without waiting for post inline cleanup and next" This reverts commit `59fb9cde7a`. The patch caused internal miscompilations.	2020-02-27 15:58:39 +01:00
Kirill Bobyrev	51b5b567cc	Require asserts for debuginline-cost-delta.ll test -debug-only=inline-cost does not exist in optimized builds without asserts and therefore the test fails for such configurations. Related revision: `c965fd942f`	2020-02-27 13:20:21 +01:00
Nemanja Ivanovic	b9f3686056	Fix buildbot break after `c46b85aaf4` I added test cases that rely on the availability of the PPC target into the general directory for the loop vectorizer. This causes failures on bots that don't build the PPC target. Moving them to the PowerPC directory to fix this.	2020-02-26 21:56:11 -06:00
Nemanja Ivanovic	c46b85aaf4	[LoopVectorize] Fix cost for calls to functions that have vector versions A recent commit (https://reviews.llvm.org/rG66c120f02560ef528a60924104ead66f330190f1) changed the cost for calls to functions that have a vector version for some vectorization factor. However, no check is performed for whether the vectorization factor matches the current one being cost modeled. This leads to attempts to widen call instructions to a vectorization factor for which such a function does not exist, which in turn leads to an assertion failure. This patch adds the check for vectorization factor (i.e. not just that the called function has a vector version for some VF, but that it has a vector version for this VF). Differential revision: https://reviews.llvm.org/D74944	2020-02-26 21:39:11 -06:00
Kirill Naumov	c965fd942f	Cost Annotation Writer for InlineCost Add extra diagnostics for the inline cost analysis under -print-instruction-deltas cl option. When enabled along with -debug-only=inline-cost it prints the IR of inline candidate annotated with cost and threshold change per every instruction. Reviewed By: apilipenko, davidxl, mtrofin Differential Revision: https://reviews.llvm.org/D71501	2020-02-26 17:03:52 -08:00
Nikita Popov	9d9633fb70	[CVP] Simplify cmp of local phi node CVP currently does not simplify cmps with instructions in the same block, because LVI getPredicateAt() currently does not provide much useful information for that case (D69686 would change that, but is stuck.) However, if the instruction is a Phi node, then LVI can compute the result of the predicate by threading it into the predecessor blocks, which allows it simplify some conditions that nothing else can handle. Relevant code: `6d6a4590c5/llvm/lib/Analysis/LazyValueInfo.cpp (L1904-L1927)` Differential Revision: https://reviews.llvm.org/D72169	2020-02-26 20:36:41 +01:00
Nikita Popov	3e440545dc	[CVP] Add test for cmp of local phi; NFC	2020-02-26 20:32:59 +01:00
Nikita Popov	56f7de5baa	[InstCombine] Remove trivially empty ranges from end InstCombine removes pairs of start+end intrinsics that don't have anything in between them. Currently this is done by starting at the start intrinsic and scanning forwards. This patch changes it to start at the end intrinsic and scan backwards. The motivation here is as follows: When we process the start intrinsic, we have not yet looked at the following instructions, which may still get folded/removed. If they do, we will only be able to remove the start/end pair on the next iteration. When we process the end intrinsic, all the instructions before it have already been visited, and we don't run into this problem. Differential Revision: https://reviews.llvm.org/D75011	2020-02-26 20:04:11 +01:00
Hiroshi Yamauchi	59fb9cde7a	Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration. Needs ReviewPublic This aims to fix a missed inlining case. If there's a virtual call in the callee on an alloca (stack allocated object) in the caller, and the callee is inlined into the caller, the post-inline cleanup would devirtualize the virtual call, but if the next iteration of DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is based on a heuristic to determine whether to reiterate, we may miss inlining the devirtualized call. This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.	2020-02-26 09:51:24 -08:00
Juneyoung Lee	181628b52d	[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison Summary: Loop unswitch hoists branches on loop-invariant conditions. However, if this condition is poison/undef and the branch wasn't originally reachable, loop unswitch introduces UB (since the optimized code will branch on poison/undef and the original one didn't)). We fix this problem by freezing the condition to ensure we don't introduce UB. We will now transform the following: while (...) { if (C) { A } else { B } } Into: C' = freeze(C) if (C') { while (...) { A } } else { while (...) { B } } This patch fixes the root cause of the following bug reports (which use the old loop unswitch, but can be reproduced with minor changes in the code and -enable-nontrivial-unswitch): - https://llvm.org/bugs/show_bug.cgi?id=27506 - https://llvm.org/bugs/show_bug.cgi?id=31652 Reviewers: reames, majnemer, chenli, sanjoy, hfinkel Reviewed By: reames Subscribers: hiraditya, jvesely, nhaehnle, filcab, regehr, trentxintong, nlopes, llvm-commits, mzolotukhin Tags: #llvm Differential Revision: https://reviews.llvm.org/D29015	2020-02-26 13:47:33 +09:00
Johannes Doerfert	396b725394	[OpenMP][Opt] Combine `struct ident_t` during deduplication If we deduplicate OpenMP runtime calls we have multiple `ident_t` that represent information like source location. So far, we simply kept the one used by the replacement call. However, as exposed by PR44893, that can cause problems if we have stack allocated `ident_t` objects. While we need to revisit the use of these as well, it is clear that we eventually want to merge source location information in some way. With this patch we add the infrastructure to do so but without doing the actual merge. Instead we pick a global `ident_t` from the replaced calls, if possible, or create a new one with an unknown location instead. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D74925	2020-02-25 14:07:14 -08:00
Akira Hatanaka	430512ed7d	[ObjC][ARC] Don't move a retain call living outside a loop into the loop body We started seeing cases where ARC optimizer would move retain calls into loop bodies, causing imbalance in the number of retain and release calls, after changes were made to delete inert ARC calls since the inert calls that used to block code motion are gone. Fix the bug by setting the CFG hazard flag when visiting a loop header. rdar://problem/56908836	2020-02-25 13:00:10 -08:00
Roman Lebedev	400ceda425	[SCEV][IndVars] Always provide insertion point to the SCEVExpander::isHighCostExpansion() Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728 Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73777	2020-02-25 23:05:59 +03:00
Roman Lebedev	44edc6fd2c	[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668) Summary: Replacing uses of IV outside of the loop is likely generally useful, but `rewriteLoopExitValues()` is cautious, and if it isn't told to always perform the replacement, and there are hard uses of IV in loop, it doesn't replace. In [[ https://bugs.llvm.org/show_bug.cgi?id=44668 \| PR44668 ]], that prevents `-indvars` from replacing uses of induction variable after the loop, which might be one of the optimization failures preventing that code from being vectorized. Instead, now that the cost model is fixed, i believe we should be a little bit more optimistic, and also perform replacement if we believe it is within our budget. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44668 \| PR44668 ]]. Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov Reviewed By: mkazantsev Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73501	2020-02-25 23:05:59 +03:00
Roman Lebedev	d6f47aeb51	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model min/max (PR44668) Summary: Previosly we simply always said that `SCEVMinMaxExpr` is too costly to expand. But this isn't really true, it expands into just a comparison+swap pair. And again much like with add/mul, there will be one less such pair than the number of operands. And we need to count the cost of operands themselves. This does change a number of testcases, and as far as i can tell, all of these changes are improvements, in the sense that we fixed up more latches to do the [in]equality comparison. This concludes cost-modelling changes, no other SCEV expressions exist as of now. This is a part of addressing [[ https://bugs.llvm.org/show_bug.cgi?id=44668 \| PR44668 ]]. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73744	2020-02-25 23:05:59 +03:00
Roman Lebedev	756af2f88b	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model add/mul Summary: While this resolves the regression from D73722 in `llvm/test/Transforms/IndVarSimplify/exit_value_test2.ll`, this now regresses `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` test, we no longer can perform that expansion within default budget of `4`, but require budget of `6`. That regression is being addressed by D73777. The basic idea here is simple. ``` Op0, Op1, Op2 ... \| \| \| \--+--/ \| \| \| \---+---/ ``` I.e. given N operands, we will have N-1 operations, so we have to add cost of an add (mul) for every Op processed, except the first one, plus we need to recurse into every Op. I'm guessing there's already canonicalization that ensures we won't have `1` operand in `scMulExpr`, and no `0` in `scAddExpr`/`scMulExpr`. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73728	2020-02-25 23:05:58 +03:00
Roman Lebedev	cc29600b90	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model plain UDiv Summary: If we don't believe this UDiv is actually a LShr in disguise, things are much worse. First, we try to see if this UDiv actually originates from user code, by looking for `S + 1`, and if found considering this UDiv to be free. But otherwise, we always considered this UDiv to be high-cost. However that is no longer the case with TTI-driven cost model: our default budget is 4, which matches the default cost of UDiv, so now we allow a single UDiv to not be counted as high-cost. While that is the case, it is evident this is actually a regression due to the fact that cost-modelling is incomplete - we did not account for the `add`, `mul` costs yet. That is being addressed in D73728. Cost-modelling for UDiv also seems pretty straight-forward: subtract cost of the UDiv itself, and recurse into both the LHS and RHS. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73722	2020-02-25 23:05:58 +03:00
Roman Lebedev	b8abdf9a17	[NFC][IndVarSimplify] Adjust value names in IndVarSimplify/exit_value_test2.ll %tmp prefix confuses auto-update scripts	2020-02-25 23:05:58 +03:00
Sanjay Patel	922558be9e	[PhaseOrdering] add tests for missed CSE; NFC Also add a RUN line for the new pass manager.	2020-02-25 14:30:59 -05:00
Philip Reames	14845b2c45	Revert "[LICM] Support hosting of dynamic allocas out of loops" This reverts commit `8d22100f66`. There was a functional regression reported (https://bugs.llvm.org/show_bug.cgi?id=44996). I'm not actually sure the patch is wrong, but I don't have time to investigate currently, and this line of work isn't something I'm likely to get back to quickly.	2020-02-25 09:05:31 -08:00
Roman Lebedev	2855c8fed9	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): fix miscompile (PR44802) Much like with reassociateShiftAmtsOfTwoSameDirectionShifts(), as input, we have the following pattern: icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0 We want to rewrite that as: icmp eq/ne (and (x shift (Q+K)), y), 0 iff (Q+K) u< bitwidth(x) While we know that originally (Q+K) would not overflow (because 2 * (N-1) u<= iN -1), we may have looked past extensions of shift amounts. so it may now overflow in smaller bitwidth. To ensure that does not happen, we need to ensure that the total maximal shift amount is still representable in that smaller bitwidth. If the overflow would happen, (Q+K) u< bitwidth(x) check would be bogus. https://bugs.llvm.org/show_bug.cgi?id=44802	2020-02-25 18:23:58 +03:00
Roman Lebedev	6f807ca00d	[NFC][InstCombine] Add shift amount reassociation in bittest miscompile example from PR44802 https://bugs.llvm.org/show_bug.cgi?id=44802	2020-02-25 18:23:58 +03:00
Roman Lebedev	781d077afb	[InstCombine] reassociateShiftAmtsOfTwoSameDirectionShifts(): fix miscompile (PR44802) As input, we have the following pattern: Sh0 (Sh1 X, Q), K We want to rewrite that as: Sh x, (Q+K) iff (Q+K) u< bitwidth(x) While we know that originally (Q+K) would not overflow (because 2 * (N-1) u<= iN -1), we may have looked past extensions of shift amounts. so it may now overflow in smaller bitwidth. To ensure that does not happen, we need to ensure that the total maximal shift amount is still representable in that smaller bitwidth. If the overflow would happen, (Q+K) u< bitwidth(x) check would be bogus. https://bugs.llvm.org/show_bug.cgi?id=44802	2020-02-25 18:23:51 +03:00
Roman Lebedev	425ef99938	[NFC][InstCombine] Add shift amount reassociation miscompile example from PR44802 https://bugs.llvm.org/show_bug.cgi?id=44802	2020-02-25 18:23:16 +03:00
Sanjay Patel	f452f7b95a	[PhaseOrdering] add test for missing vector/CSE transforms (PR45015); NFC	2020-02-25 09:13:49 -05:00
Sanjay Patel	e0568ef2c5	[VectorCombine] add tests for possible extract->shuffle; NFC	2020-02-25 08:41:59 -05:00
Sanjay Patel	10ea01d80d	[VectorCombine] make cost calc consistent for binops and cmps Code duplication (subsequently removed by refactoring) allowed a logic discrepancy to creep in here. We were being conservative about creating a vector binop -- but not a vector cmp -- in the case where a vector op has the same estimated cost as the scalar op. We want to be more aggressive here because that can allow other combines based on reduced instruction count/uses. We can reverse the transform in DAGCombiner (potentially with a more accurate cost model) if this causes regressions. AFAIK, this does not conflict with InstCombine. We have a scalarize transform there, but it relies on finding a constant operand or a matching insertelement, so that means it eliminates an extractelement from the sequence (so we won't have 2 extracts by the time we get here if InstCombine succeeds). Differential Revision: https://reviews.llvm.org/D75062	2020-02-25 08:41:59 -05:00
Florian Hahn	b8d638d337	[DSE,MSSA] Do not attempt to remove un-removable memdefs. We have to skip MemoryDefs that cannot be removed. This fixes a crash in the newly added test case and fixes a wrong case in memset-and-memcpy.ll.	2020-02-25 13:31:46 +00:00
Hideto Ueno	2c0edbf19c	[Attributor] Use AssumptionCache in AANonNullFloating::initialize	2020-02-25 13:00:03 +09:00
Simon Pilgrim	b82438872b	[CostModel][X86] We don't need a scale factor for SLM extract costs D74976 will handle larger vector types, but since SLM doesn't support AVX+ then we will always be extracting from 128-bit vectors so don't need to scale the cost.	2020-02-24 14:23:04 +00:00
Florian Hahn	7769030b93	Recommit "[PatternMatch] Match XOR variant of unsigned-add overflow check." This version fixes a buildbot failure cause by picking the wrong insert point for XORs. We cannot pick the XOR binary operator as insert point, as it is not guaranteed that both input operands for the overflow intrinsic are defined before it. This reverts the revert commit `c7fc0e5da6`.	2020-02-23 18:33:18 +00:00
Florian Hahn	af69d5e10e	[DSE] Track overlapping stores. Add a map from BasicBlocks to overlap intervals. For partial writes, we can keep track of those in IOLs. We only add candidates that are valid for eliminations. Reviewers: dmgreen, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D73757	2020-02-23 15:44:40 +00:00
Nuno Lopes	98ac6e7696	[NFC] fix test nan value	2020-02-23 12:42:47 +00:00
Johannes Doerfert	528a6a1d4c	[Attributor][FIX] Disable a test to unblock the builders To unblock the builders this disables a test for which the CHECK lines need to be updated. The patch causing the failure was not reverted because it is needed for a different problem we are investigating. Here we just need to update the CHECK lines which will happen in the meantime.	2020-02-21 14:43:31 -08:00
Krzysztof Parzyszek	d2b7c09e79	[Hexagon] Simplify intrinsic (vandvrt (vandqrt q b) m) -> q if possible When each byte in b&m is non-zero, this conversion Q->V->Q is a no-op.	2020-02-21 13:56:04 -06:00
Simon Pilgrim	2769fb90f0	[LoopVectorize][X86] Regenerate tests. NFCI.	2020-02-21 18:23:55 +00:00
Nikita Popov	b178555318	[InstCombine] Improve simplify demanded bits worklist management This fixes a small mistake from D72944: The worklist add should happen before assigning the new operand, not after. In case an actual replacement happens, the old operand needs to be added for DCE. If no actual replacement happens, then old/new are the same, so it doesn't matter. This drops one iteration from the annotated test case.	2020-02-21 18:51:41 +01:00
Florian Hahn	deb0a8bfc4	[DSE,MSSA] Dbg counters required assertions. Mark test accordingly.	2020-02-21 17:34:34 +00:00
Nikita Popov	a8db806d52	[SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls This changes the SimplifyLibCalls utility to accept an IRBuilderBase, which allows us to pass through the IRBuilder used by InstCombine. This will ensure that new instructions get added to the worklist. The annotated test-case drops from 4 to 2 InstCombine iterations thanks to this. To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard, which is basically the same as the existing InsertPointGuard and FastMathFlagsGuard, but for operand bundles. Also add a setDefaultOperandBundles() method so these can be set outside the constructor. Differential Revision: https://reviews.llvm.org/D74792	2020-02-21 18:26:05 +01:00
Florian Hahn	134bab7cd5	[DSE,MSSA] Add debug counter. Can be used like -debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20 Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72147	2020-02-21 17:04:37 +00:00
Bill Wendling	2fe457690d	Filter callbr insts from critical edge splitting Similarly to how splitting predecessors with an indirectbr isn't handled in the generic way, we also shouldn't split callbrs, for similar reasons.	2020-02-20 16:24:42 -08:00
Florian Hahn	99809f98d7	[SCCP] Do not mark unknown loads as overdefined. For tracked globals that are unknown after solving, we expect all non-store uses to be replaced. This is a follow-up to `f8045b250d`, which removed forcedconstant. We should not mark unknown loads as overdefined, as they either load from an unknown pointer or an undef global. Restore the original logic for loads.	2020-02-20 22:48:58 +01:00

1 2 3 4 5 ...

14542 Commits