llvm-project

Commit Graph

Author	SHA1	Message	Date
Guillaume Chatelet	1507fc1506	[Alignment][NFC] Migrate TTI::isLegalToVectorize{Load,Store}Chain to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82653	2020-06-26 14:14:27 +00:00
serge-sans-paille	44f06db439	Fix pass return status for loop extractor As loop extractor has a dependency on another pass (namely BreakCriticalEdges) that may update the IR, use the getAnalysis version introduced in `55fe7b79bb` to carry that change. Add an assert in getAnalysisID to make sure no other changed status is missed - according to validation this was the only one. Related to https://reviews.llvm.org/D80916 Differential Revision: https://reviews.llvm.org/D81236	2020-06-26 15:49:27 +02:00
Guillaume Chatelet	b66e33a689	[Alignment][NFC] Migrate TTI::getGatherScatterOpCost to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82577	2020-06-26 11:08:27 +00:00
Guillaume Chatelet	fdc7c7fb87	[Alignment][NFC] Migrate TTI::getInterleavedMemoryOpCost to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82573	2020-06-26 11:00:53 +00:00
Guillaume Chatelet	7e1f79c3de	[Alignment][NFC] Migrate TTI::getMaskedMemoryOpCost to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82569	2020-06-26 10:14:16 +00:00
Simon Pilgrim	1b10c618e9	LoopVectorize.h - reduce AliasAnalysis.h include to forward declaration. NFC. Replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-26 10:49:00 +01:00
Simon Pilgrim	70f290d95c	VNCoercion.cpp - remove unused includes. NFC.	2020-06-26 09:58:20 +01:00
Simon Pilgrim	dd3580cc29	AggressiveInstCombineInternal.h - reduce unnecessary includes to forward declarations. NFC.	2020-06-26 09:58:20 +01:00
Michael Liao	dccfaacf93	[InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`. Summary: - `ptrtoint` and `inttoptr` are defined as no-op casts if the integer value has the same size as the pointer value. The pair of `ptrtoint`/`inttoptr` is in fact a no-op cast sequence between different address spaces. Teach `infer-address-spaces` to handle them like a `bitcast`. Reviewers: arsenm, chandlerc Subscribers: jvesely, wdng, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81938	2020-06-25 20:46:56 -04:00
Hiroshi Yamauchi	9878996c70	Revert "[PGO] Extend the value profile buckets for mem op sizes." This reverts commit `63a89693f0`. Due to a build failure like http://lab.llvm.org:8011/builders/sanitizer-windows/builds/65386/steps/annotate/logs/stdio	2020-06-25 11:13:49 -07:00
Hiroshi Yamauchi	63a89693f0	[PGO] Extend the value profile buckets for mem op sizes. Extend the memop value profile buckets to be more flexible (could accommodate a mix of individual values and ranges) and to cover more value ranges (from 11 to 22 buckets). Disabled behind a flag (to be enabled separately) and the existing code to be removed later. Differential Revision: https://reviews.llvm.org/D81682	2020-06-25 10:22:56 -07:00
Yuanfang Chen	c4b1daed1d	[NewPM] Move debugging log printing after PassInstrumentation before-pass-callbacks For passes that got skipped, this is confusing because the log said it is `running pass` but it is skipped later. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D82511	2020-06-25 10:03:25 -07:00
Sanjay Patel	c9e8c9e3ea	[InstCombine] fold fmul/fdiv with fabs operands fabs(X) * fabs(Y) --> fabs(X * Y) fabs(X) / fabs(Y) --> fabs(X / Y) If both operands of fmul/fdiv are positive, then the result must be positive. There's a NAN corner-case that prevents removing the more specific fold just above this one: fabs(X) * fabs(X) -> X * X That fold works even with NAN because the sign-bit result of the multiply is not specified if X is NAN. We can't remove that and use the more general fold that is proposed here because once we convert to this: fabs (X * X) ...it is not legal to simplify the 'fabs' out of that expression when X is NAN. That's because fabs() guarantees that the sign-bit is always cleared - even for NAN values. So this patch has the potential to lose information, but it seems unlikely if we do the more specific fold ahead of this one. Differential Revision: https://reviews.llvm.org/D82277	2020-06-25 11:35:38 -04:00
Simon Pilgrim	8c2082e1dc	GlobalsModRef.h - reduce CallGraph.h include to forward declarations. NFC. Fix implicit include dependencies in source files.	2020-06-25 16:00:43 +01:00
Simon Pilgrim	db69b17409	LoopAccessAnalysis.h - reduce AliasAnalysis.h include to forward declaration. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-25 16:00:42 +01:00
Florian Hahn	4837daf883	[DSE,MSSA] Check if Def is removable only when we try to remove it. Non-removable MemoryDefs can still eliminate other defs. Update the isRemovable checks to apply only to candidates for removal.	2020-06-25 14:01:10 +01:00
Tyker	c95ffadb24	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-06-25 12:59:44 +02:00
Max Kazantsev	1eeb714787	[InstCombine] Combine select & Phi by same condition This patch transforms `p = phi [x, y]; s = select cond, z, p` into `s = phi [x, z]` if we can prove that the Phi node takes its values based on the select's condition. Differential Revision: https://reviews.llvm.org/D82072 Reviewed By: nikic	2020-06-25 10:44:10 +07:00
Roman Lebedev	8911a35180	[SROA] convertValue(): we can have <N x iK*> to <M x iQ> cast Provided test case crashes otherwise. Much like the opposite case.	2020-06-25 00:58:54 +03:00
Roman Lebedev	07a23c06dd	[SROA] convertValue(): we can have <N x iK> to <M x iQ*> cast Provided test case crashes otherwise. If NewTy is already DL.getIntPtrType(NewTy), CreateBitCast() won't actually create any bitcast, so we are better off just doing the general thing.	2020-06-25 00:58:53 +03:00
Roman Lebedev	381054a989	[InstCombine] visitBitCast(): do not crash on weird `bitcast <1 x i8> to i8` Even if we know that RHS of a bitcast is a pointer, we can't assume LHS is, because it might be a single-element vector of pointer.	2020-06-25 00:58:53 +03:00
Christopher Tetreault	3d123e17d8	[SVE] Remove calls to VectorType::getNumElements from IPO Reviewers: efriedma, jdoerfert, sdesmalen, kmclaughlin Reviewed By: efriedma, jdoerfert Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82219	2020-06-24 13:38:51 -07:00
dfukalov	7ddee0922f	[NFCI][CostModel] Add const to Value*. Summary: Get back `const` partially lost in one of recent changes. Additionally specify explicit qualifiers in few places. Reviewers: samparker Reviewed By: samparker Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82383	2020-06-24 23:16:08 +03:00
Florian Hahn	35bb9bfbb0	[SLP] Limit GEP lists based on width of index computation. D68667 introduced a tighter limit to the number of GEPs to simplify together. The limit was based on the vector element size of the pointer, but the pointers themselves are not actually put in vectors. IIUC we try to vectorize the index computations here, so we should base the limit on the vector element size of the computation of the index. This restores the test regression on AArch64 and also restores the vectorization for an important pattern in SPEC2006/464.h264ref on AArch64 (@test_i16_extend). We get a large benefit from doing a single load up front and then processing the index computations in vectors. Note that we could probably even further improve the AArch64 codegen, if we would do zexts to i32 instead of i64 for the sub operands and then do a single vector sext on the result of the subtractions. AArch64 provides dedicated vector instructions to do so. Sketch of proof in Alive: https://alive2.llvm.org/ce/z/A4xYAB Reviewers: craig.topper, RKSimon, xbolva00, ABataev, spatel Reviewed By: ABataev, spatel Differential Revision: https://reviews.llvm.org/D82418	2020-06-24 19:56:53 +01:00
Simon Pilgrim	6c6adde84f	InstCombineInternal.h - reduce AliasAnalysis.h include to forward declaration. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-24 19:27:38 +01:00
Simon Pilgrim	a53dddb3e9	Local.h - reduce includes to forward declarations. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-24 19:27:37 +01:00
Teresa Johnson	d291bd510e	[WPD] Allow virtual calls to be analyzed with multiple type tests Summary: In D52514 I had fixed a bug with WPD after indirect call promotion, by checking that a type test being analyzed dominates potential virtual calls. With that fix I included a small efficiency enhancement to avoid processing a devirt candidate multiple times (when there are multiple type tests). This latter change wasn't in response to any measured efficiency issues, it was merely theoretical. Unfortunately, it turns out to limit optimization opportunities after inlining. Specifically, consider code that looks like: class A { virtual void foo(); }; class B : public A { void foo(); } void callee(A a) { a->foo(); // Call 1 } void caller(B b) { b->foo(); // Call 2 callee(b); } After inlining callee into caller, because of the existing call to b->foo() in caller there will be 2 type tests in caller for the vtable pointer of b: the original type test against B from Call 2, and the inlined type test against A from Call 1. If the code was compiled with -fstrict-vtable-pointers, then after optimization WPD will see that both type tests are associated with the inlined virtual Call 1. With my earlier change to only process a virtual call against one type test, we may only consider virtual Call 1 against the base class A type test, which can't be devirtualized. With my change here to remove this restriction, it also gets considered for the type test against the derived class B type test, where it can be devirtualized. Note that if caller didn't include its own earlier virtual call b->foo() we will not be able to devirtualize after inlining callee even after this fix, since there would not be a type test against B in the IR. As a future enhancement we can consider inserting type tests at call sites that pass pointers to classes with virtual calls, to enable context-sensitive devirtualization after inlining. 
Reviewers: pcc, vitalybuka, evgeny777 Subscribers: Prazek, hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79235	2020-06-24 10:51:24 -07:00
Simon Pilgrim	c18b753686	LoopUtils.h - reduce AliasAnalysis.h include to forward declarations. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-24 17:58:38 +01:00
Sanjay Patel	a0f967418f	[VectorCombine] give invalid index value a name; NFC	2020-06-24 11:10:36 -04:00
Florian Hahn	4e62c6359c	[DSE] Eliminate stores at the end of the function. This patch adds support for eliminating MemoryDefs that do not have any aliasing users, which indicates that there are no reads/writes to the memory location until the end of the function. To eliminate such defs, we have to ensure that the underlying object is not visible in the caller and does not escape via returning. We need a separate check for that, as InvisibleToCaller does not consider returns. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker, george.burgess.iv Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72631	2020-06-24 12:58:20 +01:00
sstefan1	0f426935bb	[OpenMPOpt] ICV macro definitions Summary: This defines some basic information about ICVs in `OMPKinds.def`. We also emit remarks with initial values for each function (which are default for now) as a way to test this. Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6 Subscribers: yaxunl, hiraditya, guansong, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82193	2020-06-24 13:43:35 +02:00
Simon Pilgrim	90ad37646f	ObjCARC.h - remove unnecessary includes. NFC. Add implicit InstIterator.h dependency in ObjCARCContract.cpp	2020-06-24 12:30:59 +01:00
Vedant Kumar	f8bd6a75ed	[SimplifyCFG] Drop debug loc in SpeculativelyExecuteBB Summary: According to HowToUpdateDebugInfo.rst: ``` Preserving the debug locations of speculated instructions can make it seem like a condition is true when it's not (or vice versa), which leads to a confusing single-stepping experience ``` This patch follows the recommendation to drop debug locations on speculated instructions. Reviewers: aprantl, davide Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82420	2020-06-23 18:25:52 -07:00
Zequan Wu	6a822e20ce	[ASan][MSan] Remove EmptyAsm and set the CallInst to nomerge to avoid from merging. Summary: `nomerge` attribute was added in D78659. So, we can remove the EmptyAsm workaround in ASan and MSan and use this attribute. Reviewers: vitalybuka Reviewed By: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82322	2020-06-23 14:22:53 -07:00
Ryan Santhiraraja	f64dc4e686	Preserve GlobalsAA analysis result in InjectTLIMappings InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue. Patch by: Ryan Santhiraraja <rsanthir@quicinc.com> Reviewers: fpetrogalli, joerg, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D82343	2020-06-23 22:05:42 +01:00
Florian Hahn	ff4de8683a	[DSE,MSSA] Treat `store 0` after calloc as noop stores. This patch extends storeIsNoop to also detect stores of 0 to a calloc'ed object. This basically ports the logic from legacy DSE to the MemorySSA backed version. It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3 LTO: Same hash: 218 (filtered out) Remaining: 19 Metric: dse.NumNoopStores Program base patch2 diff test-suite...CFP2000/177.mesa/177.mesa.test 1.00 15.00 1400.0% test-suite...6/482.sphinx3/482.sphinx3.test 1.00 14.00 1300.0% test-suite...lications/ClamAV/clamscan.test 2.00 28.00 1300.0% test-suite...CFP2006/433.milc/433.milc.test 1.00 8.00 700.0% test-suite...pplications/oggenc/oggenc.test 2.00 9.00 350.0% test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0% test-suite.../CINT2006/403.gcc/403.gcc.test NaN 137.00 nan% test-suite...libquantum/462.libquantum.test NaN 3.00 nan% test-suite...6/464.h264ref/464.h264ref.test NaN 7.00 nan% test-suite...decode/alacconvert-decode.test NaN 2.00 nan% test-suite...encode/alacconvert-encode.test NaN 2.00 nan% test-suite...ications/JM/ldecod/ldecod.test NaN 9.00 nan% test-suite...ications/JM/lencod/lencod.test NaN 39.00 nan% test-suite.../Applications/lemon/lemon.test NaN 2.00 nan% test-suite...pplications/treecc/treecc.test NaN 4.00 nan% test-suite...hmarks/McCat/08-main/main.test NaN 4.00 nan% test-suite...nsumer-lame/consumer-lame.test NaN 3.00 nan% test-suite.../Prolangs-C/bison/mybison.test NaN 1.00 nan% test-suite...arks/mafft/pairlocalalign.test NaN 30.00 nan% Reviewers: efriedma, zoecarver, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D82204	2020-06-23 21:01:39 +01:00
Simon Pilgrim	36bc10e74a	[Transforms] Ensure we include CommandLine.h if we declare any cl::opt flags	2020-06-23 12:11:51 +01:00
Roman Lebedev	d57e9aca01	[IndVarSimplify] Don't replace IV user with unsafe loop-invariant (PR45360) Summary: As PR45360 (https://bugs.llvm.org/show_bug.cgi?id=45360) reports, with new cost-model we can sometimes end up being able to expand `udiv`/`urem` instructions. And that exposes at least one instance of when we do that regardless of whether or not it is safe to do. In this particular case, it's `SimplifyIndvar::replaceIVUserWithLoopInvariant()`. It seems to me, we simply need to check with `isSafeToExpandAt()` first. The test isn't great. I'm not sure how to make it only run `-indvars`. Fixes PR45360 (https://bugs.llvm.org/show_bug.cgi?id=45360). Reviewers: mkazantsev, reames, helloqirun Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82108	2020-06-23 13:53:15 +03:00
Florian Hahn	a822ec75cc	[DSE,MSSA] Treat passed by value args as invisible to caller. This updates the MemorySSA backed implementation to treat arguments passed by value similarly to allocas: they are assumed to be invisible in the caller. This is similar to how they are treated in legacy DSE. Reviewers: efriedma, asbirlea, george.burgess.iv Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82222	2020-06-23 08:58:51 +01:00
Michael Liao	f95850ce9c	[SROA] Teach SROA to perform no-op pointer conversion. Summary: - When promoting a pointer from memory to register, SROA skips pointers from different address spaces. However, as `ptrtoint` and `inttoptr` are defined as no-op casts if that integer type has the same size as the pointer value, generate the pair of `ptrtoint`/`inttoptr` (no-op cast) sequence to convert pointers from different address spaces if they have the same size. Reviewers: arsenm, chandlerc, lebedev.ri Subscribers: Differential Revision: https://reviews.llvm.org/D81943	2020-06-23 01:49:27 -04:00
Max Kazantsev	9bff376e5c	[InstCombine] Replace selects with Phis We can sometimes replace a select with a Phi node if all of its values are available on respective incoming edges. Differential Revision: https://reviews.llvm.org/D82005 Reviewed By: nikic	2020-06-23 12:12:59 +07:00
Sanjay Patel	8953ecf22b	[InstCombine] reassociate diff of sums into sum of diffs This is the integer sibling to D81491. (a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] +b[3]) --> (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3]) Removing the "experimental" from these intrinsics is likely not too far away.	2020-06-22 20:47:09 -04:00
Sanjay Patel	54143e2bd5	[VectorCombine] do not use magic number for undef mask element; NFC	2020-06-22 20:47:09 -04:00
Arthur Eubanks	d335c1317b	Fix dynamic alloca detection in CloneBasicBlock Summary: Simply check AI->isStaticAlloca instead of reimplementing checks for static/dynamic allocas. Reviewers: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82328	2020-06-22 15:06:28 -07:00
Lei Zhang	315bd96437	Use std::make_tuple instead initializer list Hopefully this pleases GCC-5 and fixes the build error: LowerExpectIntrinsic.cpp:62:53: error: converting to 'std::tuple<unsigned int, unsigned int>' from initializer list would use explicit constructor 'constexpr std::tuple<_T1, _T2>::tuple(_U1&&, _U2&&) [with _U1 = llvm::cl::opt<unsigned int>&; _U2 = llvm::cl::opt<unsigned int>&; <template-parameter-2-3> = void; _T1 = unsigned int; _T2 = unsigned int]' return {LikelyBranchWeight, UnlikelyBranchWeight}; Differential Revision: https://reviews.llvm.org/D82325	2020-06-22 15:43:40 -04:00
Zhi Zhuang	37fb860301	Add support of __builtin_expect_with_probability Add a new builtin-function __builtin_expect_with_probability and intrinsic llvm.expect.with.probability. The interface is __builtin_expect_with_probability(long expr, long expected, double probability). It is mainly the same as __builtin_expect besides one more argument indicating the probability of expression equal to expected value. The probability should be a constant floating-point expression and be in range [0.0, 1.0] inclusive. It is similar to builtin-expect-with-probability function in GCC built-in functions. Differential Revision: https://reviews.llvm.org/D79830	2020-06-22 10:21:28 -07:00
Hiroshi Yamauchi	9e1decf743	[PGO][PGSO] Enable non-cold size opts under partial profile sample PGO. Summary: Similar to D81020. Follow up D78949. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82053	2020-06-22 10:12:48 -07:00
Sanjay Patel	9934cc544c	[VectorCombine] make helper function for shift-shuffle; NFC This will probably be useful for other extract patterns.	2020-06-22 12:23:52 -04:00
Florian Hahn	328c8642e2	[DSE,MSSA] Reorder DSE blocking checks. Currently we stop exploring candidates too early in some cases. In particular, we can continue checking the defining accesses of non-removable MemoryDefs and defs without analyzable write location (read clobbers are already ruled out using MemorySSA at this point).	2020-06-22 17:16:34 +01:00
Sanjay Patel	98c2f4eea5	[VectorCombine] add helper to replace uses and rename The tests are regenerated to show a path that missed renaming, but there should be no functional difference from this patch.	2020-06-22 09:58:49 -04:00
Sanjay Patel	de65b356dc	[VectorCombine] add/use pass-level IRBuilder This saves creating/destroying a builder every time we perform some transform. The tests show instruction ordering diffs resulting from always inserting at the root instruction now, but those should be benign.	2020-06-22 09:01:29 -04:00
Sanjay Patel	cce625f73d	[VectorCombine] improve IR debugging by providing/salvaging value names The tests are regenerated to show the diffs, but there should be no functional change from this patch.	2020-06-22 08:35:47 -04:00
Serguei Katkov	eae0d2e9b2	Revert "[Peeling] Extend the scope of peeling a bit" This reverts commit `29b2c1ca72`. The patch causes the DT verifier failure like: DominatorTree is different than a freshly computed one! Not sure the patch itself is wrong but revert to investigate the failure.	2020-06-22 17:48:29 +07:00
Florian Hahn	0e19ff02d8	[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).	2020-06-22 10:58:53 +01:00
Serguei Katkov	29b2c1ca72	[Peeling] Extend the scope of peeling a bit Currently we allow peeling of the loops if there is an exiting latch block and all other exits are blocks ending with deopt. Actually we want that exit would end up with deopt unconditionally but it is not required that exit itself ends with deopt. Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev Reviewed By: apilipenko Subscribers: hiraditya, zzheng, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D81140	2020-06-22 12:17:44 +07:00
Sanjay Patel	6bdd531af5	[VectorCombine] create class for pass to hold analyses, etc; NFC This doesn't change anything currently, but it would make sense to create a class-level IRBuilder instead of recreating that everywhere. As we expand to more optimizations, we will probably also want to hold things like the DataLayout or other constant refs in here too.	2020-06-21 16:07:33 -04:00
Florian Hahn	40569db7b3	[DSE,MSSA] Move reachability check to main loop. As we traverse the CFG backwards, we could end up reaching unreachable blocks. For unreachable blocks, we won't have computed post order numbers and because DomAccess is reachable, unreachable blocks cannot be on any path from it. This fixes a crash with unreachable blocks.	2020-06-21 16:38:10 +01:00
clfbbn	10b0539772	[Attributor][NFC] Fix indentation Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, kuter, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82260	2020-06-21 15:43:32 +08:00
Wenlei He	7c8a6936bf	[Remarks] Add callsite locations to inline remarks Summary: Add call site location info into inline remarks so we can differentiate inline sites. This can be useful for inliner tuning. We can also reconstruct full hierarchical inline tree from parsing such remarks. The message of inline remark is also tweaked so we can differentiate SampleProfileLoader inline from CGSCC inline. Reviewers: wmi, davidxl, hoy Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82213	2020-06-20 23:32:10 -07:00
Eric Christopher	dc20419351	Rename function to more accurately reflect what it does.	2020-06-20 14:37:29 -07:00
Sanjay Patel	741e20f3d6	[VectorCombine] fix assert for type of compare operand As shown in the post-commit comment for D81661 - we need to loosen the type assertion to allow scalarization of a compare for vectors of pointers.	2020-06-20 15:20:17 -04:00
Sanjay Patel	7b201bfcac	[InstCombine] remove unused parameter and add assert; NFC	2020-06-20 11:47:00 -04:00
Sanjay Patel	d84cdb81ed	[InstCombine] fabs(X) / fabs(X) -> X / X Also, consolidate related folds so we don't miss/repeat these.	2020-06-20 10:20:21 -04:00
Eric Christopher	10563e16aa	[Analysis/Transforms/Sanitizers] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:42:26 -07:00
Eric Christopher	858d385578	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:24:57 -07:00
Fangrui Song	2a4317bfb3	[SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D82244	2020-06-19 22:22:47 -07:00
Yevgeny Rouban	6429471e8b	[IR] Convert profile metadata in createCallMatchingInvoke() When an invoke instruction is converted to a call its profile metadata is dropped because it has incompatible format (see commit `16ad6eeb94`). This patch adds an attempt to convert profile data to format of the call instruction. This used to work well before the commit `dcfa78a4cc`. Reviewers: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D82071	2020-06-20 12:10:31 +07:00
Eric Christopher	b6536e549d	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-19 15:12:18 -07:00
Sanjay Patel	216a37bb46	[VectorCombine] refactor extract-extract logic; NFCI	2020-06-19 14:52:27 -04:00
Sanjay Patel	6d864097a2	[VectorCombine] fix crash while transforming constants This is a variation of the proposal in D82049 with an extra test.	2020-06-19 12:30:32 -04:00
Florian Hahn	f9d8e33c32	[SCCP] Turn sext into zext for non-negative ranges. This patch updates SCCP/IPSCCP to use the computed range info to turn sexts into zexts, if the value is known to be non-negative. We already do a similar transform in CorrelatedValuePropagation, but it seems like we can catch a lot of additional cases by doing it in SCCP/IPSCCP as well. The transform is limited to ranges that are known to not include undef. Currently constant ranges from conditions are treated as potentially containing undef, due to PR46144. Once we flip this, the transform will be more effective in practice. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81756	2020-06-19 10:17:55 +01:00
Tyker	b7338fb1a6	[AssumeBundles] add canonicalisation to the assume builder Summary: This significantly reduces the number of assumes generated without affecting too much the information that is preserved. This significantly improves the compile-time cost of enable-knowledge-retention. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-19 10:32:26 +02:00
Matt Arsenault	b13f6b0fe0	BypassSlowDivision: Fix dropping debug info I don't know anything about debug info, but this seems like more work should be necessary. This constructs a new IRBuilder and reconstructs the original divides rather than moving the original. One problem this has is if a div/rem pair are handled, both end up with the same debugloc. I'm not sure how to fix this, since this uses a cache when it sees the same input operands again, which will have the first instance's location attached.	2020-06-18 17:27:19 -04:00
Christopher Tetreault	8d11ec66b6	[SVE] Remove calls to VectorType::getNumElements from Transforms/Utils Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82057	2020-06-18 13:39:14 -07:00
Sanjay Patel	46a285ad9e	[IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.	2020-06-18 15:47:06 -04:00
Davide Italiano	8cdd2a158c	[SimplifyCFG] Update debug location when folding branch to common destination Sometimes a dead block gets folded and the debug information is still retained. This manifests as jumpy stepping in lldb, see the bugzilla PR for an end-to-end C testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=46008 Differential Revision: https://reviews.llvm.org/D82062	2020-06-18 12:33:32 -07:00
serge-sans-paille	4dd332723d	Fix return status of LoopDistribute Move code that may update the IR after precondition, so that if precondition fail, the IR isn't modified. Differential Revision: https://reviews.llvm.org/D81225	2020-06-18 20:13:18 +02:00
Arthur Eubanks	91ef930526	[GlobalOpt] Remove preallocated calls when possible When possible (e.g. internal linkage), strip preallocated attribute off parameters/arguments. This requires removing the "preallocated" operand bundle from the call site, replacing @llvm.call.preallocated.arg() with an alloca and a bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since @llvm.call.preallocated.arg() can be called multiple times with the same arg index, we create an alloca per arg index. We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was and a @llvm.stackrestore() after the preallocated call to prevent the stack from blowing up. This is valid because the argument would normally not exist on the stack after the call before the transformation. This does not currently handle all possible preallocated calls. We will need to figure out where to put @llvm.stackrestore() in the cases where there is no obvious place to put it, for example conditional preallocated calls, invokes. This sort of transformation may need to be moved to somewhere more accessible to accommodate similar transformations (like inlining) in the future. Reviewers: efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80951	2020-06-18 09:56:13 -07:00
Florian Hahn	1669fddc9f	[Matrix] Use alignment info when lowering loads/stores. This patch updates LowerMatrixIntrinsics to preserve the alignment specified at the original load/stores and the align attribute for the pointer argument of the column.major.load/store intrinsics. We can always use the specified alignment for the load of the first column. For subsequent columns, the alignment may need to be reduced. For ConstantInt strides, compute the offset for the start of the column in bytes and use commonAlignment to get the largest valid alignment. For non-ConstantInt strides, we need to take the common alignment of the initial alignment and the element size in bytes. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D81960	2020-06-18 13:19:31 +01:00
Florian Hahn	d88acd8f7d	[Matrix] Preserve volatile when loading loads/stores. Currently the matrix lowering turns volatile loads/stores into non-volatile ones. This patch updates the lowering to preserve the volatile bit. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D81498	2020-06-18 12:14:19 +01:00
Florian Hahn	6d18c2067e	[Matrix] Update load/store intrinsics. This patch adjusts the load/store matrix intrinsics, formerly known as llvm.matrix.columnwise.load/store, to improve the naming and allow passing of extra information (volatile). The patch performs the following changes: * Rename columnwise.load/store to column.major.load/store. This is more expressive and also more in line with the naming in Clang. * Changes the stride arguments from i32 to i64. The stride can be larger than i32 and this makes things more uniform with the way things are handled in Clang. * A new boolean argument is added to indicate whether the load/store is volatile. The lowering respects that when emitting vector load/store instructions. * MatrixBuilder is updated to require both Alignment and IsVolatile arguments, which are passed through to the generated intrinsic. The alignment is set using the `align` attribute. The changes are grouped together in a single patch, to have a single commit that breaks the compatibility. We probably should be fine with updating the intrinsics, as we did not yet officially support them in the last stable release. If there are any concerns, we can add auto-upgrade rules for the columnwise intrinsics though. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse Reviewed By: anemet, nicolasvasilache Differential Revision: https://reviews.llvm.org/D81472	2020-06-18 09:44:52 +01:00
serge-sans-paille	f9c7e3136e	Correctly report modified status for HWAddressSanitizer Differential Revision: https://reviews.llvm.org/D81238	2020-06-18 10:27:44 +02:00
Mehdi Amini	77b79d79c0	Remove "unused" member ModuleSlice from `struct OpenMPOpt` This is fixing warning from clang: warning: private field 'ModuleSlice' is not used [-Wunused-private-field] SmallPtrSetImpl<Function *> &ModuleSlice; ^ Differential Revision: https://reviews.llvm.org/D82027	2020-06-18 03:02:26 +00:00
Eric Christopher	a8dad30388	Revert "Remove unused class variable ModuleSlice." as it was used in debug only code. This reverts commit `07a1749081`.	2020-06-17 14:45:17 -07:00
Eric Christopher	07a1749081	Remove unused class variable ModuleSlice.	2020-06-17 14:33:29 -07:00
Roman Lebedev	84b4f5a6a6	[InstCombine] Negator: while there, add detection for cycles during negation I don't have any testcases showing it happening, and i haven't succeeded in creating one, but i'm also not positive it can't ever happen, and i recall having something that looked like that in the very beginning of Negator creation. But since we now already have a negation cache, we can now detect such cases practically for free. Let's do so instead of "relying" on stack overflow :D	2020-06-17 22:47:20 +03:00
Roman Lebedev	e3d8cb1e1d	[InstCombine] Negator: cache negation results (PR46362) It is possible that we can try to negate the same value multiple times. For example, PHI nodes may happen to have multiple incoming values (all of which must be the same value) for the same incoming basic block. It may happen that we try to negate such a PHI node, and succeed, and that might result in having now-different incoming values. To avoid that, and in general to reduce the amount of duplicated work we might be doing, let's introduce a cache where we'll track results of negating each value. The added test was previously failing -verify after -instcombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=46362	2020-06-17 22:47:20 +03:00
Roman Lebedev	c4166f3d84	[NFC][InstCombine] Negator: add thin negate() wrapped before visit()	2020-06-17 22:47:20 +03:00
Roman Lebedev	2b85147337	[NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header	2020-06-17 22:47:19 +03:00
Nick Desaulniers	88c965ba14	BreakCriticalEdges for callbr indirect dests Summary: llvm::SplitEdge was failing an assertion that the BasicBlock only had one successor (for BasicBlocks terminated by CallBrInst, we typically have multiple successors). It was surprising that the earlier call to SplitCriticalEdge did not handle the critical edge (there was an early return). Removing that triggered another assertion relating to creating a BlockAddress for a BasicBlock that did not (yet) have a parent, which is a simple order of operations issue in llvm::SplitCriticalEdge (a freshly constructed BasicBlock must be inserted into a Function's basic block list to have a parent). Thanks to @nathanchance for the report. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018 Reviewers: craig.topper, jyknight, void, fhahn, efriedma Reviewed By: efriedma Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D81607	2020-06-17 11:45:06 -07:00
sstefan1	7cfd267c51	[OpenMPOPT][NFC] Introducing OMPInformationCache. Summary: Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them. Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81798	2020-06-17 16:56:45 +02:00
Simon Pilgrim	a5f1f9c9b8	ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC. Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency. Add implicit header dependency to source files where necessary.	2020-06-17 15:48:23 +01:00
Sjoerd Meijer	c1034d044a	Follow up of rGe345d547a0d5, and attempt to pacify buildbot: "error: 'get' is deprecated: The base class version of get with the scalable argument defaulted to false is deprecated." Changed VectorType::get() -> FixedVectorType::get().	2020-06-17 13:24:09 +01:00
Sjoerd Meijer	e345d547a0	Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops" Fixed ARM regression test. Please see the original commit message rG47650451738c for details.	2020-06-17 13:12:15 +01:00
David Green	076e08aa45	[LSR] Filter for postinc formulae In more complicated loops we can easily hit the complexity limits of loop strength reduction. If we do and filtering occurs, it's all too easy to remove the wrong formulae for post-inc preferring accesses due to it attempting to maximise register re-use. The patch adds an alternative filtering step when the target is preferring postinc to pick postinc formulae instead, hopefully lowering the complexity to below the limit so that aggressive filtering is not needed. There is also a change in here to stop considering existing addrecs as free under postinc. We should already be modelling them as a reg so don't want it to cause us to get the cost wrong. (I'm not sure that code makes sense in general, but there are X86 tests specifically for it where it seems to be helping so have left it around for the standard non-post-inc case). Differential Revision: https://reviews.llvm.org/D80273	2020-06-17 12:32:04 +01:00
Sam Parker	5bf0858c0b	Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant" I originally reverted the patch because it was causing performance issues, but now I think it's just enabling simplify-cfg to do something that I don't want instead :) Sorry for the noise. This reverts commit `3e39760f8e`.	2020-06-17 11:38:59 +01:00
Hans Wennborg	16ad6eeb94	[IR] Don't copy profile metadata in createCallMatchingInvoke() The invoke instruction can have profile metadata with branch_weights, which does not make sense for a call instruction and will be rejected by the verifier. Differential revision: https://reviews.llvm.org/D81996	2020-06-17 11:18:23 +02:00
serge-sans-paille	1cafd8a5d1	Fix LoopIdiomRecognize pass return status Introduce a helper class to aggregate the cleanup in case of rollback. Differential Revision: https://reviews.llvm.org/D81230	2020-06-17 11:12:03 +02:00
Sjoerd Meijer	d4e183f686	Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops" This reverts commit `4765045173` while I investigate the build bot failures.	2020-06-17 10:09:54 +01:00
Florian Hahn	773353be4e	[SCCP] Move common code to simplify basic block to helper (NFC). Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81755	2020-06-17 10:03:43 +01:00
Sjoerd Meijer	4765045173	[LV] Emit @llvm.get.active.mask for tail-folded loops This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised loops if the intrinsic is supported by the backend, which is checked by querying TargetTransform hook emitGetActiveLaneMask. This intrinsic creates a mask representing active and inactive vector lanes, which is used by the masked load/store instructions that are created for tail-folded loops. The semantics of @llvm.get.active.mask are described here in LangRef: https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics This intrinsic is also used to provide a hint to the backend. That is, the second argument of the intrinsic represents the back-edge taken count of the loop. For MVE, for example, we use that to set up tail-predication, which is a new form of predication in MVE for vector loops that implicitly predicates the last vector loop iteration by implicitly setting active/inactive lanes, i.e. the tail loop is predicated. In order to set up a tail-predicated vector loop, we need to know the number of data elements processed by the vector loop, which corresponds to the tripcount of the scalar loop, which we can now reconstruct using @llvm.get.active.mask. Differential Revision: https://reviews.llvm.org/D79100	2020-06-17 09:53:58 +01:00
Christopher Tetreault	ff628f5f5e	[SVE] Eliminate calls to default-false VectorType::get() from Vectorize Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81521	2020-06-16 12:50:13 -07:00
Sanjay Patel	ed67f5e7ab	[VectorCombine] scalarize compares with insertelement operand(s) Generalize scalarization (recently enhanced with D80885) to allow compares as well as binops. Similar to binops, we are avoiding scalarization of a loaded value because that could avoid a register transfer in codegen. This requires 1 extra predicate that I am aware of: we do not want to scalarize the condition value of a vector select. That might also invert a transform that we do in instcombine that prefers a vector condition operand for a vector select. I think this is the final step in solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 Differential Revision: https://reviews.llvm.org/D81661	2020-06-16 13:48:10 -04:00
Tyker	d7deef1206	Revert "[AssumeBundles] add cannonicalisation to the assume builder" This reverts commit `90c50cad19`.	2020-06-16 14:34:55 +02:00
Tyker	90c50cad19	[AssumeBundles] add cannonicalisation to the assume builder Summary: this reduces significantly the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-16 13:12:35 +02:00
sstefan1	e099c7b64a	[NFC][OpenMPOpt] Provide function-specific foreachUse.	2020-06-16 12:33:15 +02:00
Jay Foad	6fdd5a28b7	Revert "[IR] Clean up dead instructions after simplifying a conditional branch" This reverts commit `69bdfb075b`. Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343	2020-06-16 10:32:15 +01:00
Gui Andrade	b0ffa8befe	[MSAN] Pass Origin by parameter to __msan_warning functions Summary: Normally, the Origin is passed over TLS, which seems like it introduces unnecessary overhead. It's in the (extremely) cold path though, so the only overhead is in code size. But with eager-checks, calls to __msan_warning functions are extremely common, so this becomes a useful optimization. This can save ~5% code size. Reviewers: eugenis, vitalybuka Reviewed By: eugenis, vitalybuka Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D81700	2020-06-15 17:49:18 -07:00
Florian Hahn	120c059292	[DSE,MSSA] Port partial store merging. Port partial constant store merging logic to MemorySSA backed DSE. The heavy lifting is done by the existing helper function. It is used in context where we already ensured that the later instruction can eliminate the earlier one, if it is a complete overwrite.	2020-06-15 18:41:46 +01:00
Florian Hahn	71a91b9837	[DSE] Hoist partial store merging code into function (NFC). Hoist the general logic into a new function, because it can be re-used by the MemorySSA backed DSE as well.	2020-06-15 17:44:24 +01:00
Florian Hahn	8c61f13a0f	[DSE,MSSA] Delete instructions after printing it. Also enables a now-passing test case, that exposed a crash caused by the wrong order.	2020-06-15 16:01:36 +01:00
Sam Parker	2596da3174	[CostModel] getCFInstrCost in getUserCost. Have BasicTTI call the base implementation so that both agree on the default behaviour, with the default being a cost of '1'. This has required an X86 specific implementation as it seems to be very reliant on those instructions being free. Changes are also made to AMDGPU so that their implementations distinguish between cost kinds, so that the unrolling isn't affected. PowerPC also has its own implementation to prevent changes to the reg-usage vectorizer test. The cost model test changes now reflect that ret instructions are not generally free. Differential Revision: https://reviews.llvm.org/D79164	2020-06-15 09:28:46 +01:00
Max Kazantsev	60da4369a1	[NFC] Bail early simplifying unconditional branches	2020-06-15 13:59:53 +07:00
Sam Parker	3e39760f8e	Revert "Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"" This reverts commit `23291b9863`. This caused performance regressions.	2020-06-15 07:46:28 +01:00
Whitney Tsang	5225cd43e8	[LoopUnroll] Allow loops with multiple exiting blocks where loop latch is not necessary one of them. Summary: Currently LoopUnrollPass already allows loops with multiple exiting blocks, but only when the loop latch is one of the exiting blocks. When the loop latch is not an exiting block, only a single exiting block is supported. When possible, the single loop latch or the single exiting block terminator is optimized to an unconditional branch in the unrolled loop. This patch allows loops with multiple exiting blocks even if the loop latch is not one of them. However, the optimization of exiting block terminator to unconditional branch is not done when there exists more than one exiting block. Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour Reviewed By: efriedma Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D81053	2020-06-14 18:44:18 +00:00
Sanjay Patel	098e48a6a1	[PassManager] restore early-cse to vector cleanup As noted in D80236 - the early-cse pass was included here before: D75145 / rG71a316883d50 But it got moved outside of the "extra" option there, then it got dropped while adjusting -vector-combine: rG6438ea45e053 rG57bb4787d72f So this is restoring the behavior and adding a test to prevent accidental changes again. I don't see an equivalent option for the new pass manager.	2020-06-14 10:04:53 -04:00
Sanjay Patel	b5fb26951a	[InstCombine] reassociate FP diff of sums into sum of diffs (a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] +b[3]) --> (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3]) This should be the last step in solving PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953 We started emitting reduction intrinsics with: D80867/ rGe50059f6b6b3 So it's a relatively easy pattern match now to re-order those ops. Also, I have not seen any complaints for the switch to intrinsics yet, so I'll propose to remove the "experimental" tag from the intrinsics soon. Differential Revision: https://reviews.llvm.org/D81491	2020-06-14 09:09:03 -04:00
Sanjay Patel	aeb5044801	[InstCombine] allow undef elements when comparing vector constants for min/max bailout This is a hacky, but low-risk fix to avoid the infinite loop in PR46271: https://bugs.llvm.org/show_bug.cgi?id=46271 As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict with a transform that wants to pull a 'not' op through min/max via SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include undefined elements in vector constants to avoid that. Alternatively, we could improve or cripple the demanded elements analysis, but that could create even more problems. The likely better, safer alternative will be to create min/max intrinsics, so we can remove all of the hacks related to min/max matching in instcombine. Differential Revision: https://reviews.llvm.org/D81698	2020-06-14 09:02:47 -04:00
Roman Lebedev	e987ee6318	[NFCI][AggressiveInstCombiner] Add `STATISTIC()`s for transforms	2020-06-13 23:53:16 +03:00
Florian Hahn	97e7147e34	[DSE,MSSA] Fix location order in isOverwrite call. isOverwrite expects the later location as first argument and the earlier result later. The adjusted call is intended to check whether CC overwrites DefLoc.	2020-06-13 20:39:00 +01:00
Eric Christopher	b422fe7d62	Temporarily revert "[MemCpyOptimizer] Simplify API of processStore and processMem* functions" as it seems to be causing some internal crashes in AA after email with the author. This reverts commit `f79e6a8847`.	2020-06-12 14:01:27 -07:00
Roman Lebedev	7aeb41b3c8	[NFCI] VectorCombine: add statistic for bitcast(shuf()) -> shuf(bitcast()) xform	2020-06-12 23:10:53 +03:00
Roman Lebedev	55eb714a0e	[NFC] OpenMPOpt: add a statistic for num of parallel regions deleted	2020-06-12 23:10:53 +03:00
Marco Elver	8af7fa07aa	[ASan][NFC] Refactor redzone size calculation Refactor redzone size calculation. This will simplify changing the redzone size calculation in future. Note that AddressSanitizer.cpp violates the latest LLVM style guide in various ways due to capitalized function names. Only code related to the change here was changed to adhere to the style guide. No functional change intended. Reviewed By: andreyknvl Tags: #llvm Differential Revision: https://reviews.llvm.org/D81367	2020-06-12 15:33:00 +02:00
Florian Hahn	4495a6b141	[BreakCritEdges] Add option to opt-out of preserving loop-simplify. This patch adds a new option to CriticalEdgeSplittingOptions to control whether loop-simplify form must be preserved. It is then used by GVN to indicate that loop-simplify form does not have to be preserved. This fixes a crash exposed by `189efe295b`. If the critical edge we are splitting goes from a block inside a loop to a block outside the loop, splitting the edge will create a new exit block. As a result, the new block will branch to the original exit block, which will add a non-loop predecessor, breaking loop-simplify form. To preserve loop-simplify form, the predecessor blocks of the original exit are split, but that does not work for blocks with indirectbr terminators. If preserving loop-simplify form is requested, bail out before making any changes. Reviewers: reames, hfinkel, davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81582	2020-06-12 11:47:13 +01:00
Florian Hahn	3a846d4d92	[VPlan] Reject loops without computable backedge taken counts getOrCreateTripCount is used to generate code for the outer loop, but it requires a computable backedge taken counts. Check that in the VPlan native path. Reviewers: Ayal, gilr, rengolin, sguggill Reviewed By: sguggill Differential Revision: https://reviews.llvm.org/D81088	2020-06-12 10:31:18 +01:00
EgorBo	012909dcaf	[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0" Summary: "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two) However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression: "X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj This is my first contribution to LLVM so I hope I didn't mess things up Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79369	2020-06-12 10:20:06 +03:00
Yevgeny Rouban	707836ed4e	[JumpThreading] Handle zero !prof branch_weights Avoid division by zero in updatePredecessorProfileMetadata(). Reviewers: yamauchi Tags: #llvm Differential Revision: https://reviews.llvm.org/D81499	2020-06-12 11:55:15 +07:00
Alina Sbirlea	519b019a0a	Verify MemorySSA after all updates. Verify after completing all updates. Resolves PR46275.	2020-06-11 18:48:41 -07:00
Sanjay Patel	039ff29ef6	[VectorCombine] remove unused parameters; NFC	2020-06-11 19:15:03 -04:00
Stanislav Mekhanoshin	a98d618f6e	Fixed assertion in SROA if block has no successors BasicBlock::isLegalToHoistInto() asserts if block does not have successors. The case is degenerate but assertion still needs to be avoided. https://bugs.llvm.org/show_bug.cgi?id=46280 Differential Revision: https://reviews.llvm.org/D81674	2020-06-11 15:15:19 -07:00
serge-sans-paille	bff09876d7	Fix return status of DataFlowSanitizer pass Take into account added functions, global values and attribute change. Differential Revision: https://reviews.llvm.org/D81239	2020-06-11 16:05:17 +02:00
Jay Foad	69bdfb075b	[IR] Clean up dead instructions after simplifying a conditional branch Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when removePredecessor calls PHINode::removeIncomingValue. Differential Revision: https://reviews.llvm.org/D80206	2020-06-11 14:53:01 +01:00
Jay Foad	f45c65aa41	Revert "[IR] Clean up dead instructions after simplifying a conditional branch" This reverts commit `4494e45316`. It caused problems for sanitizer buildbots.	2020-06-11 14:22:16 +01:00
Jay Foad	4494e45316	[IR] Clean up dead instructions after simplifying a conditional branch Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Differential Revision: https://reviews.llvm.org/D80206	2020-06-11 13:28:10 +01:00
Jay Foad	f79e6a8847	[MemCpyOptimizer] Simplify API of processStore and processMem* functions Previously these functions either returned a "changed" flag or a "repeat instruction" flag, and could also modify an iterator to control which instruction would be processed next. Simplify this by always returning a "changed" flag, and handling all of the "repeat instruction" functionality by modifying the iterator. No functional change intended except in this case: // If the source and destination of the memcpy are the same, then zap it. ... where the previous code failed to process the instruction after the zapped memcpy. Differential Revision: https://reviews.llvm.org/D81540	2020-06-11 12:48:09 +01:00
Chris Jackson	4707bc2177	[DebugInfo] Refactor SalvageDebugInfo and SalvageDebugInfoForDbgValues - Simplify the salvaging interface and the algorithm in InstCombine Reviewers: vsk, aprantl, Orlando, jmorse, TWeaver Reviewed by: Orlando Differential Revision: https://reviews.llvm.org/D79863	2020-06-11 11:13:46 +01:00
Craig Topper	94b1404587	[InstCombine] Remove some repeated calls to getOperand. NFCI We had already loaded operand 1 and 2 of the select as TV and FV using the more readable getTrueValue/getFalseValue.	2020-06-10 16:54:50 -07:00
serge-sans-paille	9daccb7a47	Correctly update Changed status for SimplifyCFG Interestingly, this leads to better output in one of the test cases. Differential Revision: https://reviews.llvm.org/D81237	2020-06-10 16:54:15 +02:00
Kuter Dinel	70330edc4d	Reland: [Attributor] Split the Attributor::run() into multiple functions. Summary: This patch splits the Attributor::run() function into multiple functions. Simple Logic changes to make this possible: # Moved iteration count verification earlier. # NumFinalAAs get set a little bit later. Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81022	2020-06-10 13:21:22 +00:00
Marco Elver	d3f89314ff	[KernelAddressSanitizer] Make globals constructors compatible with kernel [v2] [ v1 was reverted by `c6ec352a6b` due to modpost failing; v2 fixes this. More info: https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783 ] This makes -fsanitize=kernel-address emit the correct globals constructors for the kernel. We had to do the following: * Disable generation of constructors that rely on linker features such as dead-global elimination. * Only instrument globals not in explicit sections. The kernel uses sections for special globals, which we should not touch. * Do not instrument globals that are prefixed with "__" nor that are aliased by a symbol that is prefixed with "__". For example, modpost relies on specially named aliases to find globals and checks their contents. Unfortunately modpost relies on size stored as ELF debug info and any padding of globals currently causes the debug info to cause size reported to be with redzone which throws modpost off. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493 Tested: * With 'clang/test/CodeGen/asan-globals.cpp'. * With test_kasan.ko, we can see: BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan] * allyesconfig, allmodconfig (x86_64) Reviewed By: glider Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81390	2020-06-10 15:08:42 +02:00
sstefan1	3013f2d329	Revert "[Attributor] Split the Attributor::run() into multiple functions." This reverts commit `0ee47cc92f`.	2020-06-10 10:10:49 +00:00
stefan	0ee47cc92f	[Attributor] Split the Attributor::run() into multiple functions. Summary: This patch splits the Attributor::run() function into multiple functions. Simple Logic changes to make this possible: # Moved iteration count verification earlier. # NumFinalAAs get set a little bit later. Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81022	2020-06-10 09:48:58 +00:00
Florian Hahn	67671024c8	[DSE,MSSA] Relax post-dom restriction for objs visible after return. This patch relaxes the post-dominance requirement for accesses to objects visible after the function returns. Instead of requiring the killing def to post-dominate the access to eliminate, the set of 'killing blocks' (= blocks that completely overwrite the original access) is collected. If all paths from the access to eliminate and an exit block go through a killing block, the access can be removed. To check this property, we first get the common post-dominator block for the killing blocks. If this block does not post-dominate the access block, there may be a path from DomAccess to an exit block not involving any killing block. Otherwise we have to check if there is a path from the DomAccess to the common post-dominator, that does not contain a killing block. If there is no such path, we can remove DomAccess. For this check, we start at the common post-dominator and then traverse the CFG backwards. Paths are terminated when we hit a killing block or a block that is not executed between DomAccess and a killing block according to the post-order numbering (if the post order number of a block is greater than the one of DomAccess, the block cannot be in a path starting at DomAccess). 
This gives the following improvements on the total number of stores after DSE for MultiSource, SPEC2K, SPEC2006:
Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores
Program                                          base        new100      diff
test-suite...CFP2000/188.ammp/188.ammp.test      3624.00     3544.00     -2.2%
test-suite...ch/g721/g721encode/encode.test      128.00      126.00      -1.6%
test-suite.../Benchmarks/Olden/mst/mst.test      73.00       72.00       -1.4%
test-suite...CFP2006/433.milc/433.milc.test      3202.00     3163.00     -1.2%
test-suite...000/186.crafty/186.crafty.test      5062.00     5010.00     -1.0%
test-suite...-typeset/consumer-typeset.test      40460.00    40248.00    -0.5%
test-suite...Source/Benchmarks/sim/sim.test      642.00      639.00      -0.5%
test-suite...nchmarks/McCat/09-vor/vor.test      642.00      644.00      0.3%
test-suite...lications/sqlite3/sqlite3.test      35664.00    35563.00    -0.3%
test-suite...T2000/300.twolf/300.twolf.test      7202.00     7184.00     -0.2%
test-suite...lications/ClamAV/clamscan.test      19475.00    19444.00    -0.2%
test-suite...INT2000/164.gzip/164.gzip.test      2199.00     2196.00     -0.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test      2380.00     2378.00     -0.1%
test-suite.../Benchmarks/Bullet/bullet.test      39335.00    39309.00    -0.1%
test-suite...:: External/Povray/povray.test      36951.00    36927.00    -0.1%
test-suite...marks/7zip/7zip-benchmark.test      67396.00    67356.00    -0.1%
test-suite...6/464.h264ref/464.h264ref.test      31497.00    31481.00    -0.1%
test-suite...006/453.povray/453.povray.test      51441.00    51416.00    -0.0%
test-suite...T2006/401.bzip2/401.bzip2.test      4450.00     4448.00     -0.0%
test-suite...Applications/kimwitu++/kc.test      23481.00    23471.00    -0.0%
test-suite...chmarks/MallocBench/gs/gs.test      6286.00     6284.00     -0.0%
test-suite.../CINT2000/254.gap/254.gap.test      13719.00    13715.00    -0.0%
test-suite.../Applications/SPASS/SPASS.test      30345.00    30338.00    -0.0%
test-suite...006/450.soplex/450.soplex.test      15018.00    15016.00    -0.0%
test-suite...ications/JM/lencod/lencod.test      27780.00    27777.00    -0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test      105285.00   105276.00   -0.0%
There might be potential to pre-compute
some of the information of which blocks are on the path to an exit for each block, but the overall benefit might be comparatively small. On the set of benchmarks, the CFG check is successful in 15738 of the 20322 times we reach it. The total number of iterations in the CFG check is 187810, so on average we need less than 10 steps in the check loop. Bumping the threshold in the loop from 50 to 150 gives a few small improvements, but I don't think they warrant such a big bump at the moment. This is all pending further tuning in the future. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: george.burgess.iv Differential Revision: https://reviews.llvm.org/D78932	2020-06-10 10:39:25 +01:00
Vitaly Buka	5a3b380f49	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit `69c5ff4668`. This reverts commit `603d58b5e4`. This reverts commit `ba10bedf56`. This reverts commit `39b3c41b65`.	2020-06-10 02:32:50 -07:00
Whitney Tsang	01e64c9712	[LoopFusion] Update second loop guard non loop successor phis incoming blocks. Summary: The current LoopFusion forgets to update the incoming block of the phis in second loop guard non loop successor from second loop guard block to first loop guard block. A test case is provided to better understand the problem. Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D81421	2020-06-09 21:14:51 +00:00
Christopher Tetreault	765ac39db2	[SVE] Eliminate calls to default-false VectorType::get() from Scalar Reviewers: efriedma, kmclaughlin, sdesmalen, fhahn, bkramer, anna, gchatelet, c-rhodes, david-arm, fpetrogalli Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80336	2020-06-09 14:09:02 -07:00
Simon Pilgrim	5dc4e7c2b9	[VectorCombine] scalarizeBinop - support an all-constant src vector operand scalarizeBinop currently folds vec_bo((inselt VecC0, V0, Index), (inselt VecC1, V1, Index)) -> inselt(vec_bo(VecC0, VecC1), scl_bo(V0,V1), Index) This patch extends this to account for cases where one of the vec_bo operands is already all-constant and performs similar cost checks to determine if the scalar binop with a constant still makes sense: vec_bo((inselt VecC0, V0, Index), VecC1) -> inselt(vec_bo(VecC0, VecC1), scl_bo(V0,extractelt(V1,Index)), Index) Fixes PR42174 Differential Revision: https://reviews.llvm.org/D80885	2020-06-09 19:02:05 +01:00
Simon Pilgrim	8233439fdb	[InstCombine] Ensure allocation alignment mask is within range before applying as an attribute Fixes OSS-Fuzz #23214 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23214	2020-06-09 17:31:55 +01:00
serge-sans-paille	5b08bd0eb4	Fix MemCpyOptimizer return status Differential Revision: https://reviews.llvm.org/D81229	2020-06-09 14:24:33 +02:00

24536 Commits