llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	4511f3fa86	[SCEV] Ensure that isHighCostExpansion takes into account what is being divided A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-cost expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393	2019-03-05 12:12:18 +00:00
Sanjoy Das	719e78631d	PHI nodes are not `FPMathOperator` s Reviewers: chandlerc, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58887 llvm-svn: 355362	2019-03-05 01:15:08 +00:00
Sanjay Patel	2a70703770	[ValueTracking] do not try to peek through bitcasts in computeKnownBitsFromAssume() There are no tests for this case, and I'm not sure how it could ever work, so I'm just removing this option from the matcher. This should fix PR40940: https://bugs.llvm.org/show_bug.cgi?id=40940 llvm-svn: 355292	2019-03-03 18:59:33 +00:00
Fangrui Song	5fa53d1593	[DemandedBits] Remove some redundancy in the work list InputIsKnownDead check is shared by all operands. Compute it once. For non-integer instructions, use Visited.insert(I).second to replace a find() and an insert(). llvm-svn: 355290	2019-03-03 14:50:01 +00:00
Fangrui Song	981f216d1d	[DemandedBits] Optimize a find()+insert pattern with try_emplace and APInt::operator|= llvm-svn: 355284	2019-03-03 11:12:57 +00:00
Florian Hahn	98f11a7d75	[SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR. In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but the MaxBECounts are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259	2019-03-02 02:31:44 +00:00
Florian Hahn	3c7e92b5d6	[SCEV] Remove undef check for SCEVConstant (NFC) The value stored in SCEVConstant is of type ConstantInt*, which can never be UndefValue. So we should never hit that code. Reviewers: mkazantsev, sanjoy Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58851 llvm-svn: 355257	2019-03-02 01:57:28 +00:00
Nikita Popov	ed3ca9272f	[ValueTracking] Known bits support for unsigned saturating add/sub We have two sources of known bits: 1. For adds leading ones of either operand are preserved. For sub leading zeros of LHS and leading ones of RHS become leading zeros in the result. 2. The saturating math is a select between add/sub and an all-ones/zero value. As such we can carry out the add/sub known bits calculation, and only preserve the known one/zero bits respectively. Differential Revision: https://reviews.llvm.org/D58329 llvm-svn: 355223	2019-03-01 20:07:04 +00:00
Jonas Hahnfeld	e071cd86df	Hide two unused debugging methods, NFCI. GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205	2019-03-01 17:15:21 +00:00
Rong Xu	a6ff69f6dd	[PGO] Context sensitive PGO (part 2) Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131	2019-02-28 19:55:07 +00:00
Nikita Popov	af2b0bef43	[ValueTracking] More accurate unsigned sub overflow detection Second part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a - b overflows iff a < b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and b subject to the known bits constraint. llvm-svn: 355109	2019-02-28 18:04:20 +00:00
Bjorn Pettersson	d30f308a9f	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Nikita Popov	6c57395fb4	[ValueTracking] More accurate unsigned add overflow detection Part of D58593. Compute precise overflow conditions based on all known bits, rather than just the sign bits. Unsigned a + b overflows iff a > ~b, and we can determine whether this always/never happens based on the minimal and maximal values achievable for a and ~b subject to the known bits constraint. llvm-svn: 355072	2019-02-28 08:11:20 +00:00
Richard Trieu	b37a70f40e	Fix IR/Analysis layering issue with OptBisect OptBisect is in IR due to LLVMContext using it. However, it uses IR units from Analysis as well. This change moves getDescription functions from OptBisect to their respective IR units. Generating names for IR units will now be up to the callers, keeping the Analysis IR units in Analysis. To prevent unnecessary string generation, isEnabled function is added so that callers know when the description needs to be generated. Differential Revision: https://reviews.llvm.org/D58406 llvm-svn: 355068	2019-02-28 04:00:55 +00:00
Alina Sbirlea	fcfa7c5f92	[MemorySSA] Make insertDef insert corresponding phi nodes. Summary: The original assumption for the insertDef method was that it would not materialize Defs out of nowhere, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of nowhere". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this use case does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave correctly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040	2019-02-27 22:20:22 +00:00
Sanjay Patel	9dada83d6c	[InstSimplify] remove zero-shift-guard fold for general funnel shift As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html We can't remove the compare+select in the general case because we are treating funnel shift like a standard instruction (as opposed to a special instruction like select/phi). That means that if one of the operands of the funnel shift is poison, the result is poison regardless of whether we know that the operand is actually unused based on the instruction's particular semantics. The motivating case for this transform is the more specific rotate op (rather than funnel shift), and we are preserving the fold for that case because there is no chance of introducing extra poison when there is no anonymous extra operand to the funnel shift. llvm-svn: 354905	2019-02-26 18:26:56 +00:00
Simon Pilgrim	a066f1f9e6	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix it's the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790	2019-02-25 15:42:02 +00:00
Jordan Rupprecht	6387fa2715	[NFC] Fix typos: preceeding -> preceding llvm-svn: 354715	2019-02-23 01:28:32 +00:00
Chijun Sima	70e97163e0	[DTU] Refine the interface and logic of applyUpdates Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669	2019-02-22 13:48:38 +00:00
Sanjay Patel	68171e3cd6	[InstSimplify] use any-zero matcher for fcmp folds The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 The previous attempt at this in rL354406 had a logic bug that actually triggered a regression test failure, but I failed to notice it the first time. llvm-svn: 354467	2019-02-20 14:34:00 +00:00
Sanjay Patel	49f97395ab	Revert "[InstSimplify] use any-zero matcher for fcmp folds" This reverts commit `058bb83513`. Forgot to update another test affected by this change. llvm-svn: 354408	2019-02-20 00:20:38 +00:00
Sanjay Patel	058bb83513	[InstSimplify] use any-zero matcher for fcmp folds The m_APFloat matcher does not work with anything but strict splat vector constants, so we could miss these folds and then trigger an assertion in instcombine: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201 llvm-svn: 354406	2019-02-20 00:09:50 +00:00
Sam Parker	0b53e8454b	[BPI] Look through bitcasts in calcZeroHeuristic Constant hoisting may have hidden a constant behind a bitcast so that it isn't folded into its users. However, this prevents BPI from calculating some of its heuristics that are based upon constant values. So, I've added a simple helper function to look through these casts. Differential Revision: https://reviews.llvm.org/D58166 llvm-svn: 354119	2019-02-15 11:50:21 +00:00
Nick Desaulniers	6a84cd3b8e	Revert "[INLINER] allow inlining of address taken blocks" This reverts commit `19e95fe611`. llvm-svn: 354082	2019-02-14 23:42:21 +00:00
Nick Desaulniers	19e95fe611	[INLINER] allow inlining of address taken blocks as long as their uses do not contain calls to functions that capture the argument (potentially allowing the blockaddress to "escape" the lifetime of the caller). TODO: - add more tests - fix crash in llvm::updateCGAndAnalysisManagerForFunctionPass when invoking Transforms/Inline/blockaddress.ll llvm-svn: 354079	2019-02-14 23:35:53 +00:00
Max Kazantsev	24383cd7bb	Make widenable condition transparent for MemoryWriteTracking Side effects of the widenable condition intrinsic are modelled via InaccessibleMemOnly, and there is no way to say that it isn't really writing any memory. This patch teaches MemoryWriteTracking to ignore this intrinsic. llvm-svn: 354021	2019-02-14 11:10:29 +00:00
Max Kazantsev	b3168a400f	Teach isGuaranteedToTransferExecutionToSuccessor about widenable conditions The widenable condition intrinsic is guaranteed to return a value; notify the isGuaranteedToTransferExecutionToSuccessor function about it. llvm-svn: 354020	2019-02-14 11:10:21 +00:00
Max Kazantsev	4a1c02987e	[NFC] Simplify code & reduce nest slightly llvm-svn: 353832	2019-02-12 11:31:46 +00:00
Evandro Menezes	f4a369596f	[TargetLibraryInfo] Update run time support for Windows It seems that, since VC19, the `float` C99 math functions are supported for all targets, unlike the C89 ones. According to the discussion at https://reviews.llvm.org/D57625. llvm-svn: 353758	2019-02-11 22:12:01 +00:00
Alina Sbirlea	d77edc00a8	[MemorySSA] Remove verifyClobberSanity. Summary: This verification may fail after certain transformations due to BasicAA's fragility. Added a small explanation and a testcase that triggers the assert in checkClobberSanity (before its removal). Addresses PR40509. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, llvm-commits, Prazek Tags: #llvm Differential Revision: https://reviews.llvm.org/D57973 llvm-svn: 353739	2019-02-11 19:51:21 +00:00
Michael Kruse	77a614a6e1	Refactor setAlreadyUnrolled() and setAlreadyVectorized(). Loop::setAlreadyUnrolled() and LoopVectorizeHints::setLoopAlreadyUnrolled() both add loop metadata that stops the same loop from being transformed multiple times. This patch merges both implementations. In doing so we fix 3 potential issues: * setLoopAlreadyUnrolled() kept the llvm.loop.vectorize/interleave.* metadata even though it will not be used anymore. This already caused problems such as http://llvm.org/PR40546. Change the behavior to the one of setAlreadyUnrolled which deletes this loop metadata. * setAlreadyUnrolled() used to create a new LoopID by calling MDNode::get with nullptr as the first operand, then replacing it by the returned references using replaceOperandWith. It is possible that MDNode::get would instead return an existing node (due to de-duplication) that then gets modified. To avoid, use a fresh TempMDNode that does not get uniqued with anything else before replacing it with replaceOperandWith. * LoopVectorizeHints::matchesHintMetadataName() only compares the suffix of the attribute to set the new value for. That is, when called with "enable", would erase attributes such as "llvm.loop.unroll.enable", "llvm.loop.vectorize.enable" and "llvm.loop.distribute.enable" instead of the one to replace. Fortunately, function was only called with "isvectorized". Differential Revision: https://reviews.llvm.org/D57566 llvm-svn: 353738	2019-02-11 19:45:44 +00:00
Evandro Menezes	4b86c474ff	[TargetLibraryInfo] Update run time support for Windows It seems that the run time for Windows has changed and supports more math functions than it used to, especially on AArch64, ARM, and AMD64. Fixes PR40541. Differential revision: https://reviews.llvm.org/D57625 llvm-svn: 353733	2019-02-11 19:02:28 +00:00
Chandler Carruth	9beadff6a5	Move CFLGraph and the AA summary code over to the new `CallBase` instruction base class rather than the `CallSite` wrapper. llvm-svn: 353676	2019-02-11 09:25:41 +00:00
Chandler Carruth	2d2a4359a2	Remove `CallSite` from the CodeMetrics analysis, moving it to the new `CallBase` and simpler APIs therein. llvm-svn: 353673	2019-02-11 09:03:32 +00:00
Chandler Carruth	dac20a8254	[CallSite removal] Port InstSimplify over to use `CallBase` both in its interface and implementation. Port code with: `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353662	2019-02-11 07:54:10 +00:00
Chandler Carruth	751d95fb9b	[CallSite removal] Migrate ConstantFolding APIs and implementation to `CallBase`. Users have been updated. You can see how to update any out-of-tree usages: pass `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353661	2019-02-11 07:51:44 +00:00
Craig Topper	784929d045	Implementation of asm-goto support in LLVM This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563	2019-02-08 20:48:56 +00:00
Sergey Dmitriev	807960e6ef	[CodeExtractor] Update function's assumption cache after extracting blocks from it Summary: Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache. Reviewers: hfinkel, vsk, fhahn, davidxl, sanjoy Reviewed By: hfinkel, vsk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57215 llvm-svn: 353500	2019-02-08 06:55:18 +00:00
Sam Parker	67756c09f2	[LSR] Generate cross iteration indexes Modify GenerateConstantOffsetsImpl to create offsets that can be used by indexed addressing modes. If formulae can be generated which result in the constant offset being the same size as the recurrence, we can generate a pre-indexed access. This allows the pointer to be updated via the single pre-indexed access so that (hopefully) no add/subs are required to update it for the next iteration. For small cores, this can significantly improve the performance of DSP-like loops. Differential Revision: https://reviews.llvm.org/D55373 llvm-svn: 353403	2019-02-07 13:32:54 +00:00
Alina Sbirlea	6cba96ed52	[LICM/MSSA] Add promotion to scalars by building an AliasSetTracker with MemorySSA. Summary: Experimentally we found that promotion to scalars carries less benefits than sinking and hoisting in LICM. When using MemorySSA, we build an AliasSetTracker on demand in order to reuse the current infrastructure. We only build it if less than AccessCapForMSSAPromotion exist in the loop, a cap that is by default set to 250. This value ensures there are no runtime regressions, and there are small compile time gains for pathological cases. A much lower value (20) was found to yield a single regression in the llvm-test-suite and much higher benefits for compile times. Conservatively we set the current cap to a high value, but we will explore lowering it when MemorySSA is enabled by default. Reviewers: sanjoy, chandlerc Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56625 llvm-svn: 353339	2019-02-06 20:25:17 +00:00
Alina Sbirlea	910c6bef3e	[AliasSetTracker] Pass MustAlias to addPointer more often. Summary: Pass the alias info to addPointer when available. Will save an alias() call for must sets when adding a known Must or May alias. [Part of a series of cleanup patches] Reviewers: reames, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D56613 llvm-svn: 353335	2019-02-06 19:55:12 +00:00
Philip Reames	00ae46ba52	[AliasSetTracker] Minor style tweak to avoid a variable w/two distinct live ranges [NFC] llvm-svn: 353267	2019-02-06 03:46:40 +00:00
Richard Trieu	5f436fc57a	Move DomTreeUpdater from IR to Analysis DomTreeUpdater depends on headers from Analysis, but is in IR. This is a layering violation since Analysis depends on IR. Relocate this code from IR to Analysis to fix the layering violation. llvm-svn: 353265	2019-02-06 02:52:52 +00:00
Alina Sbirlea	b9c1bc6d3c	[BasicAA] Cache nonEscapingLocalObjects for alias() calls. Summary: Use a small cache for Values tested by nonEscapingLocalObject(). Since the calls to PointerMayBeCaptured are fairly expensive, this saves a good amount of compile time for anything relying heavily on BasicAA.alias() calls. This uses the same approach as the AliasCache, i.e. the cache is reset after each alias() call. The cache is not used or updated by modRefInfo calls since it's harder to know when to reset the cache. Testcases that show improvements with this patch are too large to include. Example compile time improvement: 7s to 6s. Reviewers: chandlerc, sunfish Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57627 llvm-svn: 353245	2019-02-05 23:52:08 +00:00
Evandro Menezes	e5bb58b115	[TargetLibraryInfo] Regroup run time functions for Windows (NFC) Regroup supported and unsupported functions by precision and C standard. llvm-svn: 353213	2019-02-05 20:24:21 +00:00
Hiroshi Inoue	02a2bb2f54	[NFC] fix trivial typos in comments llvm-svn: 353147	2019-02-05 08:30:48 +00:00
Evandro Menezes	98f356cd74	Revert "[PATCH] [TargetLibraryInfo] Update run time support for Windows" This reverts accidental commit `ff5527718d`. llvm-svn: 353118	2019-02-04 23:34:50 +00:00
Evandro Menezes	ff5527718d	[PATCH] [TargetLibraryInfo] Update run time support for Windows It seems that the run time for Windows has changed and supports more math functions than before. Since LLVM requires at least VS2015, I assume that this is the run time that would be redistributed with programs built with Clang. Thus, I based this update on the header file `math.h` that accompanies it. This patch addresses the PR40541. Unfortunately, I have no access to a Windows development environment to validate it. llvm-svn: 353114	2019-02-04 23:29:41 +00:00
David Callahan	fd3e7a9320	Adjust cardinality of internal inliner thresholds Summary: While compiling openJDK11 (also other workloads), some make files would pass both CFLAGS and LDFLAGS at link step ; resulting in duplicate options on the command line when one is using LTO and trying to influence the inliner. Most of the internal flags are ZeroOrMore, this diff changes the remaining ones. Reviewers: david2050, twoh, modocache Reviewed By: twoh Subscribers: mehdi_amini, dexonsmith, eraman, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57537 Patch by: Abdoul-Kader Keita llvm-svn: 353071	2019-02-04 18:46:25 +00:00
Max Kazantsev	437ee05885	[SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we have a huge amount of such code, we will be littering SE with these nodes. We could just state that they all are undef and save some memory. Differential Revision: https://reviews.llvm.org/D57567 Reviewed By: sanjoy llvm-svn: 353017	2019-02-04 05:04:19 +00:00

8396 Commits