llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	8d0fdd36a3	[IR] CmpInst: Add getFlippedSignednessPredicate() And refactor a few places to use it	2020-11-06 11:31:09 +03:00
Giorgis Georgakoudis	700d2417d8	[CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers CodeExtractor handles bitcasts in the extracted region that have lifetime markers users in the outer region as outputs. That creates unnecessary alloca/reload instructions and extra lifetime markers. The patch identifies those cases, and replaces uses in out-of-region lifetime markers with new bitcasts in the outer region. Example ``` define void @foo() { entry: %0 = alloca i32 br label %extract extract: %1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 4, i8* %1) call void @use(i32* %0) br label %exit exit: call void @use(i32* %0) call void @llvm.lifetime.end.p0i8(i64 4, i8* %1) ret void } ``` Current extraction ``` define void @foo() { entry: %.loc = alloca i8, align 8 %0 = alloca i32, align 4 br label %codeRepl codeRepl: ; preds = %entry %lt.cast = bitcast i8* %.loc to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast) %lt.cast1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1) call void @foo.extract(i32* %0, i8** %.loc) %.reload = load i8, i8* %.loc, align 8 call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast) br label %exit exit: ; preds = %codeRepl call void @use(i32* %0) call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload) ret void } define internal void @foo.extract(i32* %0, i8** %.out) { newFuncRoot: br label %extract exit.exitStub: ; preds = %extract ret void extract: ; preds = %newFuncRoot %1 = bitcast i32* %0 to i8* store i8* %1, i8** %.out, align 8 call void @use(i32* %0) br label %exit.exitStub } ``` Extraction with patch ``` define void @foo() { entry: %0 = alloca i32, align 4 br label %codeRepl codeRepl: ; preds = %entry %lt.cast1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1) call void @foo.extract(i32* %0) br label %exit exit: ; preds = %codeRepl call void @use(i32* %0) %lt.cast = bitcast i32* %0 to i8* call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast) ret void } define internal void @foo.extract(i32* %0) { newFuncRoot: br label %extract exit.exitStub: ; preds = %extract ret void extract: ; preds = %newFuncRoot %1 = bitcast i32* %0 to i8* call void @use(i32* %0) br label %exit.exitStub } ``` Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D90689	2020-11-05 17:01:08 -08:00
Sjoerd Meijer	7eb70158e4	[IndVarSimplify][SimplifyIndVar] Move WidenIV to Utils/SimplifyIndVar. NFCI. This moves WidenIV from IndVarSimplify to Utils/SimplifyIndVar so that we have createWideIV available as a generic helper utility. I.e., this is not only useful in IndVarSimplify, but could be useful for loop transformations. For example, motivation for this refactoring is the loop flatten transformation: if induction variables in a loop nest can be widened, we can avoid having to perform certain overflow checks, enabling this transformation. Differential Revision: https://reviews.llvm.org/D90421	2020-11-05 16:52:47 +00:00
Florian Hahn	be0578f0b4	[GVN] Fix MemorySSA update when replacing assume(false) with stores. When replacing an assume(false) with a store, we have to be more careful with the order we insert the new access. This patch updates the code to look at the accesses in the block to find a suitable insertion point. Alternatively we could check the defining access of the assume, but IIRC there has been some discussion about making assume() readnone, so looking at the access list might be more future proof. Fixes PR48072. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90784	2020-11-05 12:09:32 +00:00
Simon Pilgrim	4b2be681f4	[InstCombine] Remove orphan InstCombinerImpl method declarations. NFCI.	2020-11-05 10:13:16 +00:00
Arnold Schwaighofer	ea5989b43a	Start of an llvm.coro.async implementation This patch adds the `async` lowering of coroutines. This will be used by the Swift frontend to lower async functions. In contrast to the `retcon` lowering the frontend needs to be in control over control-flow at suspend points as execution might be suspended at these points. This is very much work in progress and the implementation will change as it evolves with the frontend. As such the documentation is lacking detail as some of it might change. rdar://70097093 Reapply with fix for memory sanitizer failure and sphinx failure. Differential Revision: https://reviews.llvm.org/D90612	2020-11-04 10:29:21 -08:00
Arnold Schwaighofer	42f1916640	Revert "Start of an llvm.coro.async implementation" This reverts commit `ea606cced0`. This patch causes memory sanitizer failures sanitizer-x86_64-linux-fast.	2020-11-04 08:26:20 -08:00
Arnold Schwaighofer	ea606cced0	Start of an llvm.coro.async implementation This patch adds the `async` lowering of coroutines. This will be used by the Swift frontend to lower async functions. In contrast to the `retcon` lowering the frontend needs to be in control over control-flow at suspend points as execution might be suspended at these points. This is very much work in progress and the implementation will change as it evolves with the frontend. As such the documentation is lacking detail as some of it might change. rdar://70097093 Differential Revision: https://reviews.llvm.org/D90612	2020-11-04 07:32:29 -08:00
Roman Lebedev	93f3d7f7b3	[Reassociate] Guard `add`-like `or` conversion into an `add` with profitability check This is slightly better compile-time wise, since we avoid potentially-costly knownbits analysis that will ultimately not allow us to actually do anything with said `add`.	2020-11-04 16:10:34 +03:00
Martin Storsjö	36cf1e7d0e	Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts" This reverts commit `59b22e495c`. That commit broke building for ARM and AArch64, reproducible like this: $ cat apedec-reduced.c a; b(e) { int c; unsigned d = f(); c = d >> 32 - e; return c; } g() { int h = i(); if (a) h = h << a | b(a); return h; } $ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed. Same thing for e.g. an armv7-linux-gnueabihf target.	2020-11-04 08:39:32 +02:00
Xun Li	7f34aca083	[musttail] Unify musttail call preceding return checking There is already an API in BasicBlock that checks and returns the musttail call if it precedes the return instruction. Use it instead of manually checking in each place. Differential Revision: https://reviews.llvm.org/D90693	2020-11-03 11:39:27 -08:00
Roman Lebedev	70472f34b2	[Reassociate] Convert `add`-like `or`'s into an `add`'s to allow reassociation InstCombine is quite aggressive in doing the opposite transform, folding `add` of operands with no common bits set into an `or`, and not many things support that new pattern. In this case, teaching Reassociate about it is easy, there's preexisting art for `sub`/`shl`: just convert such an `or` into an `add`: https://rise4fun.com/Alive/Xlyv	2020-11-03 22:30:51 +03:00
Sanne Wouda	2ec26d3a23	Revert "Add loop distribution to the LTO pipeline" This reverts commit `6e80318eec`.	2020-11-03 19:29:27 +00:00
Sanne Wouda	6e80318eec	Add loop distribution to the LTO pipeline The LoopDistribute pass is missing from the LTO pipeline, so -enable-loop-distribute has no effect during post-link. The pre-link loop distribution doesn't seem to survive the LTO pipeline either. With this patch (and -flto -mllvm -enable-loop-distribute) we see a 43% uplift on SPEC 2006 hmmer for AArch64. The rest of SPECINT 2006 is unaffected. Differential Revision: https://reviews.llvm.org/D89896	2020-11-03 18:54:24 +00:00
Jameson Nash	59a6ab28c4	[GVN] small improvements to comments	2020-11-03 13:21:48 -05:00
Roman Lebedev	c009d11bda	[InstCombine] Perform C-(X+C2) --> (C-C2)-X transform before using Negator In particular, it makes it fire for C=0, because negator doesn't want to perform that fold since in general it's not beneficial.	2020-11-03 16:06:52 +03:00
Roman Lebedev	e465f9c303	[InstCombine] Negator: - (C - %x) --> %x - C (PR47997) This relaxes one-use restriction on that `sub` fold, since apparently the addition of Negator broke preexisting `C-(C2-X) --> X+(C-C2)` (with C=0) fold.	2020-11-03 16:06:51 +03:00
Florian Hahn	d68bed0fa9	[SCCP] Handle bitcast of vector constants. Vectors where all elements have the same known constant range are treated as a single constant range in the lattice. When bitcasting such vectors, there is a mis-match between the width of the lattice value (single constant range) and the original operands (vector). Go to overdefined in that case. Fixes PR47991.	2020-11-03 12:58:39 +00:00
Simon Pilgrim	59b22e495c	[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns. This should allow us to begin resolving PR46896 et al. Differential Revision: https://reviews.llvm.org/D90625	2020-11-03 10:49:49 +00:00
Florian Hahn	d9cbf39a37	[SLP] Pass VecPred argument to getCmpSelInstrCost. Check if all compares in VL have the same predicate and pass it to getCmpSelInstrCost, to improve cost-modeling on targets that only support compare/select combinations for certain uniform predicates. This leads to additional vectorization in some cases ``` Same hash: 217 (filtered out) Remaining: 19 Metric: SLP.NumVectorInstructions Program base slp2 diff test-suite...marks/SciMark2-C/scimark2.test 11.00 26.00 136.4% test-suite...T2006/445.gobmk/445.gobmk.test 79.00 135.00 70.9% test-suite...ediabench/gsm/toast/toast.test 54.00 71.00 31.5% test-suite...telecomm-gsm/telecomm-gsm.test 54.00 71.00 31.5% test-suite...CI_Purple/SMG2000/smg2000.test 426.00 542.00 27.2% test-suite...ch/g721/g721encode/encode.test 30.00 24.00 -20.0% test-suite...000/186.crafty/186.crafty.test 116.00 138.00 19.0% test-suite...ications/JM/ldecod/ldecod.test 697.00 765.00 9.8% test-suite...6/464.h264ref/464.h264ref.test 822.00 886.00 7.8% test-suite...chmarks/MallocBench/gs/gs.test 154.00 162.00 5.2% test-suite...nsumer-lame/consumer-lame.test 621.00 651.00 4.8% test-suite...lications/ClamAV/clamscan.test 223.00 231.00 3.6% test-suite...marks/7zip/7zip-benchmark.test 680.00 695.00 2.2% test-suite...CFP2000/177.mesa/177.mesa.test 2121.00 2129.00 0.4% test-suite...:: External/Povray/povray.test 2406.00 2412.00 0.2% test-suite...TimberWolfMC/timberwolfmc.test 634.00 634.00 0.0% test-suite...CFP2006/433.milc/433.milc.test 1036.00 1036.00 0.0% test-suite.../Benchmarks/nbench/nbench.test 321.00 321.00 0.0% test-suite...ctions-flt/Reductions-flt.test NaN 5.00 nan% ``` Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D90124	2020-11-03 10:16:43 +00:00
Max Kazantsev	46b2e85f0f	[NFC] Refactor code in IndVars, preparing for further improvement	2020-11-03 15:08:12 +07:00
Max Kazantsev	a44b7322a2	[NFC] Split lambda into 2 parts for further reuse	2020-11-03 14:13:55 +07:00
Max Kazantsev	f847094c24	[IndVars] Use knowledge about execution on last iteration when removing checks If we know that some check will not be executed on the last iteration, we can use this fact to eliminate its check. Differential Revision: https://reviews.llvm.org/D88210 Reviewed By: ebrevnov	2020-11-03 13:38:58 +07:00
Alina Sbirlea	f514b32a89	[LICM] Add assert of AST/MSSA exclusiveness. The API `canSinkOrHoistInst` may be called by LoopSink. Add assert to avoid having two analyses passed in.	2020-11-02 18:04:43 -08:00
Akira Hatanaka	b0f1d7d562	Remove unused parameter	2020-11-02 17:40:06 -08:00
Ettore Tiotto	4274cbba1c	[PartialInliner]: Handle code regions in a switch stmt cases This patch enhances computeOutliningColdRegionsInfo() to allow it to consider regions containing a single basic block and a single predecessor as candidate for partial inlining. Reviewed By: fhann Differential Revision: https://reviews.llvm.org/D89911	2020-11-02 14:32:45 -05:00
Simon Pilgrim	55f15f99cb	[AggressiveInstCombine] foldGuardedRotateToFunnelShift - generalize rotation to funnel shift matcher. Replace matchRotate with a more general matchFunnelShift - at the moment this is still just used for rotation patterns.	2020-11-02 17:09:17 +00:00
Fangrui Song	98b9338588	[Debugify] Port -debugify-each to NewPM Preemptively switch 2 tests to the new PM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D90365	2020-11-02 08:16:43 -08:00
Florian Hahn	b3b993a7ad	Reland "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts the revert commit `408c4408fa`. This version of the patch includes a fix for a crash caused by treating ICmp/FCmp constant expressions as instructions. Original message: On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV.	2020-11-02 15:39:29 +00:00
Teresa Johnson	0949f96dc6	[MemProf] Pass down memory profile name with optional path from clang Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module flag metadata. LLVM in turn provides this name to the runtime via the new __memprof_profile_filename variable. Additionally, always pass a default filename (in $cwd if a directory name is not specified via the = form of the option). This is also consistent with the behavior of the PGO instrumentation. Since the memory profiles will generally be fairly large, it doesn't make sense to dump them to stderr. Also, importantly, the memory profiles will eventually be dumped in a compact binary format, which is another reason why it does not make sense to send these to stderr by default. Change the existing memprof tests to specify log_path=stderr when that was being relied on. Depends on D89086. Differential Revision: https://reviews.llvm.org/D89087	2020-11-01 17:38:23 -08:00
Florian Hahn	ca38652b9a	[VPlan] Assert no users remaining when deleting a VPValue. When deleting a VPValue, all users must already be deleted. Add an assertion to make sure and catch violations.	2020-11-01 17:44:53 +00:00
Florian Hahn	aab71d4443	[DSE] Use same logic as legacy impl to check if free kills a location. This patch updates DSE + MemorySSA to use the same check as the legacy implementation to determine if a location is killed by a free call. This changes the existing behavior so that a free does not kill locations before the start of the freed pointer. This should fix PR48036.	2020-10-31 20:09:25 +00:00
Florian Hahn	799033d8c5	Reland "[SLP] Consider alternatives for cost of select instructions." This reverts the revert commit `a1b53db324`. This patch includes a fix for a reported issue, caused by matchSelectPattern returning UMIN for selects of pointers in some cases by looking to some connected casts. For now, ensure integer intrinsics are only returned for selects of ints or int vectors.	2020-10-31 16:52:36 +00:00
Simon Pilgrim	538fdb0189	[InstCombine] foldSelectRotate - generalize to foldSelectFunnelShift This is the last of the rotate->funnel shift InstCombine generalizations for PR46896 We still have foldGuardedRotateToFunnelShift to deal with in AggressiveInstCombine Differential Revision: https://reviews.llvm.org/D90382	2020-10-31 12:32:34 +00:00
Simon Pilgrim	4da6a48399	[CSE] Make some basic EarlyCSE::StackNode helper methods const. NFCI. Fixes a number of cppcheck remarks.	2020-10-31 12:16:48 +00:00
Nikita Popov	27f647d117	[Inliner] Consistently apply callsite noalias metadata Previously, !noalias and !alias.scope metadata on the call site was applied as part of CloneAliasScopeMetadata(), which short-circuits if the callee does not use any noalias metadata itself. However, these two things have no relation to each other. Consistently apply !noalias and !alias.scope metadata by integrating this into an existing function that handled !llvm.access.group and !llvm.mem.parallel_loop_access metadata. The handling for all of these metadata kinds is essentially the same.	2020-10-31 10:54:45 +01:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Florian Hahn	a1b53db324	Revert "[SLP] Consider alternatives for cost of select instructions." This reverts commit `1922570489`. This appears to cause a crash in the following example a, b, c; l() { int e = a, f = l, g, h, i, j; float d = c, k = b; for (;;) for (; g < f; g++) { k[h] = d[i]; k[h - 1] = d[j]; h += e << 1; i += e; } } clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.	2020-10-30 21:26:14 +00:00
Florian Hahn	408c4408fa	Revert "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts commit `73f01e3df5`. This appears to break http://lab.llvm.org:8011/#/builders/85/builds/383.	2020-10-30 21:26:14 +00:00
Peter Collingbourne	3d049bce98	hwasan: Support for outlined checks in the Linux kernel. Add support for match-all tags and GOT-free runtime calls, which are both required for the kernel to be able to support outlined checks. This requires extending the access info to let the backend know when to enable these features. To make the code easier to maintain introduce an enum with the bit field positions for the access info. Allow outlined checks to be enabled with -mllvm -hwasan-inline-all-checks=0. Kernels that contain runtime support for outlined checks may pass this flag. Kernels lacking runtime support will continue to link because they do not pass the flag. Old versions of LLVM will ignore the flag and continue to use inline checks. With a separate kernel patch [1] I measured the code size of defconfig + tag-based KASAN, as well as boot time (i.e. time to init launch) on a DragonBoard 845c with an Android arm64 GKI kernel. The results are below: code size boot time before 92824064 6.18s after 38822400 6.65s [1] https://linux-review.googlesource.com/id/I1a30036c70ab3c3ee78d75ed9b87ef7cdc3fdb76 Depends on D90425 Differential Revision: https://reviews.llvm.org/D90426	2020-10-30 14:25:40 -07:00
Peter Collingbourne	0930763b4b	hwasan: Move fixed shadow behind opaque no-op cast as well. This is a workaround for poor heuristics in the backend where we can end up materializing the constant multiple times. This is particularly bad when using outlined checks because we materialize it for every call (because the backend considers it trivial to materialize). As a result the field containing the shadow base value will always be set so simplify the code taking that into account. Differential Revision: https://reviews.llvm.org/D90425	2020-10-30 13:23:52 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Pedro Tammela	86e0c1acdb	[NFC][Reg2Mem] modernize loops iterators This patch updates the Reg2Mem loops to use more modern iterators. Differential Revision: https://reviews.llvm.org/D90122	2020-10-30 16:50:07 +00:00
Pedro Tammela	70a495c7f0	[NFC][LoopSimplify] modernize for loops over LoopInfo This patch modifies two for loops to use the range based syntax. Since they are equivalent, this patch is tagged NFC. Differential Revision: https://reviews.llvm.org/D90069	2020-10-30 16:50:07 +00:00
Michael Liao	c82403d025	[gvn] PRE needs to skip convergent intrinsics/calls. - As convergent intrinsics/calls could only be moved to control-equivalent blocks, or more precisely the same divergent branch, PRE needs to skip them. Differential Revision: https://reviews.llvm.org/D90391	2020-10-30 11:24:40 -04:00
Evgeniy Brevnov	3d31adaec4	[DSE] Improve partial overlap detection Currently isOverwrite returns OW_MaybePartial even for accesses known not to overlap. This is not a big problem for the legacy implementation (since isPartialOverwrite follows isOverwrite and clarifies the result). In contrast, the SSA based version does a lot of work to later find out that accesses don't overlap. Besides negative impact on compile time we quickly reach MemorySSAPartialStoreLimit and miss optimization opportunities. Note: In fact, I think it would be a cleaner implementation if isOverwrite returned a fully clarified result in the first place without need to call isPartialOverwrite. This can be done as a follow up. What do you think? Reviewed By: fhahn, asbirlea Differential Revision: https://reviews.llvm.org/D90371	2020-10-30 22:23:20 +07:00
Simon Pilgrim	ed577892cf	Use cast<> instead of dyn_cast<> as we dereference the pointers immediately. NFCI. Fix clang static analyzer warnings - we're better off relying on cast<> asserting on failure rather than a null dereference crash.	2020-10-30 15:20:40 +00:00
Florian Hahn	aa1a198a64	[VPlan] Use isa<> instead getVPRecipeID in getFirstNonPhi (NFC). As per the comment in VPRecipeBase, clients should not rely on getVPRecipeID, as it may change in the future. It should only be used in classof implementations. Use isa instead in getFirstNonPhi.	2020-10-30 14:56:06 +00:00
Simon Pilgrim	b7c91a9b8e	[SCEV] SCEVExpander::InsertNoopCastOfTo - reduce scope of pointer type. NFCI. By reducing the scope of the dyn_cast<PointerType> we can make this a cast<PointerType> and avoid clang static analyzer null dereference warnings.	2020-10-30 14:55:09 +00:00
Florian Hahn	73f01e3df5	[TTI] Add VecPred argument to getCmpSelInstrCost. On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV. Reviewed By: dmgreen, RKSimon Differential Revision: https://reviews.llvm.org/D90070	2020-10-30 13:49:08 +00:00
Simon Pilgrim	9e154f1aca	[SROA] Pass Twine by const reference. NFCI. Fixes clang-tidy warnings.	2020-10-30 11:36:58 +00:00
Max Kazantsev	bd341bafbf	[NFC] Simplify code in IndVars	2020-10-30 17:49:32 +07:00
Florian Hahn	05e4f7bde9	[DSE] Remove noop stores after killing stores for a MemoryDef. Currently we fail to eliminate some noop stores if there is a kill-able store between the starting def and the load. This is because we eliminate noop stores first. In practice it seems like eliminating noop stores after the main elimination for a def covers slightly more cases. This patch improves the number of stores slightly in 2 cases for X86 -O3 -flto Same hash: 235 (filtered out) Remaining: 2 Metric: dse.NumRedundantStores Program base patch diff test-suite...ce/Benchmarks/PAQ8p/paq8p.test 2.00 3.00 50.0% test-suite...006/453.povray/453.povray.test 18.00 21.00 16.7% There might be other phase ordering issues, but it appears that they do not show up in the test-suite/SPEC2000/SPEC2006. We can always tune the ordering later. Partly fixes PR47887. Reviewed By: asbirlea, zoecarver Differential Revision: https://reviews.llvm.org/D89650	2020-10-30 09:40:15 +00:00
Roman Lebedev	81fc53a36a	[SCEV] Introduce SCEVPtrToIntExpr (PR46786) And use it to model LLVM IR's `ptrtoint` cast. This is essentially an alternative to D88806, but with no chance for all the problems it caused due to having the cast as implicit there. (see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3) As we've established by now, there are at least two reasons why we want this: * It will allow SCEV to actually model the `ptrtoint` casts and their operands, instead of treating them as `SCEVUnknown` * It should help with initial problem of PR46786 - this should eventually allow us to not lose pointer-ness of an expression in more cases As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle, we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()` should sink the cast as far down into the expression as possible, so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`. But I think that it isn't the best solution, because it doesn't really matter from memory consumption side - there probably won't be that many `SCEVPtrToIntExpr`s for it to matter, and it allows for much better discoverability. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89456	2020-10-30 11:13:35 +03:00
Vitaly Buka	36fa658db5	[NFC] Fix "ambiguous overload for ‘operator=’" From D89768	2020-10-30 00:43:32 -07:00
Vitaly Buka	1455259546	[NFC] Fix "ambiguous overload for ‘operator=’"	2020-10-30 00:36:50 -07:00
Xun Li	9f5a2beadc	[Coroutine] Properly determine whether an alloca should live on the frame The existing logic in determining whether an alloca should live on the frame only looks at explicit def-use relationships. However a value defined by an alloca may be implicitly needed across suspension points, either because an alias has across-suspension-point def-use relationship, or escaped by store/call/memory intrinsics. To properly handle all these cases, we have to properly visit the alloca pointer up-front. This patch extends the existing alloca use visitor to determine whether an alloca should live on the frame. Differential Revision: https://reviews.llvm.org/D89768	2020-10-29 23:56:05 -07:00
Stefanos Baziotis	a3345300b6	[LCSSA] Doc for special treatment of PHIs Differential Revision: https://reviews.llvm.org/D89739	2020-10-29 22:50:07 +02:00
Nikita Popov	20b386aae0	[LoopUtils] Fix neutral value for vector.reduce.fadd Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always generated with fast flags right now, and thus nsz.	2020-10-29 21:45:13 +01:00
Florian Hahn	1922570489	[SLP] Consider alternatives for cost of select instructions. Some architectures do not have general vector select instructions (e.g. AArch64). But some cmp/select patterns can be vectorized using other instructions/intrinsics. One example is using min/max instructions for certain patterns. This patch updates the cost calculations for selects in the SLP vectorizer to consider using min/max intrinsics. This patch does not change SLP vectorizer's codegen itself to actually generate those intrinsics, but relies on the backends to lower the vector cmps & selects. This keeps things simple on the SLP side and works well in practice for AArch64. This exposes additional SLP vectorization opportunities in some benchmarks on AArch64 (-O3 -flto). Metric: SLP.NumVectorInstructions Program base slp diff test-suite...ications/JM/ldecod/ldecod.test 502.00 697.00 38.8% test-suite...ications/JM/lencod/lencod.test 1023.00 1414.00 38.2% test-suite...-typeset/consumer-typeset.test 56.00 65.00 16.1% test-suite...6/464.h264ref/464.h264ref.test 804.00 822.00 2.2% test-suite...006/453.povray/453.povray.test 3335.00 3357.00 0.7% test-suite...CFP2000/177.mesa/177.mesa.test 2110.00 2121.00 0.5% test-suite...:: External/Povray/povray.test 2378.00 2382.00 0.2% Reviewed By: RKSimon, samparker Differential Revision: https://reviews.llvm.org/D89969	2020-10-29 20:39:50 +00:00
Dávid Bolvanský	7a2abf5aca	[InferAttrs] Add nocapture/writeonly to string/mem libcalls One step closer to fix PR47644. Differential Revision: https://reviews.llvm.org/D89645	2020-10-29 20:06:43 +01:00
Simon Pilgrim	dcb3dc101d	[InstCombine] visitShl - ensure inner shifts have inrange amounts Noticed when fixing OSS Fuzz #26716	2020-10-29 15:28:15 +00:00
Max Kazantsev	a5b2e795c3	[NFC][SCEV] Refactor monotonic predicate checks to return enums instead of bools This patch gets rid of output parameter which is not needed for most users and prepares this API for further refactoring.	2020-10-29 16:01:25 +07:00
Johannes Doerfert	d39f574dcc	[Attributor][FIX] Properly promote arguments pointers to arrays When we promote pointer arguments we did compute a wrong offset and use a wrong type for the array case. Bug reported and reduced by Whitney Tsang <whitneyt@ca.ibm.com>.	2020-10-29 00:45:32 -05:00
Fangrui Song	39856d5d0b	[Debugify] Move global namespace functions into llvm:: Also move exportDebugifyStats from tools/opt to Debugify.cpp	2020-10-28 19:11:41 -07:00
Florian Hahn	53f4c4b2cc	[InstCombine] Do not introduce bitcasts for swifterror arguments. The following constraints hold for swifterror values: A swifterror value (either the parameter or the alloca) can only be loaded and stored from, or used as a swifterror argument. This patch updates instcombine to not try to convert a bitcast of a function into a bitcast of a swifterror argument. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D90258	2020-10-28 21:52:12 +00:00
Benjamin Kramer	207cf71fa9	Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API" This reverts commit `d981c7b758` and `a87d7b3d44`. Test fails under msan.	2020-10-28 13:58:14 +01:00
Max Kazantsev	160a453138	Return "[IndVars] Remove monotonic checks with unknown exit count" This reverts commit `e038b60d91`. This reverts commit `a0d84d8031`. This revert was a mistake. The reason of the failures was "Use uint64_t for branch weights instead of uint32_t" Differential Revision: https://reviews.llvm.org/D87832	2020-10-28 18:51:40 +07:00
Florian Hahn	b82f80057d	[DSE] Use walker to skip noalias stores between current & clobber def. Instead of getting the defining access we should be able to use getClobberingMemoryAccess to skip non-aliasing MemoryDefs. No additional checks should be needed, because we only remove the starting def if it matches the defining access of the load. All we need to worry about is that there are no (may)alias stores between the starting def and the load and getClobberingMemoryAccess should guarantee that. Partly fixes PR47887. This improves the number of redundant stores removed in some cases (numbers below for MultiSource, SPEC2000, SPEC2006 on X86 with -flto -O3). Same hash: 226 (filtered out) Remaining: 11 Metric: dse.NumRedundantStores Program base patch1 diff test-suite...:: External/Povray/povray.test 1.00 5.00 400.0% test-suite...chmarks/MallocBench/gs/gs.test 1.00 3.00 200.0% test-suite...0/253.perlbmk/253.perlbmk.test 21.00 37.00 76.2% test-suite...0.perlbench/400.perlbench.test 24.00 37.00 54.2% test-suite.../Applications/SPASS/SPASS.test 3.00 4.00 33.3% test-suite...006/453.povray/453.povray.test 15.00 18.00 20.0% test-suite...T2006/445.gobmk/445.gobmk.test 27.00 29.00 7.4% test-suite.../CINT2006/403.gcc/403.gcc.test 136.00 137.00 0.7% test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0% test-suite.../Benchmarks/Bullet/bullet.test NaN 3.00 nan% test-suite.../Benchmarks/Ptrdist/bc/bc.test NaN 1.00 nan% Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89647	2020-10-28 11:01:25 +00:00
Luqman Aden	4c0a016927	Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC. The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining vs table-based exception handling. x86-32 is the only arch for which Windows uses the former. 32-bit ARM would use what is called Win64SEH today, which is a bit confusing so instead let's just rename it to be a bit more clear. Reviewed By: compnerd, rnk Differential Revision: https://reviews.llvm.org/D90117	2020-10-27 23:22:13 -07:00
Kazu Hirata	b2f05fae80	[JumpThreading] Remove extraneous calls to setEdgeProbability This patch removes extraneous calls to setEdgeProbability introduced in `c91487769d`. The follow-up patch, `a7b662d0f4`, has since fixed BranchProbabilityInfo::eraseBlock, so we don't need to worry about getting stale values from getEdgeProbability. Also, since getEdgeProbability(BB, BB->getSingleSuccessor()) returns edge probability 1/1 by default for BB with exactly one successor edge, we don't need to explicitly call setEdgeProbability. This patch introduces almost no functional change, but we do end up reducing debug messages from setEdgeProbability. Differential Revision: https://reviews.llvm.org/D90284	2020-10-27 21:12:54 -07:00
Johannes Doerfert	d13daa4018	[Attributor] Finalize the CGUpdater after each SCC This matches the new PM model.	2020-10-27 22:07:56 -05:00
Johannes Doerfert	50d34958df	[Attributor][NFC] Introduce a debug counter for `AA::manifest` This will simplify debugging and tracking down problems.	2020-10-27 22:07:56 -05:00
Johannes Doerfert	1d57b7f503	[Attributor][NFC] Print the right value in debug output	2020-10-27 22:07:55 -05:00
Johannes Doerfert	1c2531c9e1	[Attributor][FIX] Delete all unreachable static functions Before we used to only mark unreachable static functions as dead if all uses were known dead. Now we optimistically assume uses to be dead until proven otherwise.	2020-10-27 22:07:55 -05:00
Johannes Doerfert	bfe05b1aff	[Attributor][FIX] Do not attach range metadata to the wrong Instruction If we are looking at a call site argument it might be a load or call which is in a different context than the call site argument. We cannot simply use the call site argument range for the call or load. Bug reported and reduced by Whitney Tsang <whitneyt@ca.ibm.com>.	2020-10-27 22:07:55 -05:00
Johannes Doerfert	724fcce109	[Attributor][NFC] Clang-format	2020-10-27 22:07:55 -05:00
Johannes Doerfert	d504f7b91a	[Attributor][NFC] Hoist call out of a lambda The call is not free, unsure if this is needed but it does not make it worse either.	2020-10-27 22:07:54 -05:00
Johannes Doerfert	30e5a1f0be	[Attributor][FIX] Properly check uses in the call not uses of the call In the AANoAlias logic we determine if a pointer may have been captured before a call. We need to look at other uses in the call not uses of the call. The new code is not perfect as it does not allow trivial cases where the call has multiple arguments but it is at least not unsound and a TODO was added.	2020-10-27 22:07:54 -05:00
Johannes Doerfert	cb813ab66a	[Attributor][NFC] Improve time trace output	2020-10-27 22:07:54 -05:00
Kazu Hirata	c91487769d	[JumpThreading] Set edge probabilities when creating basic blocks This patch teaches the jump threading pass to set edge probabilities whenever the pass creates new basic blocks. Without this patch, the compiler sometimes produces non-deterministic results. The non-determinism comes from the jump threading pass using stale edge probabilities in BranchProbabilityInfo. Specifically, when the jump threading pass creates a new basic block, we don't initialize its outgoing edge probability. Edge probabilities are maintained in: DenseMap<Edge, BranchProbability> Probs; in class BranchProbabilityInfo, where Edge is an ordered pair of BasicBlock * and a successor index declared as: using Edge = std::pair<const BasicBlock *, unsigned>; Probs maps edges to their corresponding probabilities. Now, we rarely remove entries from this map, so if we happen to allocate a new basic block at the same address as a previously deleted basic block with an edge probability assigned, the newly created basic block appears to have an edge probability, albeit a stale one. This patch fixes the problem by explicitly setting edge probabilities whenever the jump threading pass creates new basic blocks. Differential Revision: https://reviews.llvm.org/D90106	2020-10-27 16:07:27 -07:00
Joseph Huber	a87d7b3d44	[OpenMP] Add Passing in Original Declaration Names To Mapper API Summary: This patch adds support for passing in the original declaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information is passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;". See clang/test/OpenMP/target_map_names.cpp for an example of the generated output for a given map clause. Reviewers: jdoervert Differential Revision: https://reviews.llvm.org/D89802	2020-10-27 16:09:19 -04:00
Nicolai Hähnle	e025d09b21	Revert multiple patches based on "Introduce CfgTraits abstraction" These logically belong together since it's a base commit plus followup fixes to less common build configurations. The patches are: Revert "CfgInterface: rename interface() to getInterface()" This reverts commit `a74fc48158`. Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5" This reverts commit `f2a06875b6`. Revert "Try to make GCC5 happy about the CfgTraits thing" This reverts commit `03a5f7ce12`. Revert "Introduce CfgTraits abstraction" This reverts commit `c0cdd22c72`.	2020-10-27 20:33:30 +01:00
Nicolai Hähnle	ce6900c6cb	Revert "DomTree: Extract (mostly) read-only logic into type-erased base classes" This reverts commit `848a68a032`.	2020-10-27 20:33:29 +01:00
Vedant Kumar	5a3ef55a52	[Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746) This patch changes MergeBlockIntoPredecessor to skip the call to RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to some compile-time issues spotted in LoopUnroll and SimplifyCFG. The call to RemoveRedundantDbgInstrs appears to have changed the worst-case behavior of the merging utility. Loosely speaking, it seems to have gone from O(#phis) to O(#insts). It might not be possible to mitigate this by scanning a block to determine whether there are any debug intrinsics to remove, since such a scan costs O(#insts). So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later. The exception is (the block-local version of) SimplifyCFG, where it might just be too expensive to call RemoveRedundantDbgInstrs. Differential Revision: https://reviews.llvm.org/D88928	2020-10-27 10:12:59 -07:00
Raphael Isemann	e038b60d91	Revert "[IndVars] Remove monotonic checks with unknown exit count" This reverts commit `c6ca26c0bf`. This breaks stage2 builds due to hitting this assert: ``` Assertion failed: (WeightSum <= UINT32_MAX && "Expected weights to scale down to 32 bits"), function calcMetadataWeights ``` when compiling AArch64RegisterBankInfo.cpp in LLVM.	2020-10-27 15:31:37 +01:00
Raphael Isemann	a0d84d8031	Revert "[NFC] Factor away lambda's redundant parameter" This reverts commit `fdc845b361`. It seems to be a follow-up to c6372b3fb495 which will be reverted.	2020-10-27 15:30:52 +01:00
Simon Pilgrim	bce770ffa6	Revert rG0905bd5c2fa42bd4c "[InstCombine] collectBitParts - add trunc support." This reverts commit `0905bd5c2f`. Causing failures in multistage buildbots that I need to investigate	2020-10-27 13:43:54 +00:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Simon Pilgrim	0905bd5c2f	[InstCombine] collectBitParts - add trunc support. This should allow us to remove the rather limited matchOrConcat fold and just use recognizeBSwapOrBitReverseIdiom.	2020-10-27 13:14:54 +00:00
Roman Lebedev	0ac56e8eaa	[InstCombine] Fold `(X >>? C1) << C2` patterns to shift+bitmask (PR37872) This essentially finalizes a revert of rL155136, because nowadays the situation has improved, SCEV can model all these patterns well, and we canonicalize rotate-like patterns into funnel shift intrinsics in InstCombine. So this should not cause any pessimization. I've verified the canonicalize-{a,l}shr-shl-to-masking.ll transforms with alive, which confirms that we can freely preserve exact-ness, and no-wrap flags. Proofs: * base: https://rise4fun.com/Alive/gPQ * exact-ness preservation: https://rise4fun.com/Alive/izi * nuw preservation: https://rise4fun.com/Alive/DmD * nsw preservation: https://rise4fun.com/Alive/SLN6N * nuw nsw preservation: https://rise4fun.com/Alive/Qp7 Refs. https://reviews.llvm.org/D46760	2020-10-27 14:42:53 +03:00
Florian Hahn	f067bc3c0a	[LoopRotation] Allow loop header duplication if vectorization is forced. -Oz normally does not allow loop header duplication so this loop wouldn't be vectorized. However the vectorization pragma should override this and allow for loop rotation. rdar://problem/49281061 Original patch by Adam Nemet. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D59832	2020-10-27 09:28:01 +00:00
Max Kazantsev	fdc845b361	[NFC] Factor away lambda's redundant parameter	2020-10-27 12:56:52 +07:00
Serguei Katkov	b69919b537	[GVN LoadPRE] Add an option to disable splitting backedge GVN Load PRE can split the backedge, breaking the loop structure in which the latch contains the conditional branch on, for example, an induction variable. Different optimizations expect this form of the loop, so it is better to preserve it for some time. This CL adds an option to control whether the backedge may be split. The default value is true, so technically it is NFC and current behavior is not changed. Reviewers: fedor.sergeev, mkazantsev, nikic, reames, fhahn Reviewed By: mkazasntsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D89854	2020-10-27 11:59:52 +07:00
Max Kazantsev	c6ca26c0bf	[IndVars] Remove monotonic checks with unknown exit count Even if the exact exit count is unknown, we can still prove that this exit will not be taken. If we can prove that the predicate is monotonic, fulfilled on first & last iteration, and no overflow happened in between, then the check can be removed. Differential Revision: https://reviews.llvm.org/D87832 Reviewed By: apilipenko	2020-10-27 11:35:16 +07:00
Arthur Eubanks	42f76e193b	Reland [AlwaysInliner] Pass callee AAResults to InlineFunction() Test copied from noalias-calls.ll with small changes. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89609	2020-10-26 20:40:46 -07:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Arthur Eubanks	4af5ba1726	Revert "[AlwaysInliner] Pass callee AAResults to InlineFunction()" This reverts commit `504fbec7a6`. Test failure.	2020-10-26 20:23:38 -07:00
Arthur Eubanks	504fbec7a6	[AlwaysInliner] Pass callee AAResults to InlineFunction() Test copied from noalias-calls.ll with small changes. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89609	2020-10-26 20:10:09 -07:00
Arthur Eubanks	3dd1c72458	Port -objc-arc-expand to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90182	2020-10-26 20:05:10 -07:00
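
The stale edge probability problem described in `c91487769d` can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `Block`, `ProbCache`, and `lookupAfterReuse` are hypothetical names invented for the illustration, not LLVM APIs, and `std::map` stands in for the `DenseMap<Edge, BranchProbability>` mentioned in the commit message. Placement new into fixed storage is used to force the "new block at the same address" situation deterministically.

```cpp
#include <cassert>
#include <map>
#include <new>
#include <utility>

// Hypothetical stand-in for a basic block; only its address matters here.
struct Block {
  int id;
};

// Hypothetical, simplified model of a probability cache keyed on
// (block address, successor index), mirroring the Edge type quoted above.
struct ProbCache {
  using Edge = std::pair<const Block *, unsigned>;
  std::map<Edge, double> probs;

  void set(const Block *bb, unsigned succ, double p) {
    probs[Edge(bb, succ)] = p;
  }

  // Returns the cached probability, or `fallback` if nothing is cached.
  double get(const Block *bb, unsigned succ, double fallback) const {
    auto it = probs.find(Edge(bb, succ));
    return it == probs.end() ? fallback : it->second;
  }

  // Drops every entry whose key refers to `bb`.
  void erase(const Block *bb) {
    auto first = probs.lower_bound(Edge(bb, 0u));
    auto last = first;
    while (last != probs.end() && last->first.first == bb)
      ++last;
    probs.erase(first, last);
  }
};

// Creates a block, caches a probability for it, destroys it, then creates a
// second block at the same address and returns what a lookup on it sees.
double lookupAfterReuse(bool eraseOnDelete) {
  alignas(Block) unsigned char storage[sizeof(Block)];
  ProbCache cache;

  Block *oldBB = new (storage) Block{1};
  const void *oldAddr = oldBB;
  cache.set(oldBB, 0, 0.25);
  if (eraseOnDelete)
    cache.erase(oldBB); // keep the cache in sync with block deletion
  oldBB->~Block();

  Block *newBB = new (storage) Block{2}; // deliberately reuses the address
  assert(static_cast<const void *>(newBB) == oldAddr);
  double p = cache.get(newBB, 0, 1.0);
  newBB->~Block();
  return p;
}
```

In this model, `lookupAfterReuse(false)` returns the stale `0.25` that was cached for the deleted block, while `lookupAfterReuse(true)` returns the fallback `1.0`. The log above records two remedies for the real pass: explicitly setting probabilities on newly created blocks (`c91487769d`), and later fixing `BranchProbabilityInfo::eraseBlock` (`a7b662d0f4`), which the sketch's `erase` loosely corresponds to.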


25686 Commits