llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	e2759f110b	[SCEV] Apply guards to max with non-unitary steps. We already apply loop-guards when computing the maximum with unitary steps. This extends the code to also do so when dealing with non-unitary steps. This allows us to infer a tighter maximum in some cases. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102267	2021-05-13 09:47:29 +01:00
Jordan Rupprecht	fec2945998	Revert "[GVN] Clobber partially aliased loads." This reverts commit `6c57044231`. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543: ``` $ cat repro.ll ; ModuleID = 'repro.ll' source_filename = "repro.ll" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.widget = type { i32 } %struct.baz = type { i32, %struct.snork } %struct.snork = type { %struct.spam } %struct.spam = type { i32, i32 } @global = external local_unnamed_addr global %struct.widget, align 4 @global.1 = external local_unnamed_addr global i8, align 1 @global.2 = external local_unnamed_addr global i32, align 4 define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 { bb: %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1 %tmp1 = bitcast %struct.snork* %tmp to i64* %tmp2 = load i64, i64* %tmp1, align 4 %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1 %tmp4 = icmp ugt i64 %tmp2, 4294967295 br label %bb5 bb5: ; preds = %bb14, %bb %tmp6 = load i32, i32* %tmp3, align 4 %tmp7 = icmp ne i32 %tmp6, 0 %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false %tmp9 = zext i1 %tmp8 to i8 store i8 %tmp9, i8* @global.1, align 1 %tmp10 = load i32, i32* @global.2, align 4 switch i32 %tmp10, label %bb11 [ i32 1, label %bb12 i32 2, label %bb12 ] bb11: ; preds = %bb5 br label %bb14 bb12: ; preds = %bb5, %bb5 %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4 br label %bb14 bb14: ; preds = %bb12, %bb11 br label %bb5 } $ opt -O2 repro.ll -disable-output opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst , unsigned int, llvm::Type , llvm::Instruction , const llvm::DataLayout &): Assertion 
`SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output ... ```	2021-05-11 16:08:53 -07:00
Stanislav Mekhanoshin	22d295f695	[AMDGPU] Constant fold Intrinsic::amdgcn_perm Differential Revision: https://reviews.llvm.org/D102203	2021-05-10 16:23:11 -07:00
Florian Hahn	93a9a8a8d9	[VecLib] Add support for vector fns from Darwin's libsystem. This patch adds support for Darwin's libsystem math vector functions to TLI. Darwin's libsystem provides a range of vector functions for libm functions. This initial patch only adds the 2 x double and 4 x float versions, which are available on both X86 and ARM64. On X86, wider vector versions are supported as well. Reviewed By: jroelofs Differential Revision: https://reviews.llvm.org/D101856	2021-05-10 21:19:58 +01:00
Andy Kaylor	7086025d65	[Dependence Analysis] Enable delinearization of fixed sized arrays Patch by Artem Radzikhovskyy! Allow delinearization of fixed sized arrays if we can prove that the GEP indices do not overflow the array dimensions. The checks applied are similar to the ones that are used for delinearization of parametric size arrays. Make sure that the GEP indices are non-negative and that they are smaller than the range of that dimension. Changes Summary: - Updated the LIT tests with more exact values, as we are able to delinearize and apply more exact tests - profitability.ll - now able to delinearize in all cases, no need to use -da-disable-delinearization-checks flag and run the test twice - loop-interchange-optimization-remarks.ll - in one of the cases we are able to delinearize without using -da-disable-delinearization-checks - SimpleSIVNoValidityCheckFixedSize.ll - removed unnecessary "-da-disable-delinearization-checks" flag. Now can get the exact answer without it. - SimpleSIVNoValidityCheckFixedSize.ll and PreliminaryNoValidityCheckFixedSize.ll - made negative tests more explicit, in order to demonstrate the need for "-da-disable-delinearization-checks" flag Differential Revision: https://reviews.llvm.org/D101486	2021-05-10 10:30:15 -07:00
Nikita Popov	d26ca78c18	[SCEV] Handle and/or in applyLoopGuards() applyLoopGuards() already combines conditions from multiple nested guards. However, it cannot use multiple conditions on the same guard, combined using and/or. Add support for this by recursing into either `and` or `or`, depending on the direction of the branch. Differential Revision: https://reviews.llvm.org/D101692	2021-05-09 21:34:28 +02:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
Florian Hahn	6c99e63120	[SCEV] Be more careful when traversing phis in isImpliedViaMerge. I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of Consider the case in exit_cond_depends_on_inner_loop. At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the previous iteration. Hence we incorrectly determine that the previous value was <= -1, which may not be true. I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works for irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if this will catch all cases and I appreciate a thorough second look in that regard. Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829	2021-05-07 19:52:29 +01:00
Krzysztof Parzyszek	50cf0a1d1a	Allow empty value list in propagateMetadata(Inst, ArrayOf...) This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.	2021-05-07 13:20:50 -05:00
Fangrui Song	d8aba75a76	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
Whitney Tsang	1006ac3963	[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect This patch allows more conditional branches to be considered as loop guards, and so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717	2021-05-07 16:04:18 +00:00
Joseph Tremoulet	bc302bfbef	BasicAA: Recognize inttoptr as isEscapeSource Pointers escape when converted to integers, so a pointer produced by converting an integer to a pointer must not be a local non-escaping object. Reviewed By: nikic, nlopes, aqjune Differential Revision: https://reviews.llvm.org/D101541	2021-05-07 07:48:50 -07:00
Peilin Guo	911a541620	[LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion getValueFromCondition() uses a Visited set to record the intermediate value. However, it works in postorder: it computes the value first and updates the Visited set later. Thus it will be trapped in infinite recursion if there exists IR with a use that is not dominated by its def, as in this example: %tmp3 = or i1 undef, %tmp4 %tmp4 = or i1 undef, %tmp3 To prevent this, we can insert an Overdefined placeholder into the set before computing the actual value. Reviewed by: nikic Differential Revision: https://reviews.llvm.org/D101273	2021-05-07 16:05:50 +08:00
Mircea Trofin	97ab068034	[NPM] Do not run function simplification pipeline unnecessarily The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1]. This patch allows the function simplification pipeline to be skipped if it was already run and the function was not modified since. The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios. [1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions \| compile time tracker ]] run with the option enabled. Differential Revision: https://reviews.llvm.org/D98103	2021-05-06 12:24:33 -07:00
Bjorn Pettersson	3ee826594a	Make dependency between certain analysis passes transitive (reapply) LazyBlockFrequencyInfoPass, LazyBranchProbabilityInfoPass and LoopAccessLegacyAnalysis all cache pointers to their nested required analysis passes. One needs to use addRequiredTransitive to describe that the nested passes can't be freed until those analysis passes no longer are used themselves. There is still a bit of a mess considering the getLazyBPIAnalysisUsage and getLazyBFIAnalysisUsage functions. Those functions are used from both Transform, CodeGen and Analysis passes. I figure it is OK to use addRequiredTransitive also when being used from Transform and CodeGen passes. On the other hand, I figure we must do it when used from other Analysis passes. So using addRequiredTransitive should be more correct here. An alternative solution would be to add a bool option in those functions to let the user tell if it is an analysis pass or not. Since those lazy passes will be obsolete when new PM has conquered the world I figure we can leave it like this right now. Intention with the patch is to fix PR49950. It at least solves the problem for the reproducer in PR49950. However, that reproducer needs five passes in a specific order, so there are lots of various "solutions" that could avoid the crash without actually fixing the root cause. This is a reapply of commit `3655f0757f`, that was reverted in `33ff3c2049` due to problems with assertions in the polly lit tests. That problem is supposed to be solved by also adjusting ScopPass to explicitly preserve LazyBlockFrequencyInfo and LazyBranchProbabilityInfo (it already preserved OptimizationRemarkEmitter which depends on those lazy passes). Differential Revision: https://reviews.llvm.org/D100958	2021-05-05 15:17:55 +02:00
Bjorn Pettersson	33ff3c2049	Revert "Make dependency between certain analysis passes transitive" This reverts commit `3655f0757f`. It caused assertion failures related to setLastUser in polly builds.	2021-05-04 19:08:41 +02:00
Bjorn Pettersson	3655f0757f	Make dependency between certain analysis passes transitive LazyBlockFrequencyInfoPass, LazyBranchProbabilityInfoPass and LoopAccessLegacyAnalysis all cache pointers to their nested required analysis passes. One needs to use addRequiredTransitive to describe that the nested passes can't be freed until those analysis passes no longer are used themselves. There is still a bit of a mess considering the getLazyBPIAnalysisUsage and getLazyBFIAnalysisUsage functions. Those functions are used from both Transform, CodeGen and Analysis passes. I figure it is OK to use addRequiredTransitive also when being used from Transform and CodeGen passes. On the other hand, I figure we must do it when used from other Analysis passes. So using addRequiredTransitive should be more correct here. An alternative solution would be to add a bool option in those functions to let the user tell if it is an analysis pass or not. Since those lazy passes will be obsolete when new PM has conquered the world I figure we can leave it like this right now. Intention with the patch is to fix PR49950. It at least solves the problem for the reproducer in PR49950. However, that reproducer needs five passes in a specific order, so there are lots of various "solutions" that could avoid the crash without actually fixing the root cause. Differential Revision: https://reviews.llvm.org/D100958	2021-05-04 11:50:08 +02:00
Simon Moll	1db4dbba24	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert `02c5ba8679` Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULES=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
Arthur Eubanks	d14d84af2f	[NewPM] Only invalidate modified functions' analyses in CGSCC passes Previously, any change in any function in an SCC would cause all analyses for all functions in the SCC to be invalidated. With this change, we now manually invalidate analyses for functions we modify, then let the pass manager know that all function analyses should be preserved. So far this only touches the inliner, argpromotion, funcattrs, and updateCGAndAnalysisManager(), since they are the most used. Slight compile time improvements: http://llvm-compile-time-tracker.com/compare.php?from=326da4adcb8def2abdd530299d87ce951c0edec9&to=8942c7669f330082ef159f3c6c57c3c28484f4be&stat=instructions Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D100917	2021-05-03 17:21:44 -07:00
Philip Reames	e38ccb729b	Recommit "Generalize getInvertibleOperand recurrence handling slightly" This was reverted because of a reported problem. It turned out this patch didn't introduce said problem, it just exposed it more widely. `15a4233` fixes the root issue, so this simply a) rebases over that, and b) adds a much more extensive comment explaining why that weakened assert is correct. Original commit message follows: Follow up to D99912, specifically the revert, fix, and reapply thereof. This generalizes the invertible recurrence logic in two ways: * By allowing mismatching operand numbers of the phi, we can recurse through a pair of phi recurrences whose operand orders have not been canonicalized. * By allowing recurrences through operand 1, we can invert these odd (but legal) recurrences. Differential Revision: https://reviews.llvm.org/D100884	2021-05-03 16:40:56 -07:00
Sanjay Patel	15a42339fe	[ValueTracking] soften assert for invertible recurrence matching There's a TODO comment in the code and discussion in D99912 about generalizing this, but I wasn't sure how to implement that, so just going with a potential minimal fix to avoid crashing. The test is a reduction beyond useful code (there's no user of %user...), but it is based on https://llvm.org/PR50191, so this is asserting on real code. Differential Revision: https://reviews.llvm.org/D101772	2021-05-03 15:57:40 -04:00
Juneyoung Lee	d4d1caafc8	Fix MSan crash after `1977c53b`	2021-05-02 13:44:43 +09:00
Arthur Eubanks	07a9df5993	[NFC] Use getParamByValType instead of pointee type To reduce dependence on pointee types for opaque pointers. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101706	2021-05-01 21:22:41 -07:00
Juneyoung Lee	7257e6a68a	[ValueTracking] ctpop propagates poison This is a patch that adds ctpop intrinsics to propagatesPoison. Split from D101191	2021-05-02 13:04:37 +09:00
Juneyoung Lee	64e768e816	[ValueTracking] Improve impliesPoison to look into overflow intrinsics This update supports the following transformation: ``` select(extract(mul_with_overflow(a, _), _), (a == 0), false) => and(extract(mul_with_overflow(a, _), _), (a == 0)) ``` which is correct because if `a` was poison the select's condition was also poison. This update is split from D101423.	2021-05-02 12:03:55 +09:00
Juneyoung Lee	1977c53b2a	[InstCombine] Fold overflow bit of [u\|s]mul.with.overflow in a poison-safe way As discussed in D101191, this patch adds a poison-safe folding of overflow bit check: ``` %Op0 = icmp ne i4 %X, 0 %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = select i1 %Op0, i1 %Op1, i1 false => %Y.fr = freeze %Y %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = %Op1 ``` https://alive2.llvm.org/ce/z/zgPUGT https://alive2.llvm.org/ce/z/h2gZ_6 Note that there are cases where inserting freeze is not necessary: e.g. %Y is `noundef`. In this case, LLVM is already good because `%ret` is already successfully folded into `and`, triggering the pre-existing optimization in InstSimplify: https://godbolt.org/z/v6qena15K Differential Revision: https://reviews.llvm.org/D101423	2021-05-02 11:54:12 +09:00
Nikita Popov	db9d00c5e7	[LVI] Handle mask not equal zero conditions If V & Mask != 0, we know that at least one of the bits in Mask must be set, so the value must be >= the lowest bit in Mask.	2021-05-01 23:08:49 +02:00
Nikita Popov	cc58e8918b	[SCEV] Simplify backedge count clearing (NFC) This seems to be a leftover from when the BackedgeTakenInfo stored multiple exit counts with manual memory management. At some point this was switched to a simple vector, and there should be no need to micro-manage the clearing anymore. We can simply drop the loop from the map and let the destructor do its job.	2021-05-01 17:50:01 +02:00
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit `43bc584dc0`. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Nikita Popov	fe230dc197	[ValueTracking] Slightly clean up programUndefinedIfUndefOrPoison() (NFC) Use contains() to check set membership, and adjust an oddly structured loop.	2021-04-30 23:05:41 +02:00
Nikita Popov	2cd7868605	[ValueTracking] Limit scan when checking poison UB (PR50155) The current code can scan an unlimited number of instructions, if the containing basic block is very large. The test case from PR50155 contains a basic block with approximately 100k instructions. To avoid this, limit the number of instructions we inspect. At the same time, drop the limit on the number of basic blocks, as this will be implicitly limited by the number of instructions as well.	2021-04-30 23:04:49 +02:00
Duncan P. N. Exon Smith	518d955f9d	Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC Stop using the compatibility spellings of `OF_{None,Text,Append}` left behind by `1f67a3cba9`. A follow-up will remove them. Differential Revision: https://reviews.llvm.org/D101650	2021-04-30 11:00:03 -07:00
Simon Moll	43bc584dc0	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Roman Lebedev	ba5b015b0d	[InlineCost] CallAnalyzer: use TTI info for extractvalue - they are free (PR50099) It seems incorrect to use TTI data in some places, and override it in others. In this case, TTI says that `extractvalue` are free, yet we bill them. While this doesn't address https://bugs.llvm.org/show_bug.cgi?id=50099 yet, it reduces the cost from 55 to 50 while the threshold is 45. Differential Revision: https://reviews.llvm.org/D101228	2021-04-30 13:55:11 +03:00
Arthur Eubanks	a3a798d49d	[InlineCost] Remove visitUnaryInstruction() The simplifyInstruction() in visitUnaryInstruction() does not trigger for all of check-llvm. Looking at all delegates to UnaryInstruction in InstVisitor, the only instructions that either don't have a visitor in CallAnalyzer, or redirect to UnaryInstruction, are VAArgInst and Alloca. VAArgInst will never get simplified, and visitUnaryInstruction(Alloca) would always return false anyway. Reviewed By: mtrofin, lebedev.ri Differential Revision: https://reviews.llvm.org/D101577	2021-04-29 20:33:30 -07:00
jasonliu	7049fbf960	[XCOFF] Handle the case when personality routine is an alias Summary: Personality routine could be an alias to another personality routine. Fix the situation when we compile the file that contains the personality routine and the file also has functions that need to refer to the personality routine. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D101401	2021-04-29 22:03:30 +00:00
Philip Reames	a047837b90	Revert "Generalize getInvertibleOperand recurrence handling slightly" This reverts commit `0c01b37eeb` while a problem reported is investigated.	2021-04-29 13:06:26 -07:00
Sanjay Patel	1089158c5a	[ConstantFolding] propagate poison through vector reduction intrinsics	2021-04-29 12:54:20 -04:00
Sanjay Patel	71597d40e8	[ConstantFolding] refactor helper for vector reductions; NFC We should handle other cases (undef/poison), so reduce the duplication of repeated switches.	2021-04-29 12:09:22 -04:00
Craig Topper	25391cec3a	[RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31. This seems like a reasonable upper bound on VL. WG discussions for the V spec would probably allow us to use 2^16 as an upper bound on VLEN, but this is good enough for now. This allows us to remove sext and zext if user happens to assign the size_t result into an int and then uses it as a VL intrinsic argument which is size_t. Reviewed By: frasercrmck, rogfer01, arcbbb Differential Revision: https://reviews.llvm.org/D101472	2021-04-29 08:07:59 -07:00
Philip Reames	0c01b37eeb	Generalize getInvertibleOperand recurrence handling slightly Follow up to D99912, specifically the revert, fix, and reapply thereof. This generalizes the invertible recurrence logic in two ways: * By allowing mismatching operand numbers of the phi, we can recurse through a pair of phi recurrences whose operand orders have not been canonicalized. * By allowing recurrences through operand 1, we can invert these odd (but legal) recurrences. Differential Revision: https://reviews.llvm.org/D100884	2021-04-28 14:38:07 -07:00
Philip Reames	0cc3e10f5e	[SCEV] Avoid range intersection idiom in getRangeForUnknownRecurrence [NFC] Addresses a review comment from D101181	2021-04-28 12:48:17 -07:00
Philip Reames	a836de0bde	[SCEV] Compute ranges for ashr recurrences Straightforward extension to the recently added infrastructure which was pioneered with shl. This was originally posted as part of D99687, but split off for ease of review. (I also decided to exclude the unknown start sign case explicitly for simplicity of understanding.) Differential Revision: https://reviews.llvm.org/D101181	2021-04-28 12:36:20 -07:00
Florian Hahn	1ed7f8ede5	[LAA] Support pointer phis in loop by analyzing each incoming pointer. SCEV does not look through non-header PHIs inside the loop. Such phis can be analyzed by adding separate accesses for each incoming pointer value. This results in 2 more loops vectorized in SPEC2000/186.crafty and avoids regressions when sinking instructions before vectorizing. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101286	2021-04-28 20:19:40 +01:00
Arthur Eubanks	cbce28f07e	[ConstFold] Use const-folded operands in more places Previously we were const folding operands but not passing them. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101394	2021-04-27 14:30:19 -07:00
Nikita Popov	e45168c4fa	[SCEV] Handle uge/ugt predicates in applyLoopGuards() These can be handled the same way as ule/ult, just using umax instead of umin. This is useful in cases where the umax prevents the upper bound from overflowing. Differential Revision: https://reviews.llvm.org/D101196	2021-04-27 22:41:05 +02:00
Andy Kaylor	0a82d885a4	[Dependence Analysis] Fix ExactSIV producing wrong analysis Patch by Artem Radzikhovskyy! Symptom: ExactSIV test produced incorrect analysis of dependencies (see LIT tests). Bug: At the end of the algorithm, when determining dependence direction, the original author forgot to divide intermediate results by gcd and round the result toward zero. Although this bug can be fixed with significantly fewer changes, I opted to write the code in such a way that reflects the original algorithm that Banerjee proposed, for easier reference in the future. This surprisingly results in shorter code, and fewer quotient and max/min calculations. Changes Summary: - fixed findGCD to return valid x and y so that they match the function description where: ax - by = gcd(a,b) - Fixed ExactSIV test, to produce proper results - Documented the extension of Banerjee's algorithm that the original code author introduced. Banerjee's original algorithm only tested whether Dst depends on Src, the extension also allows us to test whether Src depends on Dst, in one pass. - ExactRDIV test worked fine. Since it uses findGCD(), it needed to be updated. Since the ExactRDIV test differs very little from the core algorithm of ExactSIV, I modified the test to have a format consistent with ExactSIV. - Updated the LIT tests to be testing for correct values. Differential Revision: https://reviews.llvm.org/D100331	2021-04-27 12:24:00 -07:00
dfukalov	e4c606acaf	[TTI] NFC: Change getScalarizationOverhead and getOperandsScalarizationOverhead to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101283	2021-04-27 08:51:48 +03:00
Hongtao Yu	30bb5be389	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2. As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudo probe instrumentation. The optimizations unblocked are: - In-block load propagation. - In-block dead store elimination - Memory copy optimization that turns stores to consecutive memories into a memset. These optimizations are local to a block, so they shouldn't affect the profile quality. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100075	2021-04-26 16:52:33 -07:00
Vineet Kumar	84d16e2055	Implementation for TargetTransformInfo::hasActiveVectorLength() This patch adds the missing implementation for TargetTransformInfo::hasActiveVectorLength() without which using hasActiveVectorLength() causes a linker error. Patch by Vineet Kumar! Differential Revision: https://reviews.llvm.org/D100941	2021-04-26 21:20:05 +00:00
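
The reasoning in the `[LVI] Handle mask not equal zero conditions` entry (db9d00c5e7) above can be checked by brute force. The following is a minimal Python sketch, not LLVM code: it assumes values are compared as unsigned integers of a small fixed width, and the helper names `lowest_set_bit` and `check_lower_bound` are hypothetical, introduced only for illustration.

```python
def lowest_set_bit(mask: int) -> int:
    """Return the value of the lowest set bit of mask (0 when mask is 0)."""
    # For a non-negative Python int, mask & -mask isolates the lowest set bit.
    return mask & -mask


def check_lower_bound(bit_width: int) -> bool:
    """Exhaustively check that (v & mask) != 0 implies v >= lowest_set_bit(mask).

    Assumption: v and mask are unsigned values of the given bit width.
    """
    limit = 1 << bit_width
    for mask in range(limit):
        bound = lowest_set_bit(mask)
        for v in range(limit):
            if (v & mask) != 0 and v < bound:
                return False  # counterexample found
    return True


if __name__ == "__main__":
    print(lowest_set_bit(0b1100))   # 4
    print(check_lower_bound(6))     # True
```

The check holds because any `v` that shares a bit with `mask` has a set bit at least as high as the lowest bit of `mask`, so as an unsigned value it cannot be smaller than that bit.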


10520 Commits