llvm-project

Author	SHA1	Message	Date
Chen Zheng	1e0b6c1df0	[LSR] ignore profitable chain when reg num is not major cost. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D89665	2020-10-23 09:35:48 -04:00
Roman Lebedev	e0567582b8	[NFCI][SCEV] Always refer to enum SCEVTypes as enum, not integer The main tricky thing here is forward-declaring the enum: we have to specify its underlying data type. In particular, this avoids the danger of switching over the SCEVTypes, but actually switching over an integer, and not being notified when some case is not handled. I have updated most of such switches to be exhaustive and not have a default case, where it's pretty obvious to be the intent, however not all of them.	2020-10-20 00:10:22 +03:00
Roman Lebedev	d083d55c2c	[NFC][SCEV] Rename SCEVCastExpr into SCEVIntegralCastExpr All existing SCEV cast types operate on integers. D89456 will add SCEVPtrToIntExpr cast expression type. I believe this is best for consistency. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89455	2020-10-19 10:59:53 +03:00
Markus Lavin	06758c6a61	[DebugInfo] Improve dbg preservation in LSR. Use SCEV to salvage additional @llvm.dbg.value that have turned into referencing undef after transformation (and traditional salvageDebugInfo). Before transformation compute SCEV for each @llvm.dbg.value in the loop body and store it (along side its current DIExpression). After transformation update those @llvm.dbg.value now referencing undef by comparing its stored SCEV to the SCEV of the current loop-header PHI-nodes. Allow match with offset by inserting compensation code in the DIExpression. Includes fix for the nullptr deref that caused the original commit to be reverted in `9d63029770`. Fixes : PR38815 Differential Revision: https://reviews.llvm.org/D87494	2020-10-08 13:16:43 +02:00
Nikita Popov	9d63029770	Revert "[DebugInfo] Improve dbg preservation in LSR." This reverts commit `a3caf7f610`. The ReleaseLTO-g test-suite configuration has been failing to build since this commit, because clang segfaults while building 7zip.	2020-10-05 19:02:30 +02:00
Markus Lavin	a3caf7f610	[DebugInfo] Improve dbg preservation in LSR. Use SCEV to salvage additional @llvm.dbg.value that have turned into referencing undef after transformation (and traditional salvageDebugInfo). Before transformation compute SCEV for each @llvm.dbg.value in the loop body and store it (along side its current DIExpression). After transformation update those @llvm.dbg.value now referencing undef by comparing its stored SCEV to the SCEV of the current loop-header PHI-nodes. Allow match with offset by inserting compensation code in the DIExpression. Fixes : PR38815 Differential Revision: https://reviews.llvm.org/D87494	2020-10-05 09:55:16 +02:00
Stefanos Baziotis	89c1e35f3c	[LoopInfo] empty() -> isInnermost(), add isOutermost() Differential Revision: https://reviews.llvm.org/D82895	2020-09-22 23:28:51 +03:00
Florian Hahn	57ae9bb932	[LSR] Preserve MSSA when using SplitCriticalEdge. LSR claims to MemorySSA, but we also have to make sure it is preserved when splitting critical edges. This can be done by passing MSSAU to SplitCriticalEdge. Fixes PR47557.	2020-09-21 09:51:26 +01:00
Andrew Wei	78071fb524	[LSR] Canonicalize a formula before insert it into the list In GenerateConstantOffsetsImpl, we may generate non canonical Formula if BaseRegs of that Formula is updated and includes a recurrent expr reg related with current loop while its ScaledReg is not. Patched by: mdchen Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D86939	2020-09-08 13:14:53 +08:00
Florian Hahn	f75564ad4e	Reland "[SCEVExpander] Add option to preserve LCSSA directly." This reverts the revert commit `dc28675768`. It includes a fix for Polly, which uses SCEVExpander on IR that is not in LCSSA form. Set PreserveLCSSA = false in that case, to ensure we do not introduce LCSSA phis where there were none before.	2020-07-29 20:41:53 +01:00
Florian Hahn	dc28675768	Revert "[SCEVExpander] Add option to preserve LCSSA directly." This reverts commit `99166fd4fb`, because it breaks the polly builders. polly/test/Isl/CodeGen/invariant_load_escaping_second_scop.ll fails because a apparently unnecessary LCSSA phi node is introduced. Make the bots green again, while I take a closer look.	2020-07-29 19:19:04 +01:00
Florian Hahn	99166fd4fb	[SCEVExpander] Add option to preserve LCSSA directly. This patch teaches SCEVExpander to directly preserve LCSSA. As it is currently, SCEV does not look through PHI nodes in loops, as it might break LCSSA form. Once SCEVExpander can preserve LCSSA form, it should be safe for SCEV to look through PHIs. To preserve LCSSA form, this patch uses formLCSSAForInstructions on operands of newly created instructions, if the definition is inside a different loop than the new instruction. The final value we return from expandCodeFor may also need LCSSA phis, depending on the insert point. As no user for it exists there yet, create a temporary instruction at the insert point, which can be passed to formLCSSAForInstructions. This temporary instruction is removed after LCSSA construction. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D71538	2020-07-29 15:07:37 +01:00
David Green	076e08aa45	[LSR] Filter for postinc formulae In more complicated loops we can easily hit the complexity limits of loop strength reduction. If we do and filtering occurs, it's all too easy to remove the wrong formulae for post-inc preferring accesses due to it attempting to maximise register re-use. The patch adds an alternative filtering step when the target is preferring postinc to pick postinc formulae instead, hopefully lowering the complexity to below the limit so that aggressive filtering is not needed. There is also a change in here to stop considering existing addrecs as free under postinc. We should already be modelling them as a reg so don't want it to cause us to get the cost wrong. (I'm not sure that code makes sense in general, but there are X86 tests specifically for it where it seems to be helping so have left it around for the standard non-post-inc case). Differential Revision: https://reviews.llvm.org/D80273	2020-06-17 12:32:04 +01:00
Simon Pilgrim	5006e551d3	LoopAnalysisManager.h - reduce includes to forward declarations. NFC. Move implicit include dependencies down to header/source files.	2020-06-06 14:06:46 +01:00
Florian Hahn	bcbd26bfe6	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as `b8a3c34eee`, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-05-20 10:53:40 +01:00
Pierre-vh	2668775f66	[LSR][ARM] Add new TTI hook to mark some LSR chains as profitable This patch adds a new TTI hook to allow targets to tell LSR that a chain including some instruction is already profitable and should not be optimized. This patch also adds an implementation of this TTI hook for ARM so LSR doesn't optimize chains that include the VCTP intrinsic. Differential Revision: https://reviews.llvm.org/D79418	2020-05-13 14:18:28 +01:00
David Green	146d44c251	[LSR] Don't require register reuse under postinc LSR has some logic that tries to aggressively reuse registers in formula. This can lead to sub-optimal decision in complex loops where the backend it trying to use shouldFavorPostInc. This disables the re-use in those situations. Differential Revision: https://reviews.llvm.org/D79301	2020-05-05 16:04:50 +01:00
David Green	38e532278e	[LSR] Add masked load and store handling This teaches Loop Strength Reduction the details about masked load and store address operands, so that it can have a better time optimising them as it would for normal loads and stores. Differential Revision: https://reviews.llvm.org/D75371	2020-03-04 18:36:10 +00:00
Sumanth Gundapaneni	9897daa6bf	Update LSR's logic that identifies a post-increment SCEV value. One of the checks has been removed as it seem invalid. The LoopStep size is always almost a 32-bit. Differential Revision: https://reviews.llvm.org/D75079	2020-03-02 16:34:18 -06:00
Alina Sbirlea	0d90d2457c	[LoopStrengthReduce] Teach LoopStrengthReduce to preserve MemorySSA is available.	2020-01-24 10:13:52 -08:00
Alina Sbirlea	1d09174290	[LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI] Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility method, which sets to nullptr the instructions that are not trivially dead. Use the new method in LoopStrengthReduce. Alternative: add a bool to the same method; this option adds a marginal amount of overhead to the other callers, and the method needs to be updated to return a bool status when it removes/doesn't remove instructions.	2020-01-23 16:27:32 -08:00
Florian Hahn	b8a3c34eee	Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)." This reverts commit `51ef53f3bd`, as it breaks some bots.	2020-01-04 18:44:38 +00:00
Florian Hahn	51ef53f3bd	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-01-04 18:29:35 +00:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Zi Xuan Wu	9802268ad3	recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634	2019-10-12 02:53:04 +00:00
Jinsong Ji	9912232b46	Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit `9f41deccc0`. This reverts commit `18b6fe07bc`. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091	2019-10-08 17:32:56 +00:00
Zi Xuan Wu	9f41deccc0	[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017	2019-10-08 03:28:33 +00:00
Simon Pilgrim	2441455bc8	[LSR] Silence static analyzer null dereference warnings with assertions. NFCI. Add assertions to make it clear that GenerateIVChain / NarrowSearchSpaceByPickingWinnerRegs should succeed in finding non-null values llvm-svn: 372518	2019-09-22 17:59:24 +00:00
Teresa Johnson	9c27b59cec	Change TargetLibraryInfo analysis passes to always require Function Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284	2019-09-07 03:09:36 +00:00
Chen Zheng	a2d74d3d90	[PowerPC] exclude more icmps in LSR which is converted in later hardware loop pass Differential Revision: https://reviews.llvm.org/D64795 llvm-svn: 366976	2019-07-25 01:22:08 +00:00
Rui Ueyama	49a3ad21d6	Fix parameter name comments using clang-tidy. NFC. This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch: $ git clone https://github.com/llvm/llvm-project.git $ cd llvm-project $ mkdir build $ cd build $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \ -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \ -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm $ ninja $ parallel clang-tidy -checks='-*,bugprone-argument-comment' \ -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \ ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h} llvm-svn: 366177	2019-07-16 04:46:31 +00:00
Chen Zheng	469f30abab	[PowerPC] Hardware Loop branch instruction's condition may not be icmp. This fixes pr42492. Differential Revision: https://reviews.llvm.org/D64124 llvm-svn: 365104	2019-07-04 01:51:47 +00:00
Chen Zheng	dfdccbb26b	[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop. Differential Revision: https://reviews.llvm.org/D63477 llvm-svn: 364993	2019-07-03 01:49:03 +00:00
Amara Emerson	e5248e6b41	Revert "[LSR] Tweak setup cost depth threshold to 10." Changing the threshold might not be the best long term approach. Revert for now. llvm-svn: 360589	2019-05-13 15:37:18 +00:00
Amara Emerson	b6af291772	[LSR] Tweak setup cost depth threshold to 10. The original change introduced a depth limit of 7 which caused a 22% regression in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold still ensures that the original test case doesn't hang. rdar://50359639 llvm-svn: 360444	2019-05-10 17:29:35 +00:00
David Green	63a2aa715a	[LSR] Limit the recursion for setup cost In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958	2019-04-23 08:52:21 +00:00
Denis Bakhvalov	cfd25a4b0e	Test commit by Denis Bakhvalov Change-Id: I4d85123a157d957434902fb14ba50926b2d56212 llvm-svn: 358619	2019-04-17 22:27:30 +00:00
Quentin Colombet	fda0426888	[LSR] Rewrite misses some fixup locations if it splits critical edge If LSR split critical edge during rewriting phi operands and phi node has other pending fixup operands, we need to update those pending fixups. Otherwise formulae will not be implemented completely and some instructions will not be eliminated. llvm.org/PR41445 Differential Revision: https://reviews.llvm.org/D60645 Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com> llvm-svn: 358457	2019-04-15 22:23:46 +00:00
Florian Hahn	45682fd633	[LSR] Fix signed overflow in GenerateCrossUseConstantOffsets. For the attached test case, unchecked addition of immediate starts and ends overflows, as they can be arbitrary i64 constants. Proof: https://rise4fun.com/Alive/Plqc Reviewers: qcolombet, gilr, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59218 llvm-svn: 357217	2019-03-28 22:17:29 +00:00
Florian Hahn	d9e88f7b7f	[LSR] Check for signed overflow in NarrowSearchSpaceByDetectingSupersets. We are adding a sign extended IR value to an int64_t, which can cause signed overflows, as in the attached test case, where we have a formula with BaseOffset = -1 and a constant with numeric_limits<int64_t>::min(). If the addition would overflow, skip the simplification for this formula. Note that the target triple is required to trigger the failure. Reviewers: qcolombet, gilr, kparzysz, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59211 llvm-svn: 356256	2019-03-15 12:17:36 +00:00
Sam Parker	a86ff8640d	Fix for buildbots Remove unused private field. llvm-svn: 356135	2019-03-14 11:38:55 +00:00
Sam Parker	eb0b8019e8	[NFC][LSR] Cleanup Cost API Create members for Loop, ScalarEvolution, DominatorTree, TargetTransformInfo and Formula. Differential Revision: https://reviews.llvm.org/D58389 llvm-svn: 356131	2019-03-14 11:05:07 +00:00
David Green	ffc922ec35	[LSR] Attempt to increase the accuracy of LSR's setup cost In some loops, we end up generating loop induction variables that look like: {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1} As opposed to the simpler: {(zext i16 (%i0 * %i1) to i32),+,-1} i.e we count up from -limit to 0, not the simpler counting down from limit to 0. This is because the scores, as LSR calculates them, are the same and the second is filtered in place of the first. We end up with a redundant SUB from 0 in the code. This patch tries to make the calculation of the setup cost a little more thoroughly, recursing into the scev members to better approximate the setup required. The cost function for comparing LSR costs is: return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds, C1.ScaleCost, C1.ImmCost, C1.SetupCost) < std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds, C2.ScaleCost, C2.ImmCost, C2.SetupCost); So this will only alter results if none of the other variables turn out to be different. Differential Revision: https://reviews.llvm.org/D58770 llvm-svn: 355597	2019-03-07 13:44:40 +00:00
Max Kazantsev	20b9189975	[NFC] Rename DontDeleteUselessPHIs --> KeepOneInputPHIs llvm-svn: 353801	2019-02-12 07:09:29 +00:00
Sam Parker	67756c09f2	[LSR] Generate cross iteration indexes Modify GenerateConstantOffsetsImpl to create offsets that can be used by indexed addressing modes. If formulae can be generated which result in the constant offset being the same size as the recurrence, we can generate a pre-indexed access. This allows the pointer to be updated via the single pre-indexed access so that (hopefully) no add/subs are required to update it for the next iteration. For small cores, this can significantly improve performance DSP-like loops. Differential Revision: https://reviews.llvm.org/D55373 llvm-svn: 353403	2019-02-07 13:32:54 +00:00
Max Kazantsev	d5e595b7a6	[LSR] Check SCEV on isZero() after extend. PR40514 When LSR first adds SCEVs to BaseRegs, it only does it if `isZero()` has returned false. In the end, in invocation of `InsertFormula`, it asserts that all values there are still not zero constants. However between these two points, it makes some transformations, in particular extends them to wider type. SCEV does not give us guarantee that if `S` is not a constant zero, then `sext(S)` is also not a constant zero. It might have missed some optimizing transforms when it was calculating `S` and then made them when it took `sext`. For example, it may happen if previously optimizing transforms were limited by depth or somehow else. This patch adds a bailout when we may end up with a zero SCEV after extension. Differential Revision: https://reviews.llvm.org/D57565 Reviewed By: samparker llvm-svn: 353136	2019-02-05 04:30:37 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Sam Parker	d6ebf0108e	[LoopStrengthReduce] ComplexityLimit as an option Convert ComplexityLimit into a command line value. Differential Revision: https://reviews.llvm.org/D54899 llvm-svn: 347843	2018-11-29 08:34:22 +00:00
Gil Rapaport	7b88bab386	[LSR] Combine unfolded offset into invariant register LSR reassociates constants as unfolded offsets when the constants fit as immediate add operands, which currently prevents such constants from being combined later with loop invariant registers. This patch modifies GenerateCombinations() to generate a second formula which includes the unfolded offset in the combined loop-invariant register. This commit fixes a bug in the original patch (committed at r345114, reverted at r345123). Differential Revision: https://reviews.llvm.org/D51861 llvm-svn: 346390	2018-11-08 09:01:19 +00:00
Gil Rapaport	c523036fd2	Revert r345114 Investigating fails. llvm-svn: 345123	2018-10-24 08:41:22 +00:00

794 Commits