llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	f5f7e4e6f4	[LICM] Add test for PR57780 (NFC)	2022-09-20 13:07:11 +02:00
Sebastian Peryt	99c9b37d11	[NFC][1/n] Remove -enable-new-pm=0 flags from lit tests This is the first patch in a series intended to remove the -enable-new-pm=0 flag from lit tests. This is part of a bigger effort to completely remove legacy code related to the legacy pass manager in favor of the currently default new pass manager. In this patch, the flag has been removed only from tests where no significant change was required because the checks were duplicated for both PMs. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D134150	2022-09-19 09:57:37 -07:00
Mingming Liu	8aa800614b	[AArch64][CostModel] Detects that {extract,insert}-element at lane 0 has the same cost as the other lane for vector instructions in the IR. Currently, {extract,insert}-element has zero cost at lane 0 [1]. However, there is a cost (by fmov instruction [2], or ext/ins instruction) to move values from SIMD registers to GPR registers, when the element is used explicitly as integers. See https://godbolt.org/z/faPE1nTn8, when fmov is generated for d* register -> x* register conversion. Implementation-wise, add a private method `AArch64TTIImpl::getVectorInstrCostHelper` as a helper function. This way, instruction-based method could share the core logic (e.g., returning zero cost if type is legalized to scalar). [1] `2cf320d41e/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (L1853)` [2] `2cf320d41e/llvm/lib/Target/AArch64/AArch64InstrInfo.td (L8150-L8157)` Differential Revision: https://reviews.llvm.org/D128302	2022-09-09 09:47:30 -07:00
Nikita Popov	5b1df2e951	[LICM] Regenerate test checks (NFC)	2022-09-09 15:30:17 +02:00
Nikita Popov	4ab77d1677	[LICM] Allow promotion with non-load/store users If there are non-load/store users of the promoted pointer, we currently abort promotion. However, having such users isn't really relevant to the transform. We already separately check that a) there are no instructions that modref the promoted pointer and b) that a pointer capture disables store promotion. In the affected @test_captured_in_loop test case we have a readnone capture of the promoted pointer, which means that load promotion can be performed (while store promotion cannot). Differential Revision: https://reviews.llvm.org/D133485	2022-09-09 13:09:59 +02:00
Nikita Popov	52f7eb3151	[LICM] Add test for sret with conditional store (NFC)	2022-09-08 14:53:06 +02:00
Nikita Popov	10dfcf1f87	[LICM] Add test for missed load promotion opportunity (NFC)	2022-09-02 11:36:07 +02:00
Nikita Popov	639d912282	[LICM] Allow load-only scalar promotion in the presence of unwinding Currently, we bail out of scalar promotion if the loop may unwind and the memory may be visible on unwind. This is because we can't insert stores of the promoted value on unwind edges. However, nowadays scalar promotion also has support for only promoting loads, while leaving stores in place. This kind of promotion is safe even in the presence of unwinding. Differential Revision: https://reviews.llvm.org/D133111	2022-09-02 09:27:13 +02:00
Nikita Popov	26347adf96	[LICM] Regenerate test checks (NFC)	2022-09-01 16:06:38 +02:00
Nikita Popov	315aef667e	[LICM] Fix thread safety checks for promotion of byval args This code was relying on a very subtle contract: The expectation was that for non-allocas, the unwind safety check would already perform a capture check, so we don't need to perform it later. This held true when this unwind safety was only handled for allocas and noalias calls, but became incorrect when byval support was added. To avoid this kind of issue, just remove the dependency between the unwind and thread-safety checks entirely. At worst, this means we perform a redundant capture check. If this should turn out to be problematic for compile-time, we can cache that query in a more explicit way.	2022-09-01 15:33:46 +02:00
Nikita Popov	20524a3c94	[LICM] Add another byval capture test (NFC) Variant with capture after the loop, in which case promotion is safe.	2022-09-01 15:18:10 +02:00
Nikita Popov	e1826326af	[LICM] Add test for byval scalar promotion miscompile (NFC)	2022-09-01 15:03:20 +02:00
Max Kazantsev	c52d447713	[Test] Move test for pr56243 from LICM to LoopSimplifyCFG	2022-07-18 12:37:01 +07:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limitations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Mingming Liu	b242e8502c	[AArch64][NFC] Prepare test cases (for D128302) to show more accurate cost estimation of extract-element could generate better assembly code. Pre-commit the test cases (for D128302) to show that more accurate cost estimation of extract-element could generate better code. Differential Revision: https://reviews.llvm.org/D128945	2022-07-07 09:39:29 -07:00
Nikita Popov	40a4078e14	[BasicBlockUtils] Allow splitting predecessors with callbr terminators SplitBlockPredecessors currently asserts if one of the predecessor terminators is a callbr. This limitation was originally necessary, because just like with indirectbr, it was not possible to replace successors of a callbr. However, this is no longer the case since D67252. As the requirement nowadays is that callbr must reference all blockaddrs directly in the call arguments, and these get automatically updated when setSuccessor() is called, we no longer need this limitation. The only thing we need to do here is use replaceSuccessorWith() instead of replaceUsesOfWith(), because only the former does the necessary blockaddr updating magic. I believe there are other similar limitations that can be removed, e.g. related to critical edge splitting. Differential Revision: https://reviews.llvm.org/D129205	2022-07-07 09:13:25 +02:00
Nikita Popov	cf7502a1eb	[LICM] Check opt output in test (NFC) Check what the test actually produces, not just that it doesn't crash.	2022-07-06 16:21:36 +02:00
Nikita Popov	560e694d48	[AST] Don't assert instruction reads/writes memory (PR51333) This function is well-defined for an instruction that doesn't access memory (and thus trivially doesn't alias anything in the AST), so drop the assert. We can end up with a readnone call here if we originally created a MemoryDef for an indirect call, which was later replaced with a direct readnone call. Fixes https://github.com/llvm/llvm-project/issues/51333. Differential Revision: https://reviews.llvm.org/D127947	2022-07-01 17:04:48 +02:00
Max Kazantsev	abb8bf3671	[Test] Add XFAIL test for PR56243 This test demonstrates how sinking down gc.relocate may lead to a breach of LCSSA form by tokens and, consequently, end up with an SSA breach by LoopSimplifyCFG, which creates fake edges and is unable to update missing LCSSA phis for tokens used outside of the loop.	2022-06-29 19:46:17 +07:00
Congzhe Cao	b941857b40	[LoopInterchange] New cost model for loop interchange This is another attempt to land this patch. The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be placed as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is to ensure that loop cache analysis receives a valid cache line size for correct analysis. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model completely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-28 00:08:37 -04:00
Nuno Lopes	6ef9a2ad01	[LICM] Use poison to replace unreachable values instead of undef [NFC]	2022-06-26 14:56:35 +01:00
Evgenii Stepanov	878309cc54	Revert "[LoopInterchange] New cost model for loop interchange" llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed integer overflow: 6148914691236517209 * 100 cannot be represented in type 'long' https://lab.llvm.org/buildbot/#/builders/5/builds/25185 This reverts commit `1b24fe34b0`.	2022-06-23 16:10:53 -07:00
Congzhe Cao	1b24fe34b0	[LoopInterchange] New cost model for loop interchange This is the second attempt to land this patch. The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be placed as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is to ensure that loop cache analysis receives a valid cache line size for correct analysis. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model completely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-23 16:34:57 -04:00
Serguei Katkov	24e16e4af2	[SSAUpdaterImpl] Do not generate phi node with all the same incoming values If all available vals to basic block are the same - do not build new phi node and just use this value. Reviewed By: sameerds Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D126525	2022-06-03 12:24:33 +07:00
Daniil Suchkov	f1940a5895	Revert "[LoopInterchange] New cost model for loop interchange" Reverting the commit due to numerous buildbot failures. This reverts commit `006334470d`.	2022-06-03 00:52:08 +00:00
Congzhe Cao	006334470d	[LoopInterchange] New cost model for loop interchange This patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be placed as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model completely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-02 19:07:14 -04:00
Florian Hahn	0776c48f9b	Recommit "[LICM] Only create load in ph when promoting load or store doesn't exec." This reverts the revert commit `ad95255b92`. The updated version also creates a load when the store may not execute. In those cases, we still need to introduce a load in a function where there may not have been one before, so this doesn't completely resolve issue #51248. Original message: When only a store is sunk, there is no need to create a load in the pre-header, as the result of the load will never get used. The dead load can introduce UB, if the function is marked as writeonly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123473	2022-05-29 21:57:14 +01:00
Max Kazantsev	143ca15106	Fix comment in test. NFC	2022-05-24 17:22:16 +07:00
Max Kazantsev	1968f765c3	[Test] Add LICM test for PR55672 showing problem with freeze instruction	2022-05-24 17:17:46 +07:00
Florian Hahn	3497a4f396	[LICM] Add test to exercise assertion from D123473. Add a test case that triggers an assertion with earlier versions of D123473.	2022-05-05 10:49:52 +01:00
Florian Hahn	ce3bb82e45	[LICM] Add test for writeonly fn with noalias call. Add an additional test for D123473.	2022-04-22 21:37:08 +01:00
Florian Hahn	5e54a413de	[LICM] Add additional writeonly tests, check attributes. Add additional test coverage for D123473.	2022-04-20 18:49:37 +01:00
Florian Hahn	ad95255b92	Revert "[LICM] Only create load in pre-header when promoting load." This reverts commit `4bf3b7dc92`. This might be causing another buildbot failure.	2022-04-13 20:24:28 +02:00
Florian Hahn	4bf3b7dc92	Recommit "[LICM] Only create load in pre-header when promoting load." This reverts the revert commit `1ddc719680`. This version of the patch sets the initial available value to poison, which resolves an issue with the SSAUpdater breaking LCSSA form.	2022-04-13 17:20:39 +02:00
Florian Hahn	1ddc719680	Revert "[LICM] Only create load in pre-header when promoting load." This reverts commit `42229b96bf`. This appears to cause crashes on multiple bots.	2022-04-11 17:37:23 +02:00
Florian Hahn	42229b96bf	[LICM] Only create load in pre-header when promoting load. When only a store is sunk, there is no need to create a load in the pre-header, as the result of the load will never get used. The dead load can introduce UB, if the function is marked as writeonly. Fixes #51248. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123473	2022-04-11 16:45:18 +02:00
Florian Hahn	d6cf181a8d	[LICM] Add additional test for load hoisting, simplify existing one.	2022-04-11 13:27:39 +02:00
Florian Hahn	b42c054744	[LICM] Add test for PR51248. Test for #51248. LICM introduces an unused load in a writeonly function.	2022-04-10 22:36:03 +02:00
Florian Hahn	9a63978b85	[LICM] Trim unneeded functions from test, add promote-able load. Clean up the test a bit. Also add a promote-able load, to make sure LICM always has to hoist the load.	2022-04-10 22:25:05 +02:00
Nikita Popov	5cefe7d9f5	[LoopSink] Require MemorySSA This makes MemorySSA in LoopSink required, and removes the AST-based implementation, as well as the related support code in LICM. Differential Revision: https://reviews.llvm.org/D123288	2022-04-08 09:49:44 +02:00
Nikita Popov	afb526b3f4	[LICM] Handle store of pointer to itself (PR54495) Rather than iterating over users and comparing operands, iterate over uses and check operand number. Otherwise, we'll end up promoting a store twice if it has two equal operands. This can only happen with opaque pointers, as otherwise both operands differ by a level of indirection, so a bitcast would have to be involved. Fixes https://github.com/llvm/llvm-project/issues/54495.	2022-03-22 14:00:07 +01:00
Florian Hahn	5ab421fb4e	[LICM] Add allowspeculation pass options. This adds a new option to control AllowSpeculation added in D119965 when using `-passes=...`. This allows reproducing #54023 using opt. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D121944	2022-03-18 16:51:57 +00:00
Florian Hahn	bc00f47c01	[LoopSink] Do not try to sink phi nodes. Skip phi nodes in the preheader. They may not be considered loop invariant by the assertion below. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D121010	2022-03-06 11:16:22 +00:00
Nikita Popov	46f9e45ef0	[Statepoint] Update gc.statepoint calls in tests with elementtype (NFC) This updates tests for the LangRef change in D117890.	2022-02-04 14:15:41 +01:00
Nikita Popov	44cfc3a816	[LICM] Generalize unwinding check during scalar promotion This extracts a common isNotVisibleOnUnwind() helper into AliasAnalysis, which handles allocas, byval arguments and noalias calls. After D116998 this could also handle sret arguments. We have similar logic in DSE and MemCpyOpt, which will be switched to use this helper as well. The noalias call case is a bit different from the others, because it also requires that the object is not captured. The caller is responsible for doing the appropriate check. Differential Revision: https://reviews.llvm.org/D117000	2022-01-26 11:15:03 +01:00
Nikita Popov	dee0c268ef	[LICM] Add additional tests for promotion with unwinding (NFC)	2022-01-26 11:07:09 +01:00
Nick Desaulniers	79ebc3b0dd	[llvm][test] rewrite callbr to use i rather than X constraint NFC In D115311, we're looking to modify clang to emit i constraints rather than X constraints for callbr's indirect destinations. Prior to doing so, update all of the existing tests in llvm/ to match. Reviewed By: void, jyknight Differential Revision: https://reviews.llvm.org/D115410	2022-01-11 11:31:08 -08:00
Nikita Popov	aff9f2dc01	[LICM] Regenerate test checks (NFC)	2022-01-11 11:25:19 +01:00
Nikita Popov	41a522779d	[LICM] Check for noalias call instead of alloc like fn When determining whether the memory is local to the function (and we can thus introduce spurious writes without thread-safety issues), check for a noalias call rather than the hardcoded list of memory allocation functions. Noalias calls are the more general way to determine allocation functions, as long as we're only interested in the property that the returned value is distinct from any other accessible memory. Differential Revision: https://reviews.llvm.org/D116728	2022-01-06 14:38:19 +01:00
Nikita Popov	0fa174398b	[LICM] Add test for noalias call (NFC) Add a test with a noalias call that is not a known allocation function.	2022-01-06 11:46:27 +01:00


456 Commits