llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	55cf09ae26	[ValueTracking] Simplify llvm::isPointerOffset() We still need the code after stripAndAccumulateConstantOffsets() since it doesn't handle GEPs of scalable types and non-constant but identical indexes. Differential Revision: https://reviews.llvm.org/D120523	2022-03-14 09:32:36 -07:00
Florian Hahn	7662d1687b	[MemCpyOpt] Check all access for MemoryUses in writtenBetween. Currently writtenBetween can miss clobbers of Loc between End and Start, if End is a MemoryUse. To guarantee we see all write clobbers of Loc between Start and End for MemoryUses, restrict to Start and End being in the same block and check all accesses between them. This fixes 2 mis-compiles illustrated in llvm/test/Transforms/MemCpyOpt/memcpy-byval-forwarding-clobbers.ll Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D119929	2022-02-21 16:54:30 +00:00
Florian Hahn	3ba42a564a	[MemCpyOpt] Add non-local memcpy test with memory phi.	2022-02-18 11:59:24 +00:00
Nikita Popov	97c151de3d	[MemCpyOpt] Fix broken check lines (NFC) These are leftovers from when there were separate MSSA/non-MSSA check lines.	2022-02-16 14:39:43 +01:00
Florian Hahn	85fd97e3b9	[MemCpyOpt] Add tests with incorrect memcpy->byval forwarding. Add a few test cases with clobbers (writes and lifetime.end) that should prevent memcpy->byval forwarding, but those clobbers are missed at the moment.	2022-02-16 11:35:14 +00:00
Arthur Eubanks	a9029a33ff	[OpaquePtr][ValueTracking] Check GEP source element type in isPointerOffset() Fixes a MemCpyOpt miscompile with opaque pointers. This function can be further cleaned up, but let's just fix the miscompile first. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D119652	2022-02-13 10:35:38 -08:00
Arthur Eubanks	d050010ea2	[test][MemCpyOpt] Rename test function	2022-02-12 17:44:51 -08:00
Arthur Eubanks	12ba0659b4	[test][MemCpyOpt] Precommit test	2022-02-12 17:38:45 -08:00
Nikita Popov	6b69985da4	[MemCpyOpt] Use helper for unwind check This extends support to byval arguments. It would be further extended to handle the case of non-captured noalias returns.	2022-01-26 12:43:31 +01:00
Nikita Popov	ed4efee2a3	[MemCpyOpt] Add additional call slot unwind tests (NFC) Test a possibly unwinding call with a byval and sret argument.	2022-01-26 11:49:24 +01:00
Nikita Popov	0d20407d1a	Reapply [MemCpyOpt] Look through pointer casts when checking capture This is a recommit of the patch without changes. The reason for the revert has been addressed in D117679. ----- The user scanning loop above looks through pointer casts, so we also need to strip pointer casts in the capture check. Previously the source was incorrectly considered not captured if a bitcast was passed to the call.	2022-01-20 09:30:21 +01:00
Nikita Popov	655a7024db	Reapply [MemCpyOpt] Make capture check during call slot optimization more precise This is a recommit of the patch without changes. The reason for the revert has been addressed in D117679. ----- Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first. In particular, I believe that the capture is fine as long as a) the call itself cannot depend on the pointer identity, because neither dest has been captured before/at nor src before the call and b) there is no potential use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling. Differential Revision: https://reviews.llvm.org/D115615	2022-01-20 09:30:20 +01:00
Nikita Popov	d7bff2e9d2	[MemCpyOpt] Fix metadata merging during call slot optimization Call slot optimization currently merges the metadata between the call and the load. However, we also need to merge in the metadata of the store. Part of the reason why we might have gotten away with this previously is that usually the load and the store are the same instruction (a memcpy), this can only happen if call slot optimization occurs on an actual load/store pair. This addresses the issue reported in https://reviews.llvm.org/D115615#3251386. Differential Revision: https://reviews.llvm.org/D117679	2022-01-20 09:25:13 +01:00
Nikita Popov	0db30adcfb	[MemCpyOpt] Test invalid noalias metadata after call slot opt (NFC)	2022-01-19 15:51:10 +01:00
Hans Wennborg	53a51acc36	Revert "[MemCpyOpt] Make capture check during call slot optimization more precise" This caused a miscompile due to call slot optimization replacing a call argument without considering the call's !noalias metadata, see discussion on the code review. > Call slot optimization is currently supposed to be prevented if > the call can capture the source pointer. Due to an implementation > bug, this check currently doesn't trigger if a bitcast of the source > pointer is passed instead. I'm somewhat afraid of the fallout of > fixing this bug (due to heavy reliance on call slot optimization > in rust), so I'd like to strengthen the capture reasoning a bit first. > > In particular, I believe that the capture is fine as long as a) > the call itself cannot depend on the pointer identity, because > neither dest has been captured before/at nor src before the > call and b) there is no potential use of the captured pointer > before the lifetime of the source alloca ends, either due to > lifetime.end or a return from a function. At that point the > potentially captured pointer becomes dangling. > > Differential Revision: https://reviews.llvm.org/D115615 Also reverting the dependent commit: > [MemCpyOpt] Look through pointer casts when checking capture > > The user scanning loop above looks through pointer casts, so we > also need to strip pointer casts in the capture check. Previously > the source was incorrectly considered not captured if a bitcast > was passed to the call. This reverts commit `487a34ed9d` and `00e6869463`.	2022-01-18 17:41:49 +01:00
Nikita Popov	00e6869463	[MemCpyOpt] Look through pointer casts when checking capture The user scanning loop above looks through pointer casts, so we also need to strip pointer casts in the capture check. Previously the source was incorrectly considered not captured if a bitcast was passed to the call.	2022-01-05 09:50:33 +01:00
Nikita Popov	487a34ed9d	[MemCpyOpt] Make capture check during call slot optimization more precise Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first. In particular, I believe that the capture is fine as long as a) the call itself cannot depend on the pointer identity, because neither dest has been captured before/at nor src before the call and b) there is no potential use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling. Differential Revision: https://reviews.llvm.org/D115615	2022-01-05 09:39:25 +01:00
Nikita Popov	c2e77c9122	[MemCpyOpt] Add additional call slot capture tests (NFC)	2022-01-05 09:33:04 +01:00
Nikita Popov	396370e889	[MemCpyOpt] Add additional call slot capture tests (NFC) One test shows a miscompile when bitcasts are involved, the others cases where we can perform the optimization despite a capture.	2021-12-13 10:57:06 +01:00
Nikita Popov	90ec6dff86	[OpaquePtr] Forbid mixing typed and opaque pointers Currently, opaque pointers are supported in two forms: The -force-opaque-pointers mode, where all pointers are opaque and typed pointers do not exist. And as a simple ptr type that can coexist with typed pointers. This patch removes support for the mixed mode. You either get typed pointers, or you get opaque pointers, but not both. In the (current) default mode, using ptr is forbidden. In -opaque-pointers mode, all pointers are opaque. The motivation here is that the mixed mode introduces additional issues that don't exist in fully opaque mode. D105155 is an example of a design problem. Looking at D109259, it would probably need additional work to support mixed mode (e.g. to generate GEPs for typed base but opaque result). Mixed mode will also end up inserting many casts between i8* and ptr, which would require significant additional work to consistently avoid. I don't think the mixed mode is particularly valuable, as it doesn't align with our end goal. The only thing I've found it to be moderately useful for is adding some opaque pointer tests in between typed pointer tests, but I think we can live without that. Differential Revision: https://reviews.llvm.org/D109290	2021-09-10 15:18:23 +02:00
Fraser Cormack	7fb66d4035	[MemCpyOpt] Fix a variety of scalable-type crashes This patch fixes a variety of crashes resulting from the `MemCpyOptPass` casting `TypeSize` to a constant integer, whether implicitly or explicitly. Since the `MemsetRanges` requires a constant size to work, all but one of the fixes in this patch simply involve skipping the various optimizations for scalable types as cleanly as possible. The optimization of `byval` parameters, however, has been updated to work on scalable types in theory. In practice, this optimization is only valid when the length of the `memcpy` is known to be larger than the scalable type size, which is currently never the case. This could perhaps be done in the future using the `vscale_range` attribute. Some implicit casts have been left as they were, under the knowledge they are only called on aggregate types. These should never be scalably-sized. Reviewed By: nikic, tra Differential Revision: https://reviews.llvm.org/D109329	2021-09-08 11:21:36 +01:00
Nikita Popov	88003cea1c	[MemCpyOpt] Remove MemDepAnalysis-based implementation The MemorySSA-based implementation has been enabled for a few months (since D94376). This patch drops the old MDA-based implementation entirely. I've kept this to only the basic cleanup of dropping various conditions -- the code could be further cleaned up now that there is only one implementation. Differential Revision: https://reviews.llvm.org/D102113	2021-08-07 22:35:44 +02:00
Artem Belevich	6a9cf21f5a	[CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA. Attempt to enable MemCpyOpt unconditionally in D104801 uncovered the fact that there are users that do not expect LLVM to materialize `memset` intrinsic. While other passes can do that, too, MemCpyOpt triggers it more frequently and breaks sanitizers and some downstream users. For now introduce a flag to force-enable the flag and opt-in only CUDA compilation with NVPTX back-end. Differential Revision: https://reviews.llvm.org/D106401	2021-08-06 11:13:52 -07:00
Michael Liao	d1cacd5928	[MemCpyOpt] Teach memcpyopt to handle loads from the constant memory. - Loads from the constant memory (either explicit one or as the source of memory transfer intrinsics) won't alias any stores. Reviewed By: asbirlea, efriedma Differential Revision: https://reviews.llvm.org/D107605	2021-08-06 12:43:52 -04:00
Nikita Popov	bb15861e14	[MemCpyOpt] Relax libcall checks Rather than blocking the whole MemCpyOpt pass if the libcalls are not available, only disable creation of new memset/memcpy intrinsics where only load/stores were used previously. This only affects the store merging and load-store conversion optimization. Other optimizations are derived from existing intrinsics, which are well-defined in the absence of libcalls -- not having the libcalls just means that call simplification won't convert them to intrinsics. This is a weaker variation of D104801, which dropped these checks entirely. Ideally we would not couple emission of intrinsics to libcall availability at all, but as the intrinsics may be legalized to libcalls we need to be a bit careful right now. Differential Revision: https://reviews.llvm.org/D106769	2021-08-04 21:17:51 +02:00
Philip Reames	e75a2dfe20	[tests] Stabilize tests for possible change in deref semantics There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context. This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR. This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.	2021-07-14 13:05:43 -07:00
Jon Roelofs	37b6e03c18	[Intrinsics] Make MemCpyInlineInst a MemCpyInst This opens up more optimization opportunities in passes that already handle MemCpyInst's. Differential revision: https://reviews.llvm.org/D105247	2021-07-02 10:25:24 -07:00
Nikita Popov	9aa951e80e	[MemCpyOpt] Preserve address space Preserve address space when generating the cast to i8*.	2021-06-27 20:21:19 +02:00
Nikita Popov	f025053977	[MemCpyOpt] Handle unusual memcpy element type Apparently, it is legal to use memcpy/memset with pointer types other than i8. Prior to `81fcdae68c` this case was silently miscompiled, as the i8 offset calculation was performed on some other type. Now it would crash due to a type mismatch. Fix this by inserting an explicit bitcast to i8.	2021-06-27 16:21:44 +02:00
Nikita Popov	81fcdae68c	[MemCpyOpt] Support opaque pointers	2021-06-27 15:52:38 +02:00
Nicolai Hähnle	a888e492f6	[IR] Memory intrinsics are not unconditionally `nosync` Remove the `nosync` attribute from the memory intrinsic definitions (i.e. memset, memcpy, memmove). Like native memory accesses, memory intrinsics can be volatile. This is indicated by an immarg in the intrinsic call. All else equal, a volatile memory intrinsic is `sync`, so we cannot annotate the intrinsic functions themselves as `nosync`. The attributor and function-attr passes know to take the volatile bit into account. Since `nosync` is a default attribute, this means we have to stop using the DefaultAttrIntrinsic tablegen class for memory intrinsics, and specify all default attributes other than `nosync` explicitly. Most of the test changes are trivial churn, but one test case (in nosync.ll) was in fact incorrect before this change. Differential Revision: https://reviews.llvm.org/D102295	2021-05-21 03:40:59 +02:00
Nikita Popov	656296b1c2	Reapply [CaptureTracking] Do not check domination Reapply after adjusting the synchronized.m test case, where the TODO is now resolved. The pointer is only captured on the exception handling path. ----- For the CapturesBefore tracker, it is sufficient to check that I can not reach BeforeHere. This does not necessarily require that BeforeHere dominates I, it can also occur if the capture happens on an entirely disjoint path. This change was previously accepted in D90688, but had to be reverted due to large compile-time impact in some cases: It increases the number of reachability queries that are performed. After recent changes, the compile-time impact is largely mitigated, so I'm reapplying this patch. The remaining compile-time impact is largely proportional to changes in code-size.	2021-05-16 15:46:31 +02:00
Nikita Popov	541c2845de	Revert "[CaptureTracking] Do not check domination" This reverts commit `6b8b43e7af`. This causes clang test to fail (CodeGenObjC/synchronized.m). Revert until I can figure out whether that's an expected change.	2021-05-16 11:04:45 +02:00
Nikita Popov	6b8b43e7af	[CaptureTracking] Do not check domination For the CapturesBefore tracker, it is sufficient to check that I can not reach BeforeHere. This does not necessarily require that BeforeHere dominates I, it can also occur if the capture happens on an entirely disjoint path. This change was previously accepted in D90688, but had to be reverted due to large compile-time impact in some cases: It increases the number of reachability queries that are performed. After recent changes, the compile-time impact is largely mitigated, so I'm reapplying this patch. The remaining compile-time impact is largely proportional to changes in code-size.	2021-05-16 10:49:36 +02:00
Nikita Popov	aaf5fd4316	[MemCpyOpt] Add test for unreachable capture (NFC) This is based on the test from D90688, without the argmemonly attribute. The argmemonly attribute would guarantee no modref by itself and the question of captures would not arise in the first place.	2021-05-16 10:48:52 +02:00
Olle Fredriksson	f5446b769a	[MemCpyOpt] Allow variable lengths in memcpy optimizer This makes the memcpy-memcpy and memcpy-memset optimizations work for variable sizes as long as they are equal, relaxing the old restriction that they are constant integers. If they're not equal, the old requirement that they are constant integers with certain size restrictions is used. The implementation works by pushing the length tests further down in the code, which reveals some places where it's enough that the lengths are equal (but not necessarily constant). Differential Revision: https://reviews.llvm.org/D100870	2021-04-21 23:23:38 +02:00
Philip Reames	854de7c4d0	[tests] Refresh a bunch of autogen test to adjust for format changes	2021-03-22 10:41:39 -07:00
Nikita Popov	5556660971	[MemCpyOpt] Handle read from lifetime.start with offset This fixes a regression from the MemDep-based implementation: MemDep completely ignores lifetime.start intrinsics that aren't MustAlias -- this is probably unsound, but it does mean that the MemDep based implementation successfully eliminated memcpy's from lifetime.start if the memcpy happens at an offset, rather than the base address of the alloca. Add a special case for the case where the lifetime.start spans the whole alloca (which is pretty much the only kind of lifetime.start that frontends ever emit), as we don't need to figure out our exact aliasing relationship in that case, the whole alloca is dead prior to the call. If this doesn't cover all practically relevant cases, then it would be possible to make use of the recently added PartialAlias clobber offsets to make this more precise.	2021-03-13 20:38:09 +01:00
Nikita Popov	a10bf5572d	[MemCpyOpt] Add additional tests for memcpy of undef (NFC)	2021-03-13 20:38:09 +01:00
Nikita Popov	2902bdeea1	[MemCpyOpt] Use AA to check for MustAlias between memset and memcpy Rather than checking for simple equality, check for MustAlias, as we do in other transforms. This catches equivalent GEPs.	2021-03-13 11:41:15 +01:00
Nikita Popov	9080444f33	[MemCpyOpt] Don't generate zero-size memset If a memset destination is overwritten by a memcpy and the sizes are exactly the same, then the memset is simply dead. We can directly drop it, instead of replacing it with a memset of zero size, which is particularly ugly for the case of a dynamic size.	2021-03-13 11:41:15 +01:00
Nikita Popov	dabd6abbcd	[MemCpyOpt] Add additional tests for memset+memcpy overwrite (NFC)	2021-03-13 11:32:24 +01:00
Nikita Popov	b2f933a6ce	[MemorySSA] Don't bail on phi starting access When calling getClobberingMemoryAccess() with MemoryLocation on a MemoryPHI starting access, the walker currently immediately bails and returns the starting access. This makes sense for the API that does not accept a location (as we wouldn't know what clobber we should be checking for), but doesn't make sense for the MemoryLocation-based API. This means that it can't look through a MemoryPHI if it's the starting access, but can if there is one more non-clobbering def in between. This patch removes the limitation. Differential Revision: https://reviews.llvm.org/D98557	2021-03-13 10:53:13 +01:00
Nikita Popov	dfd27ebbd0	[MemCpyOpt] Add test for memcpy in loop (NFC) This is currently not being optimized.	2021-03-12 22:54:24 +01:00
Nikita Popov	4125afc357	[MemCpyOpt] Fix handling of readnone byval arguments If the call is readnone, then there may not be any MemoryAccess associated with the call. Bail out in that case. This fixes the issue reported at https://reviews.llvm.org/D94376#2578312.	2021-02-22 18:48:31 +01:00
Dávid Bolvanský	cd54c57919	Reland "[Libcalls, Attrs] Annotate libcalls with noundef" Fixed Clang tests.	2021-02-20 06:18:48 +01:00
Dávid Bolvanský	94d034fb86	Revert "[Libcalls, Attrs] Annotate libcalls with noundef" This reverts commit `33b0c63775`. Bots are failing. Some Clang tests need to be updated too.	2021-02-20 04:18:42 +01:00
Dávid Bolvanský	33b0c63775	[Libcalls, Attrs] Annotate libcalls with noundef I think we can use here same logic as for nonnull. strlen(X) - X must be noundef => valid pointer. for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones) Reviewed By: jdoerfert, aqjune Differential Revision: https://reviews.llvm.org/D95122	2021-02-20 04:10:07 +01:00
Nikita Popov	be9889b350	[MemorySSA] Don't treat lifetime.end as NoAlias MemorySSA currently treats lifetime.end intrinsics as not aliasing anything. This breaks MemorySSA-based MemCpyOpt, because we'll happily move a read of a pointer below a lifetime.end intrinsic, as no clobber is reported. I think the MemorySSA modelling here isn't correct: lifetime.end(p) has approximately the same effect as doing a memcpy(p, undef), and should be treated as a clobber. This patch removes the special handling of lifetime.end, leaving alias analysis to handle it appropriately. Differential Revision: https://reviews.llvm.org/D95763	2021-02-04 20:58:28 +01:00
Nikita Popov	6e52eebc2a	[MemCpyOpt] Add test for incorrect optimization across lifetime (NFC) This only affects the MemorySSA-based implementation.	2021-01-29 12:57:02 +01:00

240 Commits