Commit Graph

294 Commits

Author SHA1 Message Date
zoecarver 6c25816d7b [DSE] Look through memory PHI arguments when removing noop stores in MSSA.
Summary:
Adds support for "following" memory through MSSA PHI arguments. This will help catch more noop stores that exist between blocks.

Originally part of D79391.

Reviewers: fhahn, jfb, asbirlea

Differential Revision: https://reviews.llvm.org/D82588
2020-10-01 10:42:02 -07:00
Florian Hahn 915310bf14 Revert "[DSE] Switch to MemorySSA-backed DSE by default."
There appears to be a mis-compile with MemorySSA-backed DSE in
combination with llvm.lifetime.end. It currently appears like
DSE is doing the right thing and the llvm.lifetime.end markers
are incorrect. The reverted patch uncovers the mis-compile.

This patch temporarily switches back to the legacy DSE
implementation, while we investigate.

This reverts commit 9d172c8e9c.
2020-09-26 18:35:27 +01:00
Florian Hahn 8f0466edc0 [DSE] Unify & fix mem terminator location checks.
When looking for memory defs killed by memory terminators the code
currently incorrectly ignores the size argument of llvm.lifetime.end.

This patch updates the code to use isMemTerminator and updates
isMemTerminator to use isOverwrite() to make sure locations that are
outside the range marked as dead by llvm.lifetime.end are not
considered. Note that isOverwrite is only used for llvm.lifetime.end,
because free-like functions make the whole underlying object dead.
2020-09-26 13:47:50 +01:00
Florian Hahn b2c0193afa [DSE] Add tests with lifetime.end that only mark parts of the obj as dead.
llvm.lifetime.end accepts a size parameters to limit the size of the
location marked as dead. Add a few tests with stores to locations after
the part that has been marked as dead.
2020-09-26 13:33:13 +01:00
Dávid Bolvanský 2990518b03 [MemLoc] Support lllvm.memcpy.inline in MemoryLocation::getForArgument
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D87971
2020-09-20 14:01:48 +02:00
Dávid Bolvanský d716f1608c [MemLoc] Support bcmp in MemoryLocation::getForArgument
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D87964
2020-09-19 17:12:43 +02:00
Florian Hahn 9d172c8e9c Recommit "[DSE] Switch to MemorySSA-backed DSE by default."
This switches to using DSE + MemorySSA by default again, after
fixing the issues reported after the first commit.

Notable fixes fc82006331, a0017c2bc2.

This reverts commit 3a59628f3c.
2020-09-18 11:05:00 +01:00
Florian Hahn cb9528a042 [DSE] Add another test cases with loop carried dependence. 2020-09-16 14:50:35 +01:00
Florian Hahn 3a59628f3c Revert "[DSE] Switch to MemorySSA-backed DSE by default."
This reverts commit fb109c42d9.

Temporarily revert due to a mis-compile pointed out at D87163.
2020-09-15 18:07:56 +01:00
Florian Hahn f715d81c9d [DSE] Only eliminate candidates that always store the same loc.
AliasAnalysis/MemoryLocation does not account for loops. Two
MemoryLocation can be must-overwrite, even if the first one writes
multiple locations in a loop.

This patch prevents removing such stores, by only considering candidates
that are known to be loop invariant, or executed in the same BB.

Currently the invariant check is quite conservative and only considers
Alloca and Alloca-like instructions and arguments as invariant base pointers.
It also considers GEPs with all constant indices and invariant bases as
invariant.

This can be improved in the future, but the current implementation has
only minor impact on the total number of stores eliminated (25903 vs
26047 for the baseline). There are some 2-10% swings for some individual
benchmarks. In roughly half of the cases, the number of stores removed
increases actually, because we skip candidates that are unlikely to be
valid candidates early.
2020-09-14 12:06:58 +01:00
Florian Hahn eef30334d1 [DSE] Precommit test case for invalid elimination of store in loop. 2020-09-14 12:06:58 +01:00
Florian Hahn e082dee2b5 [DSE] Bail out on MemoryPhis when deleting stores at end of function.
When deleting stores at the end of a function, we have to do PHI
translation, otherwise we might miss reads in different iterations of a
loop. See multiblock-loop-carried-dependence.ll for details.

This fixes a mis-compile and surprisingly also increases the number of
eliminated stores from 26047 to 26572 for MultiSource/SPEC2000/SPEC2006
on X86 with -O3 -flto. This is most likely because we save budget by not
exploring through MemoryPhis, which are less likely to result in valid
candidates for elimination.

The issue was reported post-commit for fb109c42d9.
2020-09-12 19:05:59 +01:00
Florian Hahn 3de9e3e493 [DSE] Precommit test case with loop carried dependence. 2020-09-12 18:51:08 +01:00
Krzysztof Parzyszek f92908cc74 [DSE] Make sure that DSE+MSSA can handle masked stores
Differential Revision: https://reviews.llvm.org/D87414
2020-09-11 10:00:21 -05:00
Florian Hahn fb109c42d9 [DSE] Switch to MemorySSA-backed DSE by default.
The tests have been updated and I plan to move them from the MSSA
directory up.

Some end-to-end tests needed small adjustments. One difference to the
legacy DSE is that legacy DSE also deletes trivially dead instructions
that are unrelated to memory operations. Because MemorySSA-backed DSE
just walks the MemorySSA, we only visit/check memory instructions. But
removing unrelated dead instructions is not really DSE's job and other
passes will clean up.

One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll,
but I think this comes down to legacy DSE not handling instructions that
may throw correctly in that case. To cover this with MemorySSA-backed
DSE, we need an update to llvm.coro.begin to treat it's return value to
belong to the same underlying object as the passed pointer.

There are some minor cases MemorySSA-backed DSE currently misses, e.g. related
to atomic operations, but I think those can be implemented after the switch.

This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html

For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores
goes from ~17500 (legayc DSE) to ~26300 (MemorySSA-backed). More numbers
and details in the thread on llvm-dev.

Impact on CTMark:
```
                                     Legacy Pass Manager
                        exec instrs    size-text
O3                       + 0.60%        - 0.27%
ReleaseThinLTO           + 1.00%        - 0.42%
ReleaseLTO-g.            + 0.77%        - 0.33%
RelThinLTO (link only)   + 0.87%        - 0.42%
RelLO-g (link only)      + 0.78%        - 0.33%
```
http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
```
                                     New Pass Manager
                       exec instrs.   size-text
O3                       + 0.95%       - 0.25%
ReleaseThinLTO           + 1.34%       - 0.41%
ReleaseLTO-g.            + 1.71%       - 0.35%
RelThinLTO (link only)   + 0.96%       - 0.41%
RelLO-g (link only)      + 2.21%       - 0.35%
```
http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions

Reviewed By: asbirlea, xbolva00, nikic

Differential Revision: https://reviews.llvm.org/D87163
2020-09-10 22:24:32 +01:00
Florian Hahn a5ec99da6e [DSE] Support eliminating memcpy.inline.
MemoryLocation has been taught about memcpy.inline, which means we can
get the memory locations read and written by it. This means DSE can
handle memcpy.inline
2020-09-10 13:19:25 +01:00
Florian Hahn 9969c317ff [DSE,MemorySSA] Handle atomic stores explicitly in isReadClobber.
Atomic stores are modeled as MemoryDef to model the fact that they may
not be reordered, depending on the ordering constraints.

Atomic stores that are monotonic or weaker do not limit re-ordering, so
we do not have to treat them as potential read clobbers.

Note that llvm/test/Transforms/DeadStoreElimination/MSSA/atomic.ll
already contains a set of negative test cases.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87386
2020-09-09 23:01:58 +01:00
Krzysztof Parzyszek db7defd9ba [DSE] Explicitly not use MSSA in testcase for now
It fails for some reason, but it shouldn't stop switching to MSSA in DSE.
2020-09-09 13:45:55 -05:00
Krzysztof Parzyszek 81ff2d30a9 [DSE] Handle masked stores 2020-09-09 13:31:31 -05:00
Krzysztof Parzyszek 27cd187587 [DSE] Add testcase that uses masked loads and stores 2020-09-09 10:30:32 -05:00
Florian Hahn efb8e156da [DSE,MemorySSA] Add an early check for read clobbers to traversal.
Depending on the benchmark, this early exit can save a substantial
amount of compile-time:

http://llvm-compile-time-tracker.com/compare.php?from=505f2d817aa8e07ba98e5fd4a8f6ff0666f89df1&to=eb4e441147f9b4b7a5fcbbc57428cadbe9e01f10&stat=instructions
2020-09-07 23:22:10 +01:00
Florian Hahn 1ddb3a369f [LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments.
This adjusts the description of `llvm.memcpy` to also allow operands
to be equal. This is in line with what Clang currently expects.

This change is intended to be temporary and followed by re-introduce
a variant with the non-overlapping guarantee for cases where we can
actually ensure that property in the front-end.

See the links below for more details:
http://lists.llvm.org/pipermail/cfe-dev/2020-August/066614.html
and PR11763.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D86815
2020-09-05 19:18:23 +01:00
Florian Hahn 00eb6fef08 [DSE,MemorySSA] Check for throwing instrs between killing/killed def.
We also have to check all uses between the killing & killed def and
check if any of them is throwing.
2020-09-04 18:54:59 +01:00
Florian Hahn 51932fc6bd [DSE,MemorySSA] Remove some duplicated test functions.
Some tests from multibuild-malloc-free.ll do not actually use malloc or
free and where split out to multiblock-throwing.ll, but not removed from
the original file. This patch cleans that up. It also moves @test22 to
simple.ll, because it does not involve multiple blocks.
2020-09-04 17:52:59 +01:00
Florian Hahn 6cb54cfe0b [DSE] Move legacy tests to DeadStoreElimination/MemDepAnalysis.
This patch moves the tests for the old MemDepAnalysis based DSE
implementation to the MemDepAnalysis subdirectory and updates them to
pass -enable-dse-memoryssa=false.

This is in preparation for the switch to MemorySSA-backed DSE.
2020-09-04 14:38:03 +01:00
Florian Hahn ab86e64a96 [DSE] Remove some dead code from DSE tests.
Some tests depend on DSE removing dead instructions unrelated to any
memory optimization. That's not really DSE's job, remove it.
2020-09-04 09:39:40 +01:00
Florian Hahn 43aa7227df [DSE,MemorySSA] Check if Current is valid for elimination first.
This changes getDomMemoryDef to check if a Current is a valid
candidate for elimination before checking for reads. Before the change,
we were spending a lot of compile-time in checking for read accesses for
Current that might not even be removable.

This patch flips the logic, so we skip Current if they cannot be
removed before checking all their uses. This is much more efficient in
practice.

It also adds a more aggressive limit for checking partially overlapping
stores. The main problem with overlapping stores is that we do not know
if they will lead to elimination until seeing all of them. This patch
limits adds a new limit for overlapping store candidates, which keeps
the number of modified overlapping stores roughly the same.

This is another substantial compile-time improvement (while also
increasing the number of stores eliminated). Geomean -O3 -0.67%,
ReleaseThinLTO -0.97%.

http://llvm-compile-time-tracker.com/compare.php?from=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&to=2e630629b43f64b60b282e90f0d96082fde2dacc&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86487
2020-08-28 11:19:04 +01:00
Florian Hahn fd6ebea50d [MemLoc] Support memcmp in MemoryLocation::getForArgument.
This patch adds support for memcmp in MemoryLocation::getForArgument.
memcmp reads from the first 2 arguments up to the number of bytes of the
third argument.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86725
2020-08-28 10:19:54 +01:00
Florian Hahn bfbd63d51a [DSE,MemorySSA] Add memcmp test case. 2020-08-28 10:02:41 +01:00
Florian Hahn bb024c3c4e [DSE,MemorySSA] Remove short-cut to check if all paths are covered.
The post-order number early continue does not work in some cases, e.g.
if a path from EarlierAccess to an exit includes a node that dominates
EarlierAccess in a cycle.

The short-cut only has very minor impact on compile-time, so it seems
straight-forward to remove it for now:

http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions

Fixes PR47285.
2020-08-27 12:42:40 +01:00
Florian Hahn 73f09ce8f3 [DSE,MemorySSA] Add test for PR47285. 2020-08-27 11:27:06 +01:00
Florian Hahn e717fdb0f1 [DSE,MemorySSA] Traverse use-def chain without MemSSA Walker.
For DSE with MemorySSA it is beneficial to manually traverse the
defining access, instead of using a MemorySSA walker, so we can
better control the number of steps together with other limits and
also weed out invalid/unprofitable paths early on.

This patch requires a follow-up patch to be most effective, which I will
share soon after putting this patch up.

This temporarily XFAIL's the limit tests, because we now explore more
MemoryDefs that may not alias/clobber the killing def. This will be
improved/fixed by the follow-up patch.

This patch also renames some `Dom*` variables to `Earlier*`, because the
dominance relation is not really used/important here and potentially
confusing.

This patch allows us to aggressively cut down compile time, geomean
-O3 -0.64%, ReleaseThinLTO -1.65%, at the expense of fewer stores
removed. Subsequent patches will increase the number of removed stores
again, while keeping compile-time in check.

http://llvm-compile-time-tracker.com/compare.php?from=d8e3294118a8c5f3f97688a704d5a05b67646012&to=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86486
2020-08-27 10:02:02 +01:00
Florian Hahn b99a5eb659 [DSE,MemorySSA] Delay PointerMayBeCaptured calls until actually needed.
Avoid computing InvisibleToCallerBefore/AfterRet up front. In most
cases, this information is not really needed. Instead, introduce helper
functions to compute and cache the result on demand.

Notably, this also does not use PointerMayBeCapturedBefore for
isInvisibleToCallerBeforeRet, as it requires the killing MemoryDef as
starting instruction, making the caching ineffective. But it appears the
use of PointerMayBeCapturedBefore has very limited benefits in practice
(e.g. on SPEC2000/SPEC2006/MultiSource there are no binary changes with
-O3 -flto). Refrain from using it for now, to limit-compile-time.

This gives some nice compile-time improvements:
http://llvm-compile-time-tracker.com/compare.php?from=db9345f6810f379a36752dc52caf5230585d0ebd&to=b4d091047e1b8a3d377d200137b79d03aca65663&stat=instructions
2020-08-24 14:05:44 +01:00
Florian Hahn a93514abf2 [DSE,MemorySSA] Regnerate some check lines.
The check lines where generated before align was added for all
instructions. Re-generate them, to reduce diff noise for actual
functional changes.
2020-08-24 13:24:44 +01:00
Florian Hahn 2431b143ae [DSE,MemorySSA] Limit elimination at end of function to single UO.
Limit elimination of stores at the end of a function to MemoryDefs with
a single underlying object, to save compile time.

In practice, the case with multiple underlying objects seems not very
important in practice. For -O3 -flto on MultiSource/SPEC2000/SPEC2006
this results in a total of 2 more stores being eliminated.

We can always re-visit that in the future.
2020-08-24 13:00:17 +01:00
Florian Hahn 9f7350672e [DSE,MemorySSA] Handle atomicrmw/cmpxchg conservatively.
This adds conservative handling of AtomicRMW/AtomicCmpXChg to
isDSEBarrier, similar to atomic loads and stores.
2020-08-21 10:42:42 +01:00
Florian Hahn f7e4e87df3 [DSE,MemorySSA] Regenerate check lines for atomic.ll tests. 2020-08-21 10:18:06 +01:00
Florian Hahn 139810449b [DSE,MemorySSA] Account for ScanLimit == 0 on entry.
Currently the code does not account for the fact that getDomMemoryDef
can be called with ScanLimit == 0, if we reached the limit while
processing an earlier access. Also tighten the check a bit more and bump
the scan limit now that it is handled properly.

In some cases, this brings a 2x speedup in terms of compile-time.
2020-08-17 17:55:14 +01:00
Florian Hahn 3b0878a370 [DSE,MSSA] Fix crash when using tryToMergePartialOverlappingStores.
We are re-using tryToMergePartialOverlappingStores, which requires
earlier to domiante Later. In the long run,
tryToMergeParialOverlappingStores should be re-written using MemorySSA.

Fixes PR46513.
2020-08-13 12:07:56 +01:00
Jinsong Ji d28f86723f Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit bf544fa1c3.

Fixed the typo in PPCInstrInfo.cpp.
2020-07-28 14:00:11 +00:00
Jinsong Ji bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce7153.

This is breaking test-suite, revert while investigation.
2020-07-27 21:07:00 +00:00
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
John Brawn 20854d85e1 [DSE,MSSA] Recognise init_trampoline in getLocForWriteEx
This fixes an instance where MemorySSA-using Dead Store Elimination is failing
to do a transformation that the non-MemorySSA-using version does.

Differential Revision: https://reviews.llvm.org/D83783
2020-07-15 12:18:58 +01:00
Florian Hahn 80970ac875 [DSE,MSSA] Eliminate stores by terminators (free,lifetime.end).
This patch adds support for eliminating stores by free & lifetime.end
calls. We can remove stores that are not read before calling a memory
terminator and we can eliminate all stores after a memory terminator
until we see a new lifetime.start. The second case seems to not really
trigger much in practice though.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72410
2020-07-08 08:59:46 +01:00
Nuno Lopes 7f903873b8 DSE: fix builtin function recognition to take decl into account 2020-07-02 10:28:47 +01:00
Arthur Eubanks 339eed5d0b [NewPM][BasicAA] basicaa -> basic-aa in Transforms/DeadStoreElimination
Summary: Following https://reviews.llvm.org/D82607.

Reviewers: ychen

Subscribers: george.burgess.iv, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82689
2020-06-26 20:13:37 -07:00
Florian Hahn 4837daf883 [DSE,MSSA] Check if Def is removable only wen we try to remove it.
Non-removable MemoryDefs can still eliminate other defs. Update the
isRemovable checks to only candidates for removal.
2020-06-25 14:01:10 +01:00
Florian Hahn ab27603c6d [DSE,MSSA] Add missing -enable-dse-memoryssa flag to test. 2020-06-24 14:07:30 +01:00
Florian Hahn 4e62c6359c [DSE] Eliminate stores at the end of the function.
This patch add support for eliminating MemoryDefs that do not have any
aliasing users, which indicates that there are no reads/writes to the
memory location until the end of the function.

To eliminate such defs, we have to ensure that the underlying object is
not visible in the caller and does not escape via returning. We need a
separate check for that, as InvisibleToCaller does not consider returns.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker, george.burgess.iv

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72631
2020-06-24 12:58:20 +01:00
Florian Hahn 7b72cb47e6 [DSE,MSSA] Precommit small test changes for D72631. 2020-06-24 10:17:09 +01:00