2010-10-20 07:09:08 +08:00
|
|
|
//===- BasicAliasAnalysis.cpp - Stateless Alias Analysis Impl -------------===//
|
2005-04-22 05:13:18 +08:00
|
|
|
//
|
2019-01-19 16:50:56 +08:00
|
|
|
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
|
|
|
|
// See https://llvm.org/LICENSE.txt for license information.
|
|
|
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
2005-04-22 05:13:18 +08:00
|
|
|
//
|
2003-10-21 03:43:21 +08:00
|
|
|
//===----------------------------------------------------------------------===//
|
2003-02-27 03:41:54 +08:00
|
|
|
//
|
2010-10-20 07:09:08 +08:00
|
|
|
// This file defines the primary stateless implementation of the
|
|
|
|
// Alias Analysis interface that implements identities (two different
|
|
|
|
// globals cannot alias, etc), but does no stateful analysis.
|
2003-02-27 03:41:54 +08:00
|
|
|
//
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
2015-08-06 15:33:15 +08:00
|
|
|
#include "llvm/Analysis/BasicAliasAnalysis.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/ADT/APInt.h"
|
2020-10-23 03:50:18 +08:00
|
|
|
#include "llvm/ADT/ScopeExit.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/ADT/SmallPtrSet.h"
|
2012-12-04 00:50:05 +08:00
|
|
|
#include "llvm/ADT/SmallVector.h"
|
2015-08-06 07:40:30 +08:00
|
|
|
#include "llvm/ADT/Statistic.h"
|
2012-12-04 00:50:05 +08:00
|
|
|
#include "llvm/Analysis/AliasAnalysis.h"
|
2017-05-14 14:18:34 +08:00
|
|
|
#include "llvm/Analysis/AssumptionCache.h"
|
2014-01-03 13:47:03 +08:00
|
|
|
#include "llvm/Analysis/CFG.h"
|
2014-01-07 19:48:04 +08:00
|
|
|
#include "llvm/Analysis/CaptureTracking.h"
|
2012-12-04 00:50:05 +08:00
|
|
|
#include "llvm/Analysis/InstructionSimplify.h"
|
|
|
|
#include "llvm/Analysis/MemoryBuiltins.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/Analysis/MemoryLocation.h"
|
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
2019-11-14 05:15:01 +08:00
|
|
|
#include "llvm/Analysis/PhiValues.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/Analysis/TargetLibraryInfo.h"
|
2012-12-04 00:50:05 +08:00
|
|
|
#include "llvm/Analysis/ValueTracking.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/IR/Argument.h"
|
|
|
|
#include "llvm/IR/Attributes.h"
|
|
|
|
#include "llvm/IR/Constant.h"
|
2013-01-02 19:36:10 +08:00
|
|
|
#include "llvm/IR/Constants.h"
|
|
|
|
#include "llvm/IR/DataLayout.h"
|
|
|
|
#include "llvm/IR/DerivedTypes.h"
|
2014-01-13 17:26:24 +08:00
|
|
|
#include "llvm/IR/Dominators.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/IR/Function.h"
|
|
|
|
#include "llvm/IR/GetElementPtrTypeIterator.h"
|
2013-01-02 19:36:10 +08:00
|
|
|
#include "llvm/IR/GlobalAlias.h"
|
|
|
|
#include "llvm/IR/GlobalVariable.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/IR/InstrTypes.h"
|
|
|
|
#include "llvm/IR/Instruction.h"
|
2013-01-02 19:36:10 +08:00
|
|
|
#include "llvm/IR/Instructions.h"
|
|
|
|
#include "llvm/IR/IntrinsicInst.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/IR/Intrinsics.h"
|
|
|
|
#include "llvm/IR/Metadata.h"
|
2013-01-02 19:36:10 +08:00
|
|
|
#include "llvm/IR/Operator.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/IR/Type.h"
|
|
|
|
#include "llvm/IR/User.h"
|
|
|
|
#include "llvm/IR/Value.h"
|
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
2019-11-14 05:15:01 +08:00
|
|
|
#include "llvm/InitializePasses.h"
|
2004-03-15 11:36:49 +08:00
|
|
|
#include "llvm/Pass.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include "llvm/Support/Casting.h"
|
|
|
|
#include "llvm/Support/CommandLine.h"
|
|
|
|
#include "llvm/Support/Compiler.h"
|
2017-05-15 14:39:41 +08:00
|
|
|
#include "llvm/Support/KnownBits.h"
|
2017-08-12 05:30:02 +08:00
|
|
|
#include <cassert>
|
|
|
|
#include <cstdint>
|
|
|
|
#include <cstdlib>
|
|
|
|
#include <utility>
|
2016-01-30 10:42:11 +08:00
|
|
|
|
|
|
|
#define DEBUG_TYPE "basicaa"
|
|
|
|
|
2003-11-26 02:33:40 +08:00
|
|
|
using namespace llvm;
|
2003-11-12 06:41:34 +08:00
|
|
|
|
2015-07-16 03:32:22 +08:00
|
|
|
/// Enable analysis of recursive PHI nodes.
|
2020-06-27 11:41:37 +08:00
|
|
|
static cl::opt<bool> EnableRecPhiAnalysis("basic-aa-recphi", cl::Hidden,
|
2020-08-04 17:43:42 +08:00
|
|
|
cl::init(true));
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
|
|
|
|
/// By default, even on 32-bit architectures we use 64-bit integers for
|
|
|
|
/// calculations. This will allow us to more-aggressively decompose indexing
|
|
|
|
/// expressions calculated using i64 values (e.g., long long in C) which is
|
|
|
|
/// common enough to worry about.
|
2020-06-27 11:41:37 +08:00
|
|
|
static cl::opt<bool> ForceAtLeast64Bits("basic-aa-force-at-least-64b",
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
cl::Hidden, cl::init(true));
|
2020-06-27 11:41:37 +08:00
|
|
|
static cl::opt<bool> DoubleCalcBits("basic-aa-double-calc-bits",
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
cl::Hidden, cl::init(false));
|
|
|
|
|
2015-08-06 07:40:30 +08:00
|
|
|
/// SearchLimitReached / SearchTimes shows how often the limit of
|
|
|
|
/// to decompose GEPs is reached. It will affect the precision
|
|
|
|
/// of basic alias analysis.
|
|
|
|
STATISTIC(SearchLimitReached, "Number of times the limit to "
|
|
|
|
"decompose GEPs is reached");
|
|
|
|
STATISTIC(SearchTimes, "Number of times a GEP is decomposed");
|
|
|
|
|
2014-01-02 11:31:36 +08:00
|
|
|
/// Cutoff after which to stop analysing a set of phi nodes potentially involved
|
2016-01-18 07:13:48 +08:00
|
|
|
/// in a cycle. Because we are analysing 'through' phi nodes, we need to be
|
2014-01-03 13:47:03 +08:00
|
|
|
/// careful with value equivalence. We use reachability to make sure a value
|
|
|
|
/// cannot be involved in a cycle.
|
|
|
|
const unsigned MaxNumPhiBBsValueReachabilityCheck = 20;
|
2014-01-02 11:31:36 +08:00
|
|
|
|
2014-03-27 05:30:19 +08:00
|
|
|
// The max limit of the search depth in DecomposeGEPExpression() and
|
2020-07-31 12:07:10 +08:00
|
|
|
// getUnderlyingObject(), both functions need to use the same search
|
2014-03-27 05:30:19 +08:00
|
|
|
// depth otherwise the algorithm in aliasGEP will assert.
|
|
|
|
static const unsigned MaxLookupSearchDepth = 6;
|
|
|
|
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
bool BasicAAResult::invalidate(Function &Fn, const PreservedAnalyses &PA,
|
2016-12-27 18:30:45 +08:00
|
|
|
FunctionAnalysisManager::Invalidator &Inv) {
|
|
|
|
// We don't care if this analysis itself is preserved, it has no state. But
|
|
|
|
// we need to check that the analyses it depends on have been. Note that we
|
|
|
|
// may be created without handles to some analyses and in that case don't
|
|
|
|
// depend on them.
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
if (Inv.invalidate<AssumptionAnalysis>(Fn, PA) ||
|
|
|
|
(DT && Inv.invalidate<DominatorTreeAnalysis>(Fn, PA)) ||
|
2018-07-30 19:52:08 +08:00
|
|
|
(PV && Inv.invalidate<PhiValuesAnalysis>(Fn, PA)))
|
2016-12-27 18:30:45 +08:00
|
|
|
return true;
|
|
|
|
|
|
|
|
// Otherwise this analysis result remains valid.
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2008-06-16 14:30:22 +08:00
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
// Useful predicates
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Returns true if the pointer is one which would have been considered an
|
|
|
|
/// escape by isNonEscapingLocalObject.
|
2010-07-02 04:08:40 +08:00
|
|
|
static bool isEscapeSource(const Value *V) {
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isa<CallBase>(V))
|
2018-05-16 21:16:54 +08:00
|
|
|
return true;
|
|
|
|
|
|
|
|
if (isa<Argument>(V))
|
2010-07-02 04:08:40 +08:00
|
|
|
return true;
|
2010-06-29 08:50:39 +08:00
|
|
|
|
|
|
|
// The load case works because isNonEscapingLocalObject considers all
|
|
|
|
// stores to be escapes (it passes true for the StoreCaptures argument
|
|
|
|
// to PointerMayBeCaptured).
|
|
|
|
if (isa<LoadInst>(V))
|
|
|
|
return true;
|
|
|
|
|
2021-05-07 22:48:18 +08:00
|
|
|
// The inttoptr case works because isNonEscapingLocalObject considers all
|
|
|
|
// means of converting or equating a pointer to an int (ptrtoint, ptr store
|
|
|
|
// which could be followed by an integer load, ptr<->int compare) as
|
|
|
|
// escaping, and objects located at well-known addresses via platform-specific
|
|
|
|
// means cannot be considered non-escaping local objects.
|
|
|
|
if (isa<IntToPtrInst>(V))
|
|
|
|
return true;
|
|
|
|
|
2010-06-29 08:50:39 +08:00
|
|
|
return false;
|
|
|
|
}
|
2008-06-16 14:30:22 +08:00
|
|
|
|
2016-01-18 07:13:48 +08:00
|
|
|
/// Returns the size of the object specified by V or UnknownSize if unknown.
|
2014-02-22 02:34:28 +08:00
|
|
|
static uint64_t getObjectSize(const Value *V, const DataLayout &DL,
|
2012-08-29 23:32:21 +08:00
|
|
|
const TargetLibraryInfo &TLI,
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
bool NullIsValidLoc,
|
2012-02-28 04:46:07 +08:00
|
|
|
bool RoundToAlign = false) {
|
2012-06-21 23:45:28 +08:00
|
|
|
uint64_t Size;
|
2017-03-22 04:08:59 +08:00
|
|
|
ObjectSizeOpts Opts;
|
|
|
|
Opts.RoundToAlign = RoundToAlign;
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
Opts.NullIsUnknownSize = NullIsValidLoc;
|
2017-03-22 04:08:59 +08:00
|
|
|
if (getObjectSize(V, Size, DL, &TLI, Opts))
|
2012-06-21 23:45:28 +08:00
|
|
|
return Size;
|
2015-06-17 15:21:38 +08:00
|
|
|
return MemoryLocation::UnknownSize;
|
2011-01-19 05:16:06 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Returns true if we can prove that the object specified by V is smaller than
|
|
|
|
/// Size.
|
2011-01-19 05:16:06 +08:00
|
|
|
static bool isObjectSmallerThan(const Value *V, uint64_t Size,
|
2014-02-22 02:34:28 +08:00
|
|
|
const DataLayout &DL,
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
const TargetLibraryInfo &TLI,
|
|
|
|
bool NullIsValidLoc) {
|
2013-04-10 02:16:05 +08:00
|
|
|
// Note that the meanings of the "object" are slightly different in the
|
|
|
|
// following contexts:
|
|
|
|
// c1: llvm::getObjectSize()
|
|
|
|
// c2: llvm.objectsize() intrinsic
|
|
|
|
// c3: isObjectSmallerThan()
|
|
|
|
// c1 and c2 share the same meaning; however, the meaning of "object" in c3
|
|
|
|
// refers to the "entire object".
|
|
|
|
//
|
|
|
|
// Consider this example:
|
|
|
|
// char *p = (char*)malloc(100)
|
|
|
|
// char *q = p+80;
|
|
|
|
//
|
|
|
|
// In the context of c1 and c2, the "object" pointed by q refers to the
|
|
|
|
// stretch of memory of q[0:19]. So, getObjectSize(q) should return 20.
|
|
|
|
//
|
|
|
|
// However, in the context of c3, the "object" refers to the chunk of memory
|
|
|
|
// being allocated. So, the "object" has 100 bytes, and q points to the middle
|
|
|
|
// the "object". In case q is passed to isObjectSmallerThan() as the 1st
|
|
|
|
// parameter, before the llvm::getObjectSize() is called to get the size of
|
|
|
|
// entire object, we should:
|
|
|
|
// - either rewind the pointer q to the base-address of the object in
|
|
|
|
// question (in this case rewind to p), or
|
|
|
|
// - just give up. It is up to caller to make sure the pointer is pointing
|
|
|
|
// to the base address the object.
|
2013-08-24 22:16:00 +08:00
|
|
|
//
|
2013-04-10 02:16:05 +08:00
|
|
|
// We go for 2nd option for simplicity.
|
|
|
|
if (!isIdentifiedObject(V))
|
|
|
|
return false;
|
|
|
|
|
2012-02-28 04:46:07 +08:00
|
|
|
// This function needs to use the aligned object size because we allow
|
|
|
|
// reads a bit past the end given sufficient alignment.
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
uint64_t ObjectSize = getObjectSize(V, DL, TLI, NullIsValidLoc,
|
|
|
|
/*RoundToAlign*/ true);
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2015-06-17 15:21:38 +08:00
|
|
|
return ObjectSize != MemoryLocation::UnknownSize && ObjectSize < Size;
|
2011-01-19 05:16:06 +08:00
|
|
|
}
|
|
|
|
|
2019-08-24 01:56:10 +08:00
|
|
|
/// Return the minimal extent from \p V to the end of the underlying object,
|
|
|
|
/// assuming the result is used in an aliasing query. E.g., we do use the query
|
|
|
|
/// location size and the fact that null pointers cannot alias here.
|
|
|
|
static uint64_t getMinimalExtentFrom(const Value &V,
|
|
|
|
const LocationSize &LocSize,
|
|
|
|
const DataLayout &DL,
|
|
|
|
bool NullIsValidLoc) {
|
|
|
|
// If we have dereferenceability information we know a lower bound for the
|
|
|
|
// extent as accesses for a lower offset would be valid. We need to exclude
|
|
|
|
// the "or null" part if null is a valid pointer.
|
2021-03-20 02:15:29 +08:00
|
|
|
bool CanBeNull, CanBeFreed;
|
|
|
|
uint64_t DerefBytes =
|
|
|
|
V.getPointerDereferenceableBytes(DL, CanBeNull, CanBeFreed);
|
2019-08-24 01:56:10 +08:00
|
|
|
DerefBytes = (CanBeNull && NullIsValidLoc) ? 0 : DerefBytes;
|
2021-03-20 02:15:29 +08:00
|
|
|
DerefBytes = CanBeFreed ? 0 : DerefBytes;
|
2019-08-24 01:56:10 +08:00
|
|
|
// If queried with a precise location size, we assume that location size to be
|
|
|
|
// accessed, thus valid.
|
|
|
|
if (LocSize.isPrecise())
|
|
|
|
DerefBytes = std::max(DerefBytes, LocSize.getValue());
|
|
|
|
return DerefBytes;
|
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Returns true if we can prove that the object specified by V has size Size.
|
2015-08-06 15:57:58 +08:00
|
|
|
static bool isObjectSize(const Value *V, uint64_t Size, const DataLayout &DL,
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
const TargetLibraryInfo &TLI, bool NullIsValidLoc) {
|
|
|
|
uint64_t ObjectSize = getObjectSize(V, DL, TLI, NullIsValidLoc);
|
2015-06-17 15:21:38 +08:00
|
|
|
return ObjectSize != MemoryLocation::UnknownSize && ObjectSize == Size;
|
2008-06-16 14:30:22 +08:00
|
|
|
}
|
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
// GetElementPtr Instruction Decomposition and Analysis
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
namespace {
|
|
|
|
/// Represents zext(sext(V)).
|
|
|
|
struct ExtendedValue {
|
|
|
|
const Value *V;
|
|
|
|
unsigned ZExtBits;
|
|
|
|
unsigned SExtBits;
|
|
|
|
|
|
|
|
explicit ExtendedValue(const Value *V, unsigned ZExtBits = 0,
|
|
|
|
unsigned SExtBits = 0)
|
|
|
|
: V(V), ZExtBits(ZExtBits), SExtBits(SExtBits) {}
|
|
|
|
|
|
|
|
unsigned getBitWidth() const {
|
|
|
|
return V->getType()->getPrimitiveSizeInBits() + ZExtBits + SExtBits;
|
2021-03-27 22:15:47 +08:00
|
|
|
}
|
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
ExtendedValue withValue(const Value *NewV) const {
|
|
|
|
return ExtendedValue(NewV, ZExtBits, SExtBits);
|
|
|
|
}
|
|
|
|
|
|
|
|
ExtendedValue withZExtOfValue(const Value *NewV) const {
|
|
|
|
unsigned ExtendBy = V->getType()->getPrimitiveSizeInBits() -
|
|
|
|
NewV->getType()->getPrimitiveSizeInBits();
|
|
|
|
// zext(sext(zext(NewV))) == zext(zext(zext(NewV)))
|
|
|
|
return ExtendedValue(NewV, ZExtBits + SExtBits + ExtendBy, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
ExtendedValue withSExtOfValue(const Value *NewV) const {
|
|
|
|
unsigned ExtendBy = V->getType()->getPrimitiveSizeInBits() -
|
|
|
|
NewV->getType()->getPrimitiveSizeInBits();
|
|
|
|
// zext(sext(sext(NewV)))
|
|
|
|
return ExtendedValue(NewV, ZExtBits, SExtBits + ExtendBy);
|
|
|
|
}
|
|
|
|
|
|
|
|
APInt evaluateWith(APInt N) const {
|
|
|
|
assert(N.getBitWidth() == V->getType()->getPrimitiveSizeInBits() &&
|
|
|
|
"Incompatible bit width");
|
|
|
|
if (SExtBits) N = N.sext(N.getBitWidth() + SExtBits);
|
|
|
|
if (ZExtBits) N = N.zext(N.getBitWidth() + ZExtBits);
|
|
|
|
return N;
|
|
|
|
}
|
|
|
|
|
|
|
|
bool canDistributeOver(bool NUW, bool NSW) const {
|
|
|
|
// zext(x op<nuw> y) == zext(x) op<nuw> zext(y)
|
|
|
|
// sext(x op<nsw> y) == sext(x) op<nsw> sext(y)
|
|
|
|
return (!ZExtBits || NUW) && (!SExtBits || NSW);
|
|
|
|
}
|
|
|
|
};
|
|
|
|
|
|
|
|
/// Represents zext(sext(V)) * Scale + Offset.
|
|
|
|
struct LinearExpression {
|
|
|
|
ExtendedValue Val;
|
|
|
|
APInt Scale;
|
|
|
|
APInt Offset;
|
|
|
|
|
|
|
|
LinearExpression(const ExtendedValue &Val, const APInt &Scale,
|
|
|
|
const APInt &Offset)
|
|
|
|
: Val(Val), Scale(Scale), Offset(Offset) {}
|
|
|
|
|
|
|
|
LinearExpression(const ExtendedValue &Val) : Val(Val) {
|
|
|
|
unsigned BitWidth = Val.getBitWidth();
|
|
|
|
Scale = APInt(BitWidth, 1);
|
|
|
|
Offset = APInt(BitWidth, 0);
|
|
|
|
}
|
|
|
|
};
|
2021-03-27 22:15:47 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Analyzes the specified value as a linear expression: "A*V + B", where A and
|
|
|
|
/// B are constant integers.
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
static LinearExpression GetLinearExpression(
|
|
|
|
const ExtendedValue &Val, const DataLayout &DL, unsigned Depth,
|
|
|
|
AssumptionCache *AC, DominatorTree *DT) {
|
2010-08-19 06:07:29 +08:00
|
|
|
// Limit our recursion depth.
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
if (Depth == 6)
|
|
|
|
return Val;
|
2013-08-24 22:16:00 +08:00
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
if (const ConstantInt *Const = dyn_cast<ConstantInt>(Val.V))
|
|
|
|
return LinearExpression(Val, APInt(Val.getBitWidth(), 0),
|
|
|
|
Val.evaluateWith(Const->getValue()));
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
if (const BinaryOperator *BOp = dyn_cast<BinaryOperator>(Val.V)) {
|
2010-08-19 06:07:29 +08:00
|
|
|
if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
APInt RHS = Val.evaluateWith(RHSC->getValue());
|
|
|
|
// The only non-OBO case we deal with is or, and only limited to the
|
|
|
|
// case where it is both nuw and nsw.
|
|
|
|
bool NUW = true, NSW = true;
|
|
|
|
if (isa<OverflowingBinaryOperator>(BOp)) {
|
|
|
|
NUW &= BOp->hasNoUnsignedWrap();
|
|
|
|
NSW &= BOp->hasNoSignedWrap();
|
|
|
|
}
|
|
|
|
if (!Val.canDistributeOver(NUW, NSW))
|
|
|
|
return Val;
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
switch (BOp->getOpcode()) {
|
2015-08-06 15:57:58 +08:00
|
|
|
default:
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
// We don't understand this instruction, so we can't decompose it any
|
|
|
|
// further.
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
return Val;
|
2010-08-19 06:07:29 +08:00
|
|
|
case Instruction::Or:
|
|
|
|
// X|C == X+C if all the bits in C are unset in X. Otherwise we can't
|
|
|
|
// analyze it.
|
2016-12-19 16:22:17 +08:00
|
|
|
if (!MaskedValueIsZero(BOp->getOperand(0), RHSC->getValue(), DL, 0, AC,
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
BOp, DT))
|
|
|
|
return Val;
|
|
|
|
|
2016-08-18 04:30:52 +08:00
|
|
|
LLVM_FALLTHROUGH;
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
case Instruction::Add: {
|
|
|
|
LinearExpression E = GetLinearExpression(
|
|
|
|
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);
|
|
|
|
E.Offset += RHS;
|
|
|
|
return E;
|
|
|
|
}
|
|
|
|
case Instruction::Sub: {
|
|
|
|
LinearExpression E = GetLinearExpression(
|
|
|
|
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);
|
|
|
|
E.Offset -= RHS;
|
|
|
|
return E;
|
|
|
|
}
|
|
|
|
case Instruction::Mul: {
|
|
|
|
LinearExpression E = GetLinearExpression(
|
|
|
|
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);
|
|
|
|
E.Offset *= RHS;
|
|
|
|
E.Scale *= RHS;
|
|
|
|
return E;
|
|
|
|
}
|
2010-08-19 06:07:29 +08:00
|
|
|
case Instruction::Shl:
|
2018-01-06 00:18:47 +08:00
|
|
|
// We're trying to linearize an expression of the kind:
|
|
|
|
// shl i8 -128, 36
|
|
|
|
// where the shift count exceeds the bitwidth of the type.
|
|
|
|
// We can't decompose this further (the expression would return
|
|
|
|
// a poison value).
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
if (RHS.getLimitedValue() > Val.getBitWidth())
|
|
|
|
return Val;
|
|
|
|
|
|
|
|
LinearExpression E = GetLinearExpression(
|
|
|
|
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);
|
|
|
|
E.Offset <<= RHS.getLimitedValue();
|
|
|
|
E.Scale <<= RHS.getLimitedValue();
|
|
|
|
return E;
|
2010-08-19 06:07:29 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
if (isa<ZExtInst>(Val.V))
|
|
|
|
return GetLinearExpression(
|
|
|
|
Val.withZExtOfValue(cast<CastInst>(Val.V)->getOperand(0)),
|
|
|
|
DL, Depth + 1, AC, DT);
|
|
|
|
|
|
|
|
if (isa<SExtInst>(Val.V))
|
|
|
|
return GetLinearExpression(
|
|
|
|
Val.withSExtOfValue(cast<CastInst>(Val.V)->getOperand(0)),
|
|
|
|
DL, Depth + 1, AC, DT);
|
2013-08-24 22:16:00 +08:00
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
return Val;
|
2010-08-19 06:07:29 +08:00
|
|
|
}
|
|
|
|
|
2016-01-30 10:42:11 +08:00
|
|
|
/// To ensure a pointer offset fits in an integer of size PointerSize
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
/// (in bits) when that size is smaller than the maximum pointer size. This is
|
|
|
|
/// an issue, for example, in particular for 32b pointers with negative indices
|
|
|
|
/// that rely on two's complement wrap-arounds for precise alias information
|
|
|
|
/// where the maximum pointer size is 64b.
|
2020-07-09 18:51:24 +08:00
|
|
|
static APInt adjustToPointerSize(const APInt &Offset, unsigned PointerSize) {
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
assert(PointerSize <= Offset.getBitWidth() && "Invalid PointerSize!");
|
|
|
|
unsigned ShiftBits = Offset.getBitWidth() - PointerSize;
|
|
|
|
return (Offset << ShiftBits).ashr(ShiftBits);
|
|
|
|
}
|
|
|
|
|
|
|
|
static unsigned getMaxPointerSize(const DataLayout &DL) {
|
|
|
|
unsigned MaxPointerSize = DL.getMaxPointerSizeInBits();
|
|
|
|
if (MaxPointerSize < 64 && ForceAtLeast64Bits) MaxPointerSize = 64;
|
|
|
|
if (DoubleCalcBits) MaxPointerSize *= 2;
|
|
|
|
|
|
|
|
return MaxPointerSize;
|
2016-01-30 10:42:11 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// If V is a symbolic pointer expression, decompose it into a base pointer
|
|
|
|
/// with a constant offset and a number of scaled symbolic offsets.
|
2010-08-19 06:07:29 +08:00
|
|
|
///
|
2015-08-06 16:17:06 +08:00
|
|
|
/// The scaled symbolic offsets (represented by pairs of a Value* and a scale
|
|
|
|
/// in the VarIndices vector) are Value*'s that are known to be scaled by the
|
|
|
|
/// specified amount, but which may have other unrepresented high bits. As
|
|
|
|
/// such, the gep cannot necessarily be reconstructed from its decomposed form.
|
2010-08-19 06:07:29 +08:00
|
|
|
///
|
2020-10-18 04:09:32 +08:00
|
|
|
/// This function is capable of analyzing everything that getUnderlyingObject
|
|
|
|
/// can look through. To be able to do that getUnderlyingObject and
|
|
|
|
/// DecomposeGEPExpression must use the same search depth
|
|
|
|
/// (MaxLookupSearchDepth).
|
2020-11-22 03:47:11 +08:00
|
|
|
BasicAAResult::DecomposedGEP
|
|
|
|
BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
|
|
|
|
AssumptionCache *AC, DominatorTree *DT) {
|
2010-08-19 06:07:29 +08:00
|
|
|
// Limit recursion depth to limit compile time in crazy cases.
|
2014-03-27 05:30:19 +08:00
|
|
|
unsigned MaxLookup = MaxLookupSearchDepth;
|
2015-08-06 07:40:30 +08:00
|
|
|
SearchTimes++;
|
2020-12-14 03:27:16 +08:00
|
|
|
const Instruction *CxtI = dyn_cast<Instruction>(V);
|
2013-08-24 22:16:00 +08:00
|
|
|
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
unsigned MaxPointerSize = getMaxPointerSize(DL);
|
2020-11-22 03:47:11 +08:00
|
|
|
DecomposedGEP Decomposed;
|
|
|
|
Decomposed.Offset = APInt(MaxPointerSize, 0);
|
|
|
|
Decomposed.HasCompileTimeConstantScale = true;
|
2010-08-19 06:07:29 +08:00
|
|
|
do {
|
|
|
|
// See if this is a bitcast or GEP.
|
|
|
|
const Operator *Op = dyn_cast<Operator>(V);
|
2014-04-15 12:59:12 +08:00
|
|
|
if (!Op) {
|
2010-08-19 06:07:29 +08:00
|
|
|
// The only non-operator case we can handle are GlobalAliases.
|
|
|
|
if (const GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) {
|
Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.
If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".
Motivation:
I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard. So transforming:
```
void f(unsigned x) {
unsigned t = 5 / x;
(void)t;
}
```
to
```
void f(unsigned x) { }
```
is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).
Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM. For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).
Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have. This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.
For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it; since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store. As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal. The optimizer is allowed to refine
this function by first CSE'ing the two loads, and the folding the
comparision to always report that the two values are equal. Such a
refined variant will look like it is `readonly`. However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.
Note: this is not just a problem with atomics or with linking
differently optimized object files. See PR26774 for more realistic
examples that involved neither.
This patch:
This change introduces a new set of linkage types, predicated as
`GlobalValue::mayBeDerefined` that returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time. It then changes a set of IPO passes to bail out if they see
such a function.
Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18634
llvm-svn: 265762
2016-04-08 08:48:30 +08:00
|
|
|
if (!GA->isInterposable()) {
|
2010-08-19 06:07:29 +08:00
|
|
|
V = GA->getAliasee();
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
}
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
Decomposed.Base = V;
|
2020-11-22 03:47:11 +08:00
|
|
|
return Decomposed;
|
2010-08-19 06:07:29 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2014-07-15 08:56:40 +08:00
|
|
|
if (Op->getOpcode() == Instruction::BitCast ||
|
|
|
|
Op->getOpcode() == Instruction::AddrSpaceCast) {
|
2010-08-19 06:07:29 +08:00
|
|
|
V = Op->getOperand(0);
|
|
|
|
continue;
|
|
|
|
}
|
2010-12-16 04:49:55 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
const GEPOperator *GEPOp = dyn_cast<GEPOperator>(Op);
|
2014-04-15 12:59:12 +08:00
|
|
|
if (!GEPOp) {
|
[ValueTracking, BasicAA] Don't simplify instructions
GetUnderlyingObject() (and by required symmetry
DecomposeGEPExpression()) will call SimplifyInstruction() on the
passed value if other checks fail. This simplification is very
expensive, but has little effect in practice. This patch removes
the SimplifyInstruction call(), and replaces it with a check for
single-argument phis (which can occur in canonical IR in LCSSA
form), which is the only useful simplification case I was able to
identify.
At O3 the geomean CTMark improvement is -1.7%. The largest
improvement is SPASS with ThinLTO at -6%.
In test-suite, I see only two tests with a hash difference and
no code size difference (PAQ8p, Ptrdist), which indicates that
the simplification only ends up being useful very rarely. (I would
have liked to figure out which simplification is responsible here,
but wasn't able to spot it looking at transformation logs.)
The AMDGPU test case that is update was using two selects with
undef condition, in which case GetUnderlyingObject will return
the first select operand as the underlying object. This will of
course not happen with non-undef conditions, so this was not
testing anything realistic. Additionally this illustrates potential
unsoundness: While GetUnderlyingObject will pick the first operand,
the select might be later replaced by the second operand, resulting
in inconsistent assumptions about the undef value.
Differential Revision: https://reviews.llvm.org/D82261
2020-06-20 19:59:24 +08:00
|
|
|
if (const auto *PHI = dyn_cast<PHINode>(V)) {
|
|
|
|
// Look through single-arg phi nodes created by LCSSA.
|
|
|
|
if (PHI->getNumIncomingValues() == 1) {
|
|
|
|
V = PHI->getIncomingValue(0);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
} else if (const auto *Call = dyn_cast<CallBase>(V)) {
|
Implement strip.invariant.group
Summary:
This patch introduce new intrinsic -
strip.invariant.group that was described in the
RFC: Devirtualization v2
Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar
Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47103
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336073
2018-07-02 12:49:30 +08:00
|
|
|
// CaptureTracking can know about special capturing properties of some
|
|
|
|
// intrinsics like launder.invariant.group, that can't be expressed with
|
|
|
|
// the attributes, but have properties like returning aliasing pointer.
|
|
|
|
// Because some analysis may assume that nocaptured pointer is not
|
|
|
|
// returned from some special intrinsic (because function would have to
|
|
|
|
// be marked with returns attribute), it is crucial to use this function
|
|
|
|
// because it should be in sync with CaptureTracking. Not using it may
|
|
|
|
// cause weird miscompilations where 2 aliasing pointers are assumed to
|
|
|
|
// noalias.
|
2019-08-16 02:39:56 +08:00
|
|
|
if (auto *RP = getArgumentAliasingToReturnedPointer(Call, false)) {
|
2018-05-23 17:16:44 +08:00
|
|
|
V = RP;
|
2016-07-11 09:32:20 +08:00
|
|
|
continue;
|
|
|
|
}
|
2018-05-23 17:16:44 +08:00
|
|
|
}
|
2016-07-11 09:32:20 +08:00
|
|
|
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
Decomposed.Base = V;
|
2020-11-22 03:47:11 +08:00
|
|
|
return Decomposed;
|
2011-05-25 02:24:08 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2021-03-04 00:43:32 +08:00
|
|
|
// Track whether we've seen at least one in bounds gep, and if so, whether
|
|
|
|
// all geps parsed were in bounds.
|
|
|
|
if (Decomposed.InBounds == None)
|
|
|
|
Decomposed.InBounds = GEPOp->isInBounds();
|
|
|
|
else if (!GEPOp->isInBounds())
|
|
|
|
Decomposed.InBounds = false;
|
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
// Don't attempt to analyze GEPs over unsized objects.
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
if (!GEPOp->getSourceElementType()->isSized()) {
|
|
|
|
Decomposed.Base = V;
|
2020-11-22 03:47:11 +08:00
|
|
|
return Decomposed;
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2020-04-11 07:24:18 +08:00
|
|
|
// Don't attempt to analyze GEPs if index scale is not a compile-time
|
|
|
|
// constant.
|
2020-04-24 03:19:54 +08:00
|
|
|
if (isa<ScalableVectorType>(GEPOp->getSourceElementType())) {
|
2020-04-11 07:24:18 +08:00
|
|
|
Decomposed.Base = V;
|
|
|
|
Decomposed.HasCompileTimeConstantScale = false;
|
2020-11-22 03:47:11 +08:00
|
|
|
return Decomposed;
|
2020-04-11 07:24:18 +08:00
|
|
|
}
|
|
|
|
|
2013-11-16 08:36:43 +08:00
|
|
|
unsigned AS = GEPOp->getPointerAddressSpace();
|
2010-08-19 06:07:29 +08:00
|
|
|
// Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
|
|
|
|
gep_type_iterator GTI = gep_type_begin(GEPOp);
|
2016-01-30 10:42:11 +08:00
|
|
|
unsigned PointerSize = DL.getPointerSizeInBits(AS);
|
2016-10-22 10:41:39 +08:00
|
|
|
// Assume all GEP operands are constants until proven otherwise.
|
|
|
|
bool GepHasConstantOffset = true;
|
2015-08-06 15:57:58 +08:00
|
|
|
for (User::const_op_iterator I = GEPOp->op_begin() + 1, E = GEPOp->op_end();
|
2016-12-02 10:24:42 +08:00
|
|
|
I != E; ++I, ++GTI) {
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
const Value *Index = *I;
|
2010-08-19 06:07:29 +08:00
|
|
|
// Compute the (potentially symbolic) offset in bytes for this index.
|
2016-12-02 10:24:42 +08:00
|
|
|
if (StructType *STy = GTI.getStructTypeOrNull()) {
|
2010-08-19 06:07:29 +08:00
|
|
|
// For a struct, add the member offset.
|
|
|
|
unsigned FieldNo = cast<ConstantInt>(Index)->getZExtValue();
|
2015-08-06 15:57:58 +08:00
|
|
|
if (FieldNo == 0)
|
|
|
|
continue;
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2020-11-08 01:29:47 +08:00
|
|
|
Decomposed.Offset += DL.getStructLayout(STy)->getElementOffset(FieldNo);
|
2010-08-19 06:07:29 +08:00
|
|
|
continue;
|
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
// For an array/pointer, add the element offset, explicitly scaled.
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
if (const ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
|
2015-08-06 15:57:58 +08:00
|
|
|
if (CIdx->isZero())
|
|
|
|
continue;
|
2020-11-08 01:29:47 +08:00
|
|
|
Decomposed.Offset +=
|
2020-11-22 04:31:06 +08:00
|
|
|
DL.getTypeAllocSize(GTI.getIndexedType()).getFixedSize() *
|
|
|
|
CIdx->getValue().sextOrTrunc(MaxPointerSize);
|
2010-08-19 06:07:29 +08:00
|
|
|
continue;
|
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2016-10-22 10:41:39 +08:00
|
|
|
GepHasConstantOffset = false;
|
|
|
|
|
2020-04-11 07:24:18 +08:00
|
|
|
APInt Scale(MaxPointerSize,
|
|
|
|
DL.getTypeAllocSize(GTI.getIndexedType()).getFixedSize());
|
2021-03-27 22:15:47 +08:00
|
|
|
// If the integer type is smaller than the pointer size, it is implicitly
|
|
|
|
// sign extended to pointer size.
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
unsigned Width = Index->getType()->getIntegerBitWidth();
|
|
|
|
unsigned SExtBits = PointerSize > Width ? PointerSize - Width : 0;
|
|
|
|
LinearExpression LE = GetLinearExpression(
|
|
|
|
ExtendedValue(Index, 0, SExtBits), DL, 0, AC, DT);
|
2021-03-27 22:15:47 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
// The GEP index scale ("Scale") scales C1*V+C2, yielding (C1*V+C2)*Scale.
|
|
|
|
// This gives us an aggregate computation of (C1*Scale)*V + C2*Scale.
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
|
|
|
|
// It can be the case that, even through C1*V+C2 does not overflow for
|
|
|
|
// relevant values of V, (C2*Scale) can overflow. In that case, we cannot
|
|
|
|
// decompose the expression in this way.
|
|
|
|
//
|
|
|
|
// FIXME: C1*Scale and the other operations in the decomposed
|
|
|
|
// (C1*Scale)*V+C2*Scale can also overflow. We should check for this
|
|
|
|
// possibility.
|
2020-11-08 00:48:41 +08:00
|
|
|
bool Overflow;
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
APInt ScaledOffset = LE.Offset.sextOrTrunc(MaxPointerSize)
|
2020-11-08 00:48:41 +08:00
|
|
|
.smul_ov(Scale, Overflow);
|
|
|
|
if (Overflow) {
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
LE = LinearExpression(ExtendedValue(Index, 0, SExtBits));
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
} else {
|
2020-11-08 01:29:47 +08:00
|
|
|
Decomposed.Offset += ScaledOffset;
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
Scale *= LE.Scale.sextOrTrunc(MaxPointerSize);
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2011-04-15 13:18:47 +08:00
|
|
|
// If we already had an occurrence of this index variable, merge this
|
2010-08-19 06:07:29 +08:00
|
|
|
// scale into it. For example, we want to handle:
|
|
|
|
// A[x][x] -> x*16 + x*4 -> x*20
|
|
|
|
// This also ensures that 'x' only appears in the index list once.
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
for (unsigned i = 0, e = Decomposed.VarIndices.size(); i != e; ++i) {
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
if (Decomposed.VarIndices[i].V == LE.Val.V &&
|
|
|
|
Decomposed.VarIndices[i].ZExtBits == LE.Val.ZExtBits &&
|
|
|
|
Decomposed.VarIndices[i].SExtBits == LE.Val.SExtBits) {
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
Scale += Decomposed.VarIndices[i].Scale;
|
|
|
|
Decomposed.VarIndices.erase(Decomposed.VarIndices.begin() + i);
|
2010-08-19 06:07:29 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
// Make sure that we have a scale that makes sense for this target's
|
|
|
|
// pointer size.
|
2016-01-30 10:42:11 +08:00
|
|
|
Scale = adjustToPointerSize(Scale, PointerSize);
|
2013-08-24 22:16:00 +08:00
|
|
|
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
if (!!Scale) {
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
VariableGEPIndex Entry = {LE.Val.V, LE.Val.ZExtBits, LE.Val.SExtBits,
|
|
|
|
Scale, CxtI};
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
Decomposed.VarIndices.push_back(Entry);
|
2010-08-19 06:47:56 +08:00
|
|
|
}
|
2010-08-19 06:07:29 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2016-01-30 13:52:53 +08:00
|
|
|
// Take care of wrap-arounds
|
2020-11-08 01:29:47 +08:00
|
|
|
if (GepHasConstantOffset)
|
|
|
|
Decomposed.Offset = adjustToPointerSize(Decomposed.Offset, PointerSize);
|
2016-01-30 13:52:53 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
// Analyze the base pointer next.
|
|
|
|
V = GEPOp->getOperand(0);
|
|
|
|
} while (--MaxLookup);
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2010-08-19 06:07:29 +08:00
|
|
|
// If the chain of expressions is too deep, just return early.
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
Decomposed.Base = V;
|
2015-08-06 07:40:30 +08:00
|
|
|
SearchLimitReached++;
|
2020-11-22 03:47:11 +08:00
|
|
|
return Decomposed;
|
2010-08-19 06:07:29 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Returns whether the given pointer value points to memory that is local to
|
|
|
|
/// the function, with global constants being considered local to all
|
|
|
|
/// functions.
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
bool BasicAAResult::pointsToConstantMemory(const MemoryLocation &Loc,
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
AAQueryInfo &AAQI, bool OrLocal) {
|
2010-11-09 00:45:26 +08:00
|
|
|
assert(Visited.empty() && "Visited must be cleared after use!");
|
2003-12-12 07:20:16 +08:00
|
|
|
|
2010-11-09 04:26:19 +08:00
|
|
|
unsigned MaxLookup = 8;
|
2010-11-09 00:45:26 +08:00
|
|
|
SmallVector<const Value *, 16> Worklist;
|
|
|
|
Worklist.push_back(Loc.Ptr);
|
|
|
|
do {
|
2020-07-31 17:09:54 +08:00
|
|
|
const Value *V = getUnderlyingObject(Worklist.pop_back_val());
|
2014-11-19 15:49:26 +08:00
|
|
|
if (!Visited.insert(V).second) {
|
2010-11-09 00:45:26 +08:00
|
|
|
Visited.clear();
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
return AAResultBase::pointsToConstantMemory(Loc, AAQI, OrLocal);
|
2010-11-09 00:45:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
// An alloca instruction defines local memory.
|
|
|
|
if (OrLocal && isa<AllocaInst>(V))
|
|
|
|
continue;
|
|
|
|
|
|
|
|
// A global constant counts as local memory for our purposes.
|
|
|
|
if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(V)) {
|
|
|
|
// Note: this doesn't require GV to be "ODR" because it isn't legal for a
|
|
|
|
// global to be marked constant in some modules and non-constant in
|
|
|
|
// others. GV may even be a declaration, not a definition.
|
|
|
|
if (!GV->isConstant()) {
|
|
|
|
Visited.clear();
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
return AAResultBase::pointsToConstantMemory(Loc, AAQI, OrLocal);
|
2010-11-09 00:45:26 +08:00
|
|
|
}
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
// If both select values point to local memory, then so does the select.
|
|
|
|
if (const SelectInst *SI = dyn_cast<SelectInst>(V)) {
|
|
|
|
Worklist.push_back(SI->getTrueValue());
|
|
|
|
Worklist.push_back(SI->getFalseValue());
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
// If all values incoming to a phi node point to local memory, then so does
|
|
|
|
// the phi.
|
|
|
|
if (const PHINode *PN = dyn_cast<PHINode>(V)) {
|
2010-11-09 04:26:19 +08:00
|
|
|
// Don't bother inspecting phi nodes with many operands.
|
|
|
|
if (PN->getNumIncomingValues() > MaxLookup) {
|
|
|
|
Visited.clear();
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
return AAResultBase::pointsToConstantMemory(Loc, AAQI, OrLocal);
|
2010-11-09 04:26:19 +08:00
|
|
|
}
|
2021-01-23 15:25:01 +08:00
|
|
|
append_range(Worklist, PN->incoming_values());
|
2010-11-09 00:45:26 +08:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Otherwise be conservative.
|
|
|
|
Visited.clear();
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
return AAResultBase::pointsToConstantMemory(Loc, AAQI, OrLocal);
|
2010-11-09 04:26:19 +08:00
|
|
|
} while (!Worklist.empty() && --MaxLookup);
|
2010-08-06 09:25:49 +08:00
|
|
|
|
2010-11-09 00:45:26 +08:00
|
|
|
Visited.clear();
|
2010-11-09 04:26:19 +08:00
|
|
|
return Worklist.empty();
|
2004-01-31 06:17:24 +08:00
|
|
|
}
|
2003-12-12 07:20:16 +08:00
|
|
|
|
2021-03-24 04:18:30 +08:00
|
|
|
static bool isIntrinsicCall(const CallBase *Call, Intrinsic::ID IID) {
|
|
|
|
const IntrinsicInst *II = dyn_cast<IntrinsicInst>(Call);
|
|
|
|
return II && II->getIntrinsicID() == IID;
|
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Returns the behavior when calling the given call site.
|
2019-01-07 13:42:51 +08:00
|
|
|
FunctionModRefBehavior BasicAAResult::getModRefBehavior(const CallBase *Call) {
|
|
|
|
if (Call->doesNotAccessMemory())
|
2010-08-06 09:25:49 +08:00
|
|
|
// Can't do better than this.
|
2015-07-23 07:15:57 +08:00
|
|
|
return FMRB_DoesNotAccessMemory;
|
2010-08-06 09:25:49 +08:00
|
|
|
|
2015-07-23 07:15:57 +08:00
|
|
|
FunctionModRefBehavior Min = FMRB_UnknownModRefBehavior;
|
2010-08-06 09:25:49 +08:00
|
|
|
|
|
|
|
// If the callsite knows it only reads memory, don't return worse
|
|
|
|
// than that.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->onlyReadsMemory())
|
2015-07-23 07:15:57 +08:00
|
|
|
Min = FMRB_OnlyReadsMemory;
|
2019-01-07 13:42:51 +08:00
|
|
|
else if (Call->doesNotReadMemory())
|
2020-01-22 08:51:17 +08:00
|
|
|
Min = FMRB_OnlyWritesMemory;
|
2010-08-06 09:25:49 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->onlyAccessesArgMemory())
|
2015-07-23 07:15:57 +08:00
|
|
|
Min = FunctionModRefBehavior(Min & FMRB_OnlyAccessesArgumentPointees);
|
2019-01-07 13:42:51 +08:00
|
|
|
else if (Call->onlyAccessesInaccessibleMemory())
|
2017-11-02 20:18:33 +08:00
|
|
|
Min = FunctionModRefBehavior(Min & FMRB_OnlyAccessesInaccessibleMem);
|
2019-01-07 13:42:51 +08:00
|
|
|
else if (Call->onlyAccessesInaccessibleMemOrArgMem())
|
2017-11-02 20:18:33 +08:00
|
|
|
Min = FunctionModRefBehavior(Min & FMRB_OnlyAccessesInaccessibleOrArgMem);
|
2015-07-11 18:30:36 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
// If the call has operand bundles then aliasing attributes from the function
|
|
|
|
// it calls do not directly apply to the call. This can be made more precise
|
|
|
|
// in the future.
|
|
|
|
if (!Call->hasOperandBundles())
|
|
|
|
if (const Function *F = Call->getCalledFunction())
|
[AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.
Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.
When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.
To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).
However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.
The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.
It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.
However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.
Differential Revision: http://reviews.llvm.org/D17329
llvm-svn: 262490
2016-03-02 23:56:53 +08:00
|
|
|
Min =
|
|
|
|
FunctionModRefBehavior(Min & getBestAAResults().getModRefBehavior(F));
|
|
|
|
|
|
|
|
return Min;
|
2010-08-06 09:25:49 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Returns the behavior when calling the given function. For use when the call
|
|
|
|
/// site is not known.
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
FunctionModRefBehavior BasicAAResult::getModRefBehavior(const Function *F) {
|
2010-11-09 00:08:43 +08:00
|
|
|
// If the function declares it doesn't access memory, we can't do better.
|
2010-08-06 09:25:49 +08:00
|
|
|
if (F->doesNotAccessMemory())
|
2015-07-23 07:15:57 +08:00
|
|
|
return FMRB_DoesNotAccessMemory;
|
2010-11-09 00:08:43 +08:00
|
|
|
|
2015-07-23 07:15:57 +08:00
|
|
|
FunctionModRefBehavior Min = FMRB_UnknownModRefBehavior;
|
2010-11-10 09:02:18 +08:00
|
|
|
|
2010-11-09 00:08:43 +08:00
|
|
|
// If the function declares it only reads memory, go with that.
|
2010-08-06 09:25:49 +08:00
|
|
|
if (F->onlyReadsMemory())
|
2015-07-23 07:15:57 +08:00
|
|
|
Min = FMRB_OnlyReadsMemory;
|
2016-07-04 16:01:29 +08:00
|
|
|
else if (F->doesNotReadMemory())
|
2020-01-22 08:51:17 +08:00
|
|
|
Min = FMRB_OnlyWritesMemory;
|
2010-08-06 09:25:49 +08:00
|
|
|
|
2015-07-11 18:30:36 +08:00
|
|
|
if (F->onlyAccessesArgMemory())
|
2015-07-23 07:15:57 +08:00
|
|
|
Min = FunctionModRefBehavior(Min & FMRB_OnlyAccessesArgumentPointees);
|
2016-11-09 05:07:42 +08:00
|
|
|
else if (F->onlyAccessesInaccessibleMemory())
|
|
|
|
Min = FunctionModRefBehavior(Min & FMRB_OnlyAccessesInaccessibleMem);
|
|
|
|
else if (F->onlyAccessesInaccessibleMemOrArgMem())
|
|
|
|
Min = FunctionModRefBehavior(Min & FMRB_OnlyAccessesInaccessibleOrArgMem);
|
2015-07-11 18:30:36 +08:00
|
|
|
|
[AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.
Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.
When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.
To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).
However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.
The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.
It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.
However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.
Differential Revision: http://reviews.llvm.org/D17329
llvm-svn: 262490
2016-03-02 23:56:53 +08:00
|
|
|
return Min;
|
2010-08-06 09:25:49 +08:00
|
|
|
}
|
2009-02-06 07:36:27 +08:00
|
|
|
|
2016-07-04 16:01:29 +08:00
|
|
|
/// Returns true if this is a writeonly (i.e Mod only) parameter.
|
2019-01-07 13:42:51 +08:00
|
|
|
static bool isWriteOnlyParam(const CallBase *Call, unsigned ArgIdx,
|
2016-01-07 02:10:35 +08:00
|
|
|
const TargetLibraryInfo &TLI) {
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->paramHasAttr(ArgIdx, Attribute::WriteOnly))
|
2016-07-04 16:01:29 +08:00
|
|
|
return true;
|
Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.
Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.
Original commit message:
BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.
Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.
This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.
Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).
llvm-svn: 213219
2014-07-17 09:28:25 +08:00
|
|
|
|
|
|
|
// We can bound the aliasing properties of memset_pattern16 just as we can
|
|
|
|
// for memcpy/memset. This is particularly important because the
|
|
|
|
// LoopIdiomRecognizer likes to turn loops into calls to memset_pattern16
|
2016-07-04 16:01:29 +08:00
|
|
|
// whenever possible.
|
|
|
|
// FIXME Consider handling this in InferFunctionAttr.cpp together with other
|
|
|
|
// attributes.
|
[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
There are additional changes required in clang.
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
2017-01-24 07:16:46 +08:00
|
|
|
LibFunc F;
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->getCalledFunction() &&
|
|
|
|
TLI.getLibFunc(*Call->getCalledFunction(), F) &&
|
[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
There are additional changes required in clang.
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
2017-01-24 07:16:46 +08:00
|
|
|
F == LibFunc_memset_pattern16 && TLI.has(F))
|
2016-01-06 12:53:16 +08:00
|
|
|
if (ArgIdx == 0)
|
2016-01-07 02:10:35 +08:00
|
|
|
return true;
|
|
|
|
|
|
|
|
// TODO: memset_pattern4, memset_pattern8
|
|
|
|
// TODO: _chk variants
|
|
|
|
// TODO: strcmp, strcpy
|
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
ModRefInfo BasicAAResult::getArgModRefInfo(const CallBase *Call,
|
2016-01-07 02:10:35 +08:00
|
|
|
unsigned ArgIdx) {
|
2016-07-04 16:01:29 +08:00
|
|
|
// Checking for known builtin intrinsics and target library functions.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isWriteOnlyParam(Call, ArgIdx, TLI))
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::Mod;
|
Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.
Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.
Original commit message:
BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.
Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.
This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.
Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).
llvm-svn: 213219
2014-07-17 09:28:25 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->paramHasAttr(ArgIdx, Attribute::ReadOnly))
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::Ref;
|
2015-10-29 00:42:00 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->paramHasAttr(ArgIdx, Attribute::ReadNone))
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::NoModRef;
|
2015-10-29 01:54:48 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
return AAResultBase::getArgModRefInfo(Call, ArgIdx);
|
Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.
Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.
Original commit message:
BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.
Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.
This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.
Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).
llvm-svn: 213219
2014-07-17 09:28:25 +08:00
|
|
|
}
|
|
|
|
|
2015-09-24 13:29:31 +08:00
|
|
|
#ifndef NDEBUG
|
2015-09-24 12:59:24 +08:00
|
|
|
static const Function *getParent(const Value *V) {
|
2017-05-20 03:01:21 +08:00
|
|
|
if (const Instruction *inst = dyn_cast<Instruction>(V)) {
|
|
|
|
if (!inst->getParent())
|
|
|
|
return nullptr;
|
2015-09-24 12:59:24 +08:00
|
|
|
return inst->getParent()->getParent();
|
2017-05-20 03:01:21 +08:00
|
|
|
}
|
2015-09-24 12:59:24 +08:00
|
|
|
|
|
|
|
if (const Argument *arg = dyn_cast<Argument>(V))
|
|
|
|
return arg->getParent();
|
|
|
|
|
|
|
|
return nullptr;
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool notDifferentParent(const Value *O1, const Value *O2) {
|
|
|
|
|
|
|
|
const Function *F1 = getParent(O1);
|
|
|
|
const Function *F2 = getParent(O2);
|
|
|
|
|
|
|
|
return !F1 || !F2 || F1 == F2;
|
|
|
|
}
|
2015-09-24 13:29:31 +08:00
|
|
|
#endif
|
2015-09-24 12:59:24 +08:00
|
|
|
|
|
|
|
AliasResult BasicAAResult::alias(const MemoryLocation &LocA,
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
const MemoryLocation &LocB,
|
|
|
|
AAQueryInfo &AAQI) {
|
2015-09-24 12:59:24 +08:00
|
|
|
assert(notDifferentParent(LocA.Ptr, LocB.Ptr) &&
|
|
|
|
"BasicAliasAnalysis doesn't support interprocedural queries.");
|
2020-10-24 16:39:24 +08:00
|
|
|
return aliasCheck(LocA.Ptr, LocA.Size, LocB.Ptr, LocB.Size, AAQI);
|
2015-09-24 12:59:24 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Checks to see if the specified callsite can clobber the specified memory
|
|
|
|
/// object.
|
|
|
|
///
|
|
|
|
/// Since we only look at local properties of this function, we really can't
|
|
|
|
/// say much about this query. We do, however, use simple "address taken"
|
|
|
|
/// analysis on local objects.
|
2019-01-07 13:42:51 +08:00
|
|
|
ModRefInfo BasicAAResult::getModRefInfo(const CallBase *Call,
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
const MemoryLocation &Loc,
|
|
|
|
AAQueryInfo &AAQI) {
|
2019-01-07 13:42:51 +08:00
|
|
|
assert(notDifferentParent(Call, Loc.Ptr) &&
|
2010-07-07 22:27:09 +08:00
|
|
|
"AliasAnalysis query involving multiple functions!");
|
|
|
|
|
2020-07-31 17:09:54 +08:00
|
|
|
const Value *Object = getUnderlyingObject(Loc.Ptr);
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2018-08-14 09:24:35 +08:00
|
|
|
// Calls marked 'tail' cannot read or write allocas from the current frame
|
|
|
|
// because the current frame might be destroyed by the time they run. However,
|
|
|
|
// a tail call may use an alloca with byval. Calling with byval copies the
|
|
|
|
// contents of the alloca into argument registers or stack slots, so there is
|
|
|
|
// no lifetime issue.
|
2009-11-23 00:05:05 +08:00
|
|
|
if (isa<AllocaInst>(Object))
|
2019-01-07 13:42:51 +08:00
|
|
|
if (const CallInst *CI = dyn_cast<CallInst>(Call))
|
2018-08-14 09:24:35 +08:00
|
|
|
if (CI->isTailCall() &&
|
|
|
|
!CI->getAttributes().hasAttrSomewhere(Attribute::ByVal))
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::NoModRef;
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2018-12-22 03:59:03 +08:00
|
|
|
// Stack restore is able to modify unescaped dynamic allocas. Assume it may
|
|
|
|
// modify them even though the alloca is not escaped.
|
|
|
|
if (auto *AI = dyn_cast<AllocaInst>(Object))
|
2019-01-07 13:42:51 +08:00
|
|
|
if (!AI->isStaticAlloca() && isIntrinsicCall(Call, Intrinsic::stackrestore))
|
2018-12-22 03:59:03 +08:00
|
|
|
return ModRefInfo::Mod;
|
|
|
|
|
2009-11-23 00:05:05 +08:00
|
|
|
// If the pointer is to a locally allocated object that does not escape,
|
2009-11-24 00:44:43 +08:00
|
|
|
// then the call can not mod/ref the pointer unless the call takes the pointer
|
|
|
|
// as an argument, and itself doesn't capture it.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (!isa<Constant>(Object) && Call != Object &&
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
isNonEscapingLocalObject(Object, &AAQI.IsCapturedCache)) {
|
2017-03-01 21:19:51 +08:00
|
|
|
|
|
|
|
// Optimistically assume that call doesn't touch Object and check this
|
|
|
|
// assumption in the following loop.
|
2017-12-08 06:41:34 +08:00
|
|
|
ModRefInfo Result = ModRefInfo::NoModRef;
|
[ModRefInfo] Add must alias info to ModRefInfo.
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.
Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
Reviewers: dberlin, hfinkel, george.burgess.iv
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38862
llvm-svn: 321309
2017-12-22 05:41:53 +08:00
|
|
|
bool IsMustAlias = true;
|
2017-03-01 21:19:51 +08:00
|
|
|
|
2016-01-16 20:15:53 +08:00
|
|
|
unsigned OperandNo = 0;
|
2019-01-07 13:42:51 +08:00
|
|
|
for (auto CI = Call->data_operands_begin(), CE = Call->data_operands_end();
|
2016-01-16 20:15:53 +08:00
|
|
|
CI != CE; ++CI, ++OperandNo) {
|
2011-05-23 13:15:43 +08:00
|
|
|
// Only look at the no-capture or byval pointer arguments. If this
|
|
|
|
// pointer were passed to arguments that were neither of these, then it
|
|
|
|
// couldn't be no-capture.
|
2010-02-16 19:11:14 +08:00
|
|
|
if (!(*CI)->getType()->isPointerTy() ||
|
2019-01-07 13:42:51 +08:00
|
|
|
(!Call->doesNotCapture(OperandNo) &&
|
|
|
|
OperandNo < Call->getNumArgOperands() &&
|
|
|
|
!Call->isByValArgument(OperandNo)))
|
2009-11-24 00:44:43 +08:00
|
|
|
continue;
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2017-03-01 21:19:51 +08:00
|
|
|
// Call doesn't access memory through this operand, so we don't care
|
|
|
|
// if it aliases with Object.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->doesNotAccessMemory(OperandNo))
|
2017-03-01 21:19:51 +08:00
|
|
|
continue;
|
|
|
|
|
2010-09-15 05:25:10 +08:00
|
|
|
// If this is a no-capture pointer argument, see if we can tell that it
|
2017-03-01 21:19:51 +08:00
|
|
|
// is impossible to alias the pointer we're checking.
|
2020-11-20 04:41:51 +08:00
|
|
|
AliasResult AR = getBestAAResults().alias(
|
2020-11-18 03:11:09 +08:00
|
|
|
MemoryLocation::getBeforeOrAfter(*CI),
|
|
|
|
MemoryLocation::getBeforeOrAfter(Object), AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (AR != AliasResult::MustAlias)
|
[ModRefInfo] Add must alias info to ModRefInfo.
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.
Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
Reviewers: dberlin, hfinkel, george.burgess.iv
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38862
llvm-svn: 321309
2017-12-22 05:41:53 +08:00
|
|
|
IsMustAlias = false;
|
2019-02-05 16:30:48 +08:00
|
|
|
// Operand doesn't alias 'Object', continue looking for other aliases
|
2021-03-05 18:58:13 +08:00
|
|
|
if (AR == AliasResult::NoAlias)
|
2017-03-01 21:19:51 +08:00
|
|
|
continue;
|
|
|
|
// Operand aliases 'Object', but call doesn't modify it. Strengthen
|
|
|
|
// initial assumption and keep looking in case if there are more aliases.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->onlyReadsMemory(OperandNo)) {
|
2017-12-06 04:12:23 +08:00
|
|
|
Result = setRef(Result);
|
2017-03-01 21:19:51 +08:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
// Operand aliases 'Object' but call only writes into it.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (Call->doesNotReadMemory(OperandNo)) {
|
2017-12-06 04:12:23 +08:00
|
|
|
Result = setMod(Result);
|
2017-03-01 21:19:51 +08:00
|
|
|
continue;
|
2009-11-24 00:44:43 +08:00
|
|
|
}
|
2017-03-01 21:19:51 +08:00
|
|
|
// This operand aliases 'Object' and call reads and writes into it.
|
[ModRefInfo] Add must alias info to ModRefInfo.
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.
Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
Reviewers: dberlin, hfinkel, george.burgess.iv
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38862
llvm-svn: 321309
2017-12-22 05:41:53 +08:00
|
|
|
// Setting ModRef will not yield an early return below, MustAlias is not
|
|
|
|
// used further.
|
2017-12-08 06:41:34 +08:00
|
|
|
Result = ModRefInfo::ModRef;
|
2017-03-01 21:19:51 +08:00
|
|
|
break;
|
2009-11-24 00:44:43 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
[ModRefInfo] Add must alias info to ModRefInfo.
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.
Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
Reviewers: dberlin, hfinkel, george.burgess.iv
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38862
llvm-svn: 321309
2017-12-22 05:41:53 +08:00
|
|
|
// No operand aliases, reset Must bit. Add below if at least one aliases
|
|
|
|
// and all aliases found are MustAlias.
|
|
|
|
if (isNoModRef(Result))
|
|
|
|
IsMustAlias = false;
|
|
|
|
|
2017-03-01 21:19:51 +08:00
|
|
|
// Early return if we improved mod ref information
|
2018-01-19 18:26:40 +08:00
|
|
|
if (!isModAndRefSet(Result)) {
|
|
|
|
if (isNoModRef(Result))
|
|
|
|
return ModRefInfo::NoModRef;
|
[ModRefInfo] Add must alias info to ModRefInfo.
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.
Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
Reviewers: dberlin, hfinkel, george.burgess.iv
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38862
llvm-svn: 321309
2017-12-22 05:41:53 +08:00
|
|
|
return IsMustAlias ? setMust(Result) : clearMust(Result);
|
2018-01-19 18:26:40 +08:00
|
|
|
}
|
2009-11-23 00:05:05 +08:00
|
|
|
}
|
|
|
|
|
2020-03-28 13:59:52 +08:00
|
|
|
// If the call is malloc/calloc like, we can assume that it doesn't
|
2016-03-10 07:19:56 +08:00
|
|
|
// modify any IR visible value. This is only valid because we assume these
|
|
|
|
// routines do not read values visible in the IR. TODO: Consider special
|
|
|
|
// casing realloc and strdup routines which access only their arguments as
|
|
|
|
// well. Or alternatively, replace all of this with inaccessiblememonly once
|
2017-12-06 04:12:23 +08:00
|
|
|
// that's implemented fully.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isMallocOrCallocLikeFn(Call, &TLI)) {
|
2016-03-10 07:19:56 +08:00
|
|
|
// Be conservative if the accessed pointer may alias the allocation -
|
|
|
|
// fallback to the generic handling below.
|
2021-03-05 18:58:13 +08:00
|
|
|
if (getBestAAResults().alias(MemoryLocation::getBeforeOrAfter(Call), Loc,
|
|
|
|
AAQI) == AliasResult::NoAlias)
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::NoModRef;
|
2016-03-10 07:19:56 +08:00
|
|
|
}
|
|
|
|
|
2020-09-05 03:43:28 +08:00
|
|
|
// The semantics of memcpy intrinsics either exactly overlap or do not
|
|
|
|
// overlap, i.e., source and destination of any given memcpy are either
|
|
|
|
// no-alias or must-alias.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (auto *Inst = dyn_cast<AnyMemCpyInst>(Call)) {
|
2020-09-05 03:43:28 +08:00
|
|
|
AliasResult SrcAA =
|
|
|
|
getBestAAResults().alias(MemoryLocation::getForSource(Inst), Loc, AAQI);
|
|
|
|
AliasResult DestAA =
|
|
|
|
getBestAAResults().alias(MemoryLocation::getForDest(Inst), Loc, AAQI);
|
2016-12-26 06:42:27 +08:00
|
|
|
// It's also possible for Loc to alias both src and dest, or neither.
|
2017-12-08 06:41:34 +08:00
|
|
|
ModRefInfo rv = ModRefInfo::NoModRef;
|
2021-03-05 18:58:13 +08:00
|
|
|
if (SrcAA != AliasResult::NoAlias)
|
2017-12-06 04:12:23 +08:00
|
|
|
rv = setRef(rv);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (DestAA != AliasResult::NoAlias)
|
2017-12-06 04:12:23 +08:00
|
|
|
rv = setMod(rv);
|
2016-12-26 06:42:27 +08:00
|
|
|
return rv;
|
|
|
|
}
|
|
|
|
|
2021-03-21 01:11:17 +08:00
|
|
|
// Guard intrinsics are marked as arbitrarily writing so that proper control
|
|
|
|
// dependencies are maintained but they never mods any particular memory
|
|
|
|
// location.
|
2016-05-10 10:35:41 +08:00
|
|
|
//
|
|
|
|
// *Unlike* assumes, guard intrinsics are modeled as reading memory since the
|
|
|
|
// heap state at the point the guard is issued needs to be consistent in case
|
|
|
|
// the guard invokes the "deopt" continuation.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isIntrinsicCall(Call, Intrinsic::experimental_guard))
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::Ref;
|
2020-11-18 04:16:24 +08:00
|
|
|
// The same applies to deoptimize which is essentially a guard(false).
|
|
|
|
if (isIntrinsicCall(Call, Intrinsic::experimental_deoptimize))
|
|
|
|
return ModRefInfo::Ref;
|
2016-05-10 10:35:41 +08:00
|
|
|
|
2016-08-10 01:18:05 +08:00
|
|
|
// Like assumes, invariant.start intrinsics were also marked as arbitrarily
|
|
|
|
// writing so that proper control dependencies are maintained but they never
|
|
|
|
// mod any particular memory location visible to the IR.
|
|
|
|
// *Unlike* assumes (which are now modeled as NoModRef), invariant.start
|
|
|
|
// intrinsic is now modeled as reading memory. This prevents hoisting the
|
|
|
|
// invariant.start intrinsic over stores. Consider:
|
|
|
|
// *ptr = 40;
|
|
|
|
// *ptr = 50;
|
|
|
|
// invariant_start(ptr)
|
|
|
|
// int val = *ptr;
|
|
|
|
// print(val);
|
|
|
|
//
|
|
|
|
// This cannot be transformed to:
|
|
|
|
//
|
|
|
|
// *ptr = 40;
|
|
|
|
// invariant_start(ptr)
|
|
|
|
// *ptr = 50;
|
|
|
|
// int val = *ptr;
|
|
|
|
// print(val);
|
|
|
|
//
|
|
|
|
// The transformation will cause the second store to be ignored (based on
|
|
|
|
// rules of invariant.start) and print 40, while the first program always
|
|
|
|
// prints 50.
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isIntrinsicCall(Call, Intrinsic::invariant_start))
|
2017-12-08 06:41:34 +08:00
|
|
|
return ModRefInfo::Ref;
|
2016-08-10 01:18:05 +08:00
|
|
|
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
// The AAResultBase base class has some smarts, lets use them.
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
return AAResultBase::getModRefInfo(Call, Loc, AAQI);
|
2010-09-08 09:32:20 +08:00
|
|
|
}
|
2008-06-16 14:10:11 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
ModRefInfo BasicAAResult::getModRefInfo(const CallBase *Call1,
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
const CallBase *Call2,
|
|
|
|
AAQueryInfo &AAQI) {
|
2021-03-21 01:11:17 +08:00
|
|
|
// Guard intrinsics are marked as arbitrarily writing so that proper control
|
|
|
|
// dependencies are maintained but they never mods any particular memory
|
|
|
|
// location.
|
2016-05-10 10:35:41 +08:00
|
|
|
//
|
|
|
|
// *Unlike* assumes, guard intrinsics are modeled as reading memory since the
|
|
|
|
// heap state at the point the guard is issued needs to be consistent in case
|
|
|
|
// the guard invokes the "deopt" continuation.
|
|
|
|
|
2019-02-05 16:30:48 +08:00
|
|
|
// NB! This function is *not* commutative, so we special case two
|
2016-05-10 10:35:41 +08:00
|
|
|
// possibilities for guard intrinsics.
|
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isIntrinsicCall(Call1, Intrinsic::experimental_guard))
|
|
|
|
return isModSet(createModRefInfo(getModRefBehavior(Call2)))
|
2017-12-08 06:41:34 +08:00
|
|
|
? ModRefInfo::Ref
|
|
|
|
: ModRefInfo::NoModRef;
|
2016-05-10 10:35:41 +08:00
|
|
|
|
2019-01-07 13:42:51 +08:00
|
|
|
if (isIntrinsicCall(Call2, Intrinsic::experimental_guard))
|
|
|
|
return isModSet(createModRefInfo(getModRefBehavior(Call1)))
|
2017-12-08 06:41:34 +08:00
|
|
|
? ModRefInfo::Mod
|
|
|
|
: ModRefInfo::NoModRef;
|
2016-05-10 10:35:41 +08:00
|
|
|
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
// The AAResultBase base class has some smarts, lets use them.
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
return AAResultBase::getModRefInfo(Call1, Call2, AAQI);
|
2014-07-26 05:13:35 +08:00
|
|
|
}
|
|
|
|
|
2021-03-04 00:58:53 +08:00
|
|
|
/// Return true if we know V to the base address of the corresponding memory
|
|
|
|
/// object. This implies that any address less than V must be out of bounds
|
|
|
|
/// for the underlying object. Note that just being isIdentifiedObject() is
|
|
|
|
/// not enough - For example, a negative offset from a noalias argument or call
|
|
|
|
/// can be inbounds w.r.t the actual underlying object.
|
|
|
|
static bool isBaseOfObject(const Value *V) {
|
|
|
|
// TODO: We can handle other cases here
|
|
|
|
// 1) For GC languages, arguments to functions are often required to be
|
|
|
|
// base pointers.
|
|
|
|
// 2) Result of allocation routines are often base pointers. Leverage TLI.
|
|
|
|
return (isa<AllocaInst>(V) || isa<GlobalVariable>(V));
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Provides a bunch of ad-hoc rules to disambiguate a GEP instruction against
|
|
|
|
/// another pointer.
|
2009-11-26 10:11:08 +08:00
|
|
|
///
|
2015-08-06 16:17:06 +08:00
|
|
|
/// We know that V1 is a GEP, but we don't know anything about V2.
|
2020-07-31 17:09:54 +08:00
|
|
|
/// UnderlyingV1 is getUnderlyingObject(GEP1), UnderlyingV2 is the same for
|
2015-08-06 16:17:06 +08:00
|
|
|
/// V2.
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
AliasResult BasicAAResult::aliasGEP(
|
2020-10-24 16:39:24 +08:00
|
|
|
const GEPOperator *GEP1, LocationSize V1Size,
|
|
|
|
const Value *V2, LocationSize V2Size,
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
const Value *UnderlyingV1, const Value *UnderlyingV2, AAQueryInfo &AAQI) {
|
2021-03-28 19:06:52 +08:00
|
|
|
if (!V1Size.hasValue() && !V2Size.hasValue()) {
|
|
|
|
// TODO: This limitation exists for compile-time reasons. Relax it if we
|
|
|
|
// can avoid exponential pathological cases.
|
|
|
|
if (!isa<GEPOperator>(V2))
|
2021-03-16 21:36:17 +08:00
|
|
|
return AliasResult::MayAlias;
|
2021-03-28 19:06:52 +08:00
|
|
|
|
|
|
|
// If both accesses have unknown size, we can only check whether the base
|
|
|
|
// objects don't alias.
|
|
|
|
AliasResult BaseAlias = getBestAAResults().alias(
|
|
|
|
MemoryLocation::getBeforeOrAfter(UnderlyingV1),
|
|
|
|
MemoryLocation::getBeforeOrAfter(UnderlyingV2), AAQI);
|
2021-03-16 21:36:17 +08:00
|
|
|
return BaseAlias == AliasResult::NoAlias ? AliasResult::NoAlias
|
|
|
|
: AliasResult::MayAlias;
|
2021-03-28 19:06:52 +08:00
|
|
|
}
|
|
|
|
|
2020-11-22 03:47:11 +08:00
|
|
|
DecomposedGEP DecompGEP1 = DecomposeGEPExpression(GEP1, DL, &AC, DT);
|
|
|
|
DecomposedGEP DecompGEP2 = DecomposeGEPExpression(V2, DL, &AC, DT);
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
|
2020-04-11 07:24:18 +08:00
|
|
|
// Don't attempt to analyze the decomposed GEP if index scale is not a
|
|
|
|
// compile-time constant.
|
|
|
|
if (!DecompGEP1.HasCompileTimeConstantScale ||
|
|
|
|
!DecompGEP2.HasCompileTimeConstantScale)
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2020-04-11 07:24:18 +08:00
|
|
|
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
assert(DecompGEP1.Base == UnderlyingV1 && DecompGEP2.Base == UnderlyingV2 &&
|
|
|
|
"DecomposeGEPExpression returned a result different from "
|
2020-07-31 12:07:10 +08:00
|
|
|
"getUnderlyingObject");
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
|
2021-03-04 00:58:53 +08:00
|
|
|
// Subtract the GEP2 pointer from the GEP1 pointer to find out their
|
|
|
|
// symbolic difference.
|
|
|
|
DecompGEP1.Offset -= DecompGEP2.Offset;
|
|
|
|
GetIndexDifference(DecompGEP1.VarIndices, DecompGEP2.VarIndices);
|
|
|
|
|
|
|
|
// If an inbounds GEP would have to start from an out of bounds address
|
|
|
|
// for the two to alias, then we can assume noalias.
|
|
|
|
if (*DecompGEP1.InBounds && DecompGEP1.VarIndices.empty() &&
|
|
|
|
V2Size.hasValue() && DecompGEP1.Offset.sge(V2Size.getValue()) &&
|
|
|
|
isBaseOfObject(DecompGEP2.Base))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2021-02-15 01:51:45 +08:00
|
|
|
|
2021-03-04 01:16:56 +08:00
|
|
|
if (isa<GEPOperator>(V2)) {
|
2021-03-04 00:58:53 +08:00
|
|
|
// Symmetric case to above.
|
|
|
|
if (*DecompGEP2.InBounds && DecompGEP1.VarIndices.empty() &&
|
|
|
|
V1Size.hasValue() && DecompGEP1.Offset.sle(-V1Size.getValue()) &&
|
|
|
|
isBaseOfObject(DecompGEP1.Base))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2009-11-26 10:14:59 +08:00
|
|
|
}
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2021-02-15 01:51:45 +08:00
|
|
|
// For GEPs with identical offsets, we can preserve the size and AAInfo
|
|
|
|
// when performing the alias check on the underlying objects.
|
2020-11-22 03:29:28 +08:00
|
|
|
if (DecompGEP1.Offset == 0 && DecompGEP1.VarIndices.empty())
|
2021-02-15 01:51:45 +08:00
|
|
|
return getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(UnderlyingV1, V1Size),
|
|
|
|
MemoryLocation(UnderlyingV2, V2Size), AAQI);
|
2021-02-15 01:51:45 +08:00
|
|
|
|
|
|
|
// Do the base pointers alias?
|
|
|
|
AliasResult BaseAlias = getBestAAResults().alias(
|
|
|
|
MemoryLocation::getBeforeOrAfter(UnderlyingV1),
|
|
|
|
MemoryLocation::getBeforeOrAfter(UnderlyingV2), AAQI);
|
|
|
|
|
|
|
|
// If we get a No or May, then return it immediately, no amount of analysis
|
|
|
|
// will improve this situation.
|
2021-03-05 18:58:13 +08:00
|
|
|
if (BaseAlias != AliasResult::MustAlias) {
|
|
|
|
assert(BaseAlias == AliasResult::NoAlias ||
|
|
|
|
BaseAlias == AliasResult::MayAlias);
|
2021-02-15 01:51:45 +08:00
|
|
|
return BaseAlias;
|
|
|
|
}
|
2009-10-14 14:41:49 +08:00
|
|
|
|
2011-09-08 10:23:31 +08:00
|
|
|
// If there is a constant difference between the pointers, but the difference
|
|
|
|
// is less than the size of the associated memory object, then we know
|
|
|
|
// that the objects are partially overlapping. If the difference is
|
|
|
|
// greater, we know they do not overlap.
|
2020-11-22 03:29:28 +08:00
|
|
|
if (DecompGEP1.Offset != 0 && DecompGEP1.VarIndices.empty()) {
|
2020-12-21 20:33:43 +08:00
|
|
|
APInt &Off = DecompGEP1.Offset;
|
|
|
|
|
|
|
|
// Initialize for Off >= 0 (V2 <= GEP1) case.
|
|
|
|
const Value *LeftPtr = V2;
|
|
|
|
const Value *RightPtr = GEP1;
|
|
|
|
LocationSize VLeftSize = V2Size;
|
|
|
|
LocationSize VRightSize = V1Size;
|
2021-03-16 21:36:17 +08:00
|
|
|
const bool Swapped = Off.isNegative();
|
2020-12-21 20:33:43 +08:00
|
|
|
|
2021-03-16 21:36:17 +08:00
|
|
|
if (Swapped) {
|
2020-12-21 20:33:43 +08:00
|
|
|
// Swap if we have the situation where:
|
2014-01-16 12:53:18 +08:00
|
|
|
// + +
|
|
|
|
// | BaseOffset |
|
|
|
|
// ---------------->|
|
|
|
|
// |-->V1Size |-------> V2Size
|
|
|
|
// GEP1 V2
|
2020-12-21 20:33:43 +08:00
|
|
|
std::swap(LeftPtr, RightPtr);
|
|
|
|
std::swap(VLeftSize, VRightSize);
|
|
|
|
Off = -Off;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (VLeftSize.hasValue()) {
|
|
|
|
const uint64_t LSize = VLeftSize.getValue();
|
|
|
|
if (Off.ult(LSize)) {
|
|
|
|
// Conservatively drop processing if a phi was visited and/or offset is
|
|
|
|
// too big.
|
2021-03-16 21:36:17 +08:00
|
|
|
AliasResult AR = AliasResult::PartialAlias;
|
2021-04-14 01:00:12 +08:00
|
|
|
if (VRightSize.hasValue() && Off.ule(INT32_MAX) &&
|
|
|
|
(Off + VRightSize.getValue()).ule(LSize)) {
|
2020-12-21 20:33:43 +08:00
|
|
|
// Memory referenced by right pointer is nested. Save the offset in
|
2021-03-16 21:36:17 +08:00
|
|
|
// cache. Note that originally offset estimated as GEP1-V2, but
|
|
|
|
// AliasResult contains the shift that represents GEP1+Offset=V2.
|
|
|
|
AR.setOffset(-Off.getSExtValue());
|
|
|
|
AR.swap(Swapped);
|
2020-12-21 20:33:43 +08:00
|
|
|
}
|
2021-03-16 21:36:17 +08:00
|
|
|
return AR;
|
2011-09-08 10:23:31 +08:00
|
|
|
}
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2011-09-08 10:23:31 +08:00
|
|
|
}
|
2010-12-14 06:50:24 +08:00
|
|
|
}
|
|
|
|
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
if (!DecompGEP1.VarIndices.empty()) {
|
2020-11-08 16:48:38 +08:00
|
|
|
APInt GCD;
|
2020-11-22 03:29:28 +08:00
|
|
|
bool AllNonNegative = DecompGEP1.Offset.isNonNegative();
|
|
|
|
bool AllNonPositive = DecompGEP1.Offset.isNonPositive();
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) {
|
2020-11-08 16:48:38 +08:00
|
|
|
const APInt &Scale = DecompGEP1.VarIndices[i].Scale;
|
|
|
|
if (i == 0)
|
|
|
|
GCD = Scale.abs();
|
|
|
|
else
|
|
|
|
GCD = APIntOps::GreatestCommonDivisor(GCD, Scale.abs());
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
2020-11-13 04:51:32 +08:00
|
|
|
if (AllNonNegative || AllNonPositive) {
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
// If the Value could change between cycles, then any reasoning about
|
|
|
|
// the Value this cycle may not hold in the next cycle. We'll just
|
|
|
|
// give up if we can't determine conditions that hold for every cycle:
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
const Value *V = DecompGEP1.VarIndices[i].V;
|
2020-12-14 03:27:16 +08:00
|
|
|
const Instruction *CxtI = DecompGEP1.VarIndices[i].CxtI;
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
2020-12-14 03:27:16 +08:00
|
|
|
KnownBits Known = computeKnownBits(V, DL, 0, &AC, CxtI, DT);
|
2017-05-15 14:39:41 +08:00
|
|
|
bool SignKnownZero = Known.isNonNegative();
|
|
|
|
bool SignKnownOne = Known.isNegative();
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
|
|
|
// Zero-extension widens the variable, and so forces the sign
|
|
|
|
// bit to zero.
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
bool IsZExt = DecompGEP1.VarIndices[i].ZExtBits > 0 || isa<ZExtInst>(V);
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
SignKnownZero |= IsZExt;
|
|
|
|
SignKnownOne &= !IsZExt;
|
|
|
|
|
2020-11-13 04:51:32 +08:00
|
|
|
AllNonNegative &= (SignKnownZero && Scale.isNonNegative()) ||
|
|
|
|
(SignKnownOne && Scale.isNonPositive());
|
|
|
|
AllNonPositive &= (SignKnownZero && Scale.isNonPositive()) ||
|
|
|
|
(SignKnownOne && Scale.isNonNegative());
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-11-08 16:48:38 +08:00
|
|
|
// We now have accesses at two offsets from the same base:
|
2020-11-22 03:29:28 +08:00
|
|
|
// 1. (...)*GCD + DecompGEP1.Offset with size V1Size
|
2020-11-08 16:48:38 +08:00
|
|
|
// 2. 0 with size V2Size
|
|
|
|
// Using arithmetic modulo GCD, the accesses are at
|
|
|
|
// [ModOffset..ModOffset+V1Size) and [0..V2Size). If the first access fits
|
|
|
|
// into the range [V2Size..GCD), then we know they cannot overlap.
|
2020-11-22 03:29:28 +08:00
|
|
|
APInt ModOffset = DecompGEP1.Offset.srem(GCD);
|
2020-11-08 16:48:38 +08:00
|
|
|
if (ModOffset.isNegative())
|
|
|
|
ModOffset += GCD; // We want mod, not rem.
|
2020-11-20 05:28:39 +08:00
|
|
|
if (V1Size.hasValue() && V2Size.hasValue() &&
|
|
|
|
ModOffset.uge(V2Size.getValue()) &&
|
2020-11-08 16:48:38 +08:00
|
|
|
(GCD - ModOffset).uge(V1Size.getValue()))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
2020-11-13 04:51:32 +08:00
|
|
|
// If we know all the variables are non-negative, then the total offset is
|
2020-11-22 03:29:28 +08:00
|
|
|
// also non-negative and >= DecompGEP1.Offset. We have the following layout:
|
2020-11-13 04:51:32 +08:00
|
|
|
// [0, V2Size) ... [TotalOffset, TotalOffer+V1Size]
|
2020-11-22 03:29:28 +08:00
|
|
|
// If DecompGEP1.Offset >= V2Size, the accesses don't alias.
|
2020-11-20 05:28:39 +08:00
|
|
|
if (AllNonNegative && V2Size.hasValue() &&
|
2020-11-22 03:29:28 +08:00
|
|
|
DecompGEP1.Offset.uge(V2Size.getValue()))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2020-11-13 04:51:32 +08:00
|
|
|
// Similarly, if the variables are non-positive, then the total offset is
|
2020-11-22 03:29:28 +08:00
|
|
|
// also non-positive and <= DecompGEP1.Offset. We have the following layout:
|
2020-11-13 04:51:32 +08:00
|
|
|
// [TotalOffset, TotalOffset+V1Size) ... [0, V2Size)
|
2020-11-22 03:29:28 +08:00
|
|
|
// If -DecompGEP1.Offset >= V1Size, the accesses don't alias.
|
2020-11-20 05:28:39 +08:00
|
|
|
if (AllNonPositive && V1Size.hasValue() &&
|
2020-11-22 03:29:28 +08:00
|
|
|
(-DecompGEP1.Offset).uge(V1Size.getValue()))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
2020-12-12 20:23:49 +08:00
|
|
|
if (V1Size.hasValue() && V2Size.hasValue()) {
|
|
|
|
// Try to determine whether abs(VarIndex) > 0.
|
|
|
|
Optional<APInt> MinAbsVarIndex;
|
2020-12-11 05:29:33 +08:00
|
|
|
if (DecompGEP1.VarIndices.size() == 1) {
|
|
|
|
// VarIndex = Scale*V. If V != 0 then abs(VarIndex) >= abs(Scale).
|
|
|
|
const VariableGEPIndex &Var = DecompGEP1.VarIndices[0];
|
2020-12-25 18:43:08 +08:00
|
|
|
if (isKnownNonZero(Var.V, DL, 0, &AC, Var.CxtI, DT))
|
2020-12-11 05:29:33 +08:00
|
|
|
MinAbsVarIndex = Var.Scale.abs();
|
|
|
|
} else if (DecompGEP1.VarIndices.size() == 2) {
|
2020-12-12 20:23:49 +08:00
|
|
|
// VarIndex = Scale*V0 + (-Scale)*V1.
|
|
|
|
// If V0 != V1 then abs(VarIndex) >= abs(Scale).
|
|
|
|
// Check that VisitedPhiBBs is empty, to avoid reasoning about
|
|
|
|
// inequality of values across loop iterations.
|
|
|
|
const VariableGEPIndex &Var0 = DecompGEP1.VarIndices[0];
|
|
|
|
const VariableGEPIndex &Var1 = DecompGEP1.VarIndices[1];
|
|
|
|
if (Var0.Scale == -Var1.Scale && Var0.ZExtBits == Var1.ZExtBits &&
|
|
|
|
Var0.SExtBits == Var1.SExtBits && VisitedPhiBBs.empty() &&
|
2020-12-26 01:26:00 +08:00
|
|
|
isKnownNonEqual(Var0.V, Var1.V, DL, &AC, /* CxtI */ nullptr, DT))
|
2020-12-12 20:23:49 +08:00
|
|
|
MinAbsVarIndex = Var0.Scale.abs();
|
|
|
|
}
|
|
|
|
|
|
|
|
if (MinAbsVarIndex) {
|
|
|
|
// The constant offset will have added at least +/-MinAbsVarIndex to it.
|
|
|
|
APInt OffsetLo = DecompGEP1.Offset - *MinAbsVarIndex;
|
|
|
|
APInt OffsetHi = DecompGEP1.Offset + *MinAbsVarIndex;
|
2020-12-06 00:26:33 +08:00
|
|
|
// Check that an access at OffsetLo or lower, and an access at OffsetHi
|
|
|
|
// or higher both do not alias.
|
|
|
|
if (OffsetLo.isNegative() && (-OffsetLo).uge(V1Size.getValue()) &&
|
|
|
|
OffsetHi.isNonNegative() && OffsetHi.uge(V2Size.getValue()))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2020-12-06 00:26:33 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries
If a we have (a) a GEP and (b) a pointer based on an alloca, and the
beginning of the object the GEP points would have a negative offset with
repsect to the alloca, then the GEP can not alias pointer (b).
For example, consider code like:
struct { int f0, int f1, ...} foo;
...
foo alloca;
foo *random = bar(alloca);
int *f0 = &alloca.f0
int *f1 = &random->f1;
Which is lowered, approximately, to:
%alloca = alloca %struct.foo
%random = call %struct.foo* @random(%struct.foo* %alloca)
%f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
%f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
by %alloca. Since the %f1 GEP is inbounds, that means %random must also
point into the same object. But since %f0 points to the beginning of %alloca,
the highest %f1 can be is (%alloca + 3). This means %random can not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
Differential Revision: http://reviews.llvm.org/D20495
llvm-svn: 270777
2016-05-26 06:23:08 +08:00
|
|
|
if (constantOffsetHeuristic(DecompGEP1.VarIndices, V1Size, V2Size,
|
2020-11-22 03:29:28 +08:00
|
|
|
DecompGEP1.Offset, &AC, DT))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2011-09-08 10:37:07 +08:00
|
|
|
}
|
2011-09-08 10:23:31 +08:00
|
|
|
|
2011-06-04 14:50:18 +08:00
|
|
|
// Statically, we can see that the base objects are the same, but the
|
|
|
|
// pointers have dynamic offsets which we can't resolve. And none of our
|
|
|
|
// little tricks above worked.
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2009-10-14 02:42:04 +08:00
|
|
|
}
|
|
|
|
|
2015-06-22 10:16:51 +08:00
|
|
|
static AliasResult MergeAliasResults(AliasResult A, AliasResult B) {
|
2011-06-04 04:17:36 +08:00
|
|
|
// If the results agree, take it.
|
|
|
|
if (A == B)
|
|
|
|
return A;
|
|
|
|
// A mix of PartialAlias and MustAlias is PartialAlias.
|
2021-03-05 18:58:13 +08:00
|
|
|
if ((A == AliasResult::PartialAlias && B == AliasResult::MustAlias) ||
|
|
|
|
(B == AliasResult::PartialAlias && A == AliasResult::MustAlias))
|
|
|
|
return AliasResult::PartialAlias;
|
2011-06-04 04:17:36 +08:00
|
|
|
// Otherwise, we don't know anything.
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2011-06-04 04:17:36 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Provides a bunch of ad-hoc rules to disambiguate a Select instruction
|
|
|
|
/// against another.
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
AliasResult
|
|
|
|
BasicAAResult::aliasSelect(const SelectInst *SI, LocationSize SISize,
|
2020-10-24 16:39:24 +08:00
|
|
|
const Value *V2, LocationSize V2Size,
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AAQueryInfo &AAQI) {
|
2009-10-27 05:55:43 +08:00
|
|
|
// If the values are Selects with the same condition, we can do a more precise
|
|
|
|
// check: just check for aliases between the values on corresponding arms.
|
|
|
|
if (const SelectInst *SI2 = dyn_cast<SelectInst>(V2))
|
|
|
|
if (SI->getCondition() == SI2->getCondition()) {
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult Alias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(SI->getTrueValue(), SISize),
|
|
|
|
MemoryLocation(SI2->getTrueValue(), V2Size), AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Alias == AliasResult::MayAlias)
|
|
|
|
return AliasResult::MayAlias;
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult ThisAlias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(SI->getFalseValue(), SISize),
|
|
|
|
MemoryLocation(SI2->getFalseValue(), V2Size), AAQI);
|
2011-06-04 04:17:36 +08:00
|
|
|
return MergeAliasResults(ThisAlias, Alias);
|
2009-10-27 05:55:43 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
// If both arms of the Select node NoAlias or MustAlias V2, then returns
|
|
|
|
// NoAlias / MustAlias. Otherwise, returns MayAlias.
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult Alias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(V2, V2Size),
|
|
|
|
MemoryLocation(SI->getTrueValue(), SISize), AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Alias == AliasResult::MayAlias)
|
|
|
|
return AliasResult::MayAlias;
|
2010-06-29 05:16:52 +08:00
|
|
|
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult ThisAlias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(V2, V2Size),
|
|
|
|
MemoryLocation(SI->getFalseValue(), SISize), AAQI);
|
2011-06-04 04:17:36 +08:00
|
|
|
return MergeAliasResults(ThisAlias, Alias);
|
2009-10-27 05:55:43 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Provide a bunch of ad-hoc rules to disambiguate a PHI instruction against
|
|
|
|
/// another.
|
2018-05-26 05:16:58 +08:00
|
|
|
AliasResult BasicAAResult::aliasPHI(const PHINode *PN, LocationSize PNSize,
|
2020-10-24 16:39:24 +08:00
|
|
|
const Value *V2, LocationSize V2Size,
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AAQueryInfo &AAQI) {
|
2021-06-05 07:17:02 +08:00
|
|
|
if (!PN->getNumIncomingValues())
|
|
|
|
return AliasResult::NoAlias;
|
2009-10-27 05:55:43 +08:00
|
|
|
// If the values are PHIs in the same block, we can do a more precise
|
|
|
|
// as well as efficient check: just check for aliases between the values
|
|
|
|
// on corresponding edges.
|
|
|
|
if (const PHINode *PN2 = dyn_cast<PHINode>(V2))
|
|
|
|
if (PN2->getParent() == PN->getParent()) {
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
Optional<AliasResult> Alias;
|
2012-11-17 10:33:15 +08:00
|
|
|
for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult ThisAlias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(PN->getIncomingValue(i), PNSize),
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
MemoryLocation(
|
2020-10-24 16:39:24 +08:00
|
|
|
PN2->getIncomingValueForBlock(PN->getIncomingBlock(i)), V2Size),
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AAQI);
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
if (Alias)
|
|
|
|
*Alias = MergeAliasResults(*Alias, ThisAlias);
|
|
|
|
else
|
|
|
|
Alias = ThisAlias;
|
2021-03-05 18:58:13 +08:00
|
|
|
if (*Alias == AliasResult::MayAlias)
|
2011-06-04 04:17:36 +08:00
|
|
|
break;
|
2009-10-27 05:55:43 +08:00
|
|
|
}
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return *Alias;
|
2009-10-27 05:55:43 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 15:57:58 +08:00
|
|
|
SmallVector<Value *, 4> V1Srcs;
|
2020-11-22 00:29:13 +08:00
|
|
|
// If a phi operand recurses back to the phi, we can still determine NoAlias
|
|
|
|
// if we don't alias the underlying objects of the other phi operands, as we
|
|
|
|
// know that the recursive phi needs to be based on them in some way.
|
2015-07-16 03:32:22 +08:00
|
|
|
bool isRecursive = false;
|
2020-07-16 22:42:05 +08:00
|
|
|
auto CheckForRecPhi = [&](Value *PV) {
|
|
|
|
if (!EnableRecPhiAnalysis)
|
|
|
|
return false;
|
2020-11-22 00:29:13 +08:00
|
|
|
if (getUnderlyingObject(PV) == PN) {
|
|
|
|
isRecursive = true;
|
|
|
|
return true;
|
2020-07-16 22:42:05 +08:00
|
|
|
}
|
|
|
|
return false;
|
|
|
|
};
|
|
|
|
|
|
|
|
if (PV) {
|
2018-07-30 19:52:08 +08:00
|
|
|
// If we have PhiValues then use it to get the underlying phi values.
|
|
|
|
const PhiValues::ValueSet &PhiValueSet = PV->getValuesForPhi(PN);
|
|
|
|
// If we have more phi values than the search depth then return MayAlias
|
|
|
|
// conservatively to avoid compile time explosion. The worst possible case
|
|
|
|
// is if both sides are PHI nodes. In which case, this is O(m x n) time
|
|
|
|
// where 'm' and 'n' are the number of PHI sources.
|
|
|
|
if (PhiValueSet.size() > MaxLookupSearchDepth)
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2018-07-30 19:52:08 +08:00
|
|
|
// Add the values to V1Srcs
|
|
|
|
for (Value *PV1 : PhiValueSet) {
|
2020-07-16 22:42:05 +08:00
|
|
|
if (CheckForRecPhi(PV1))
|
|
|
|
continue;
|
2009-10-14 06:02:20 +08:00
|
|
|
V1Srcs.push_back(PV1);
|
2018-07-30 19:52:08 +08:00
|
|
|
}
|
|
|
|
} else {
|
|
|
|
// If we don't have PhiInfo then just look at the operands of the phi itself
|
|
|
|
// FIXME: Remove this once we can guarantee that we have PhiInfo always
|
|
|
|
SmallPtrSet<Value *, 4> UniqueSrc;
|
2021-03-05 05:03:54 +08:00
|
|
|
Value *OnePhi = nullptr;
|
2018-07-30 19:52:08 +08:00
|
|
|
for (Value *PV1 : PN->incoming_values()) {
|
2021-03-05 05:03:54 +08:00
|
|
|
if (isa<PHINode>(PV1)) {
|
|
|
|
if (OnePhi && OnePhi != PV1) {
|
|
|
|
// To control potential compile time explosion, we choose to be
|
|
|
|
// conserviate when we have more than one Phi input. It is important
|
|
|
|
// that we handle the single phi case as that lets us handle LCSSA
|
|
|
|
// phi nodes and (combined with the recursive phi handling) simple
|
|
|
|
// pointer induction variable patterns.
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2021-03-05 05:03:54 +08:00
|
|
|
}
|
|
|
|
OnePhi = PV1;
|
|
|
|
}
|
2018-07-30 19:52:08 +08:00
|
|
|
|
2020-07-16 22:42:05 +08:00
|
|
|
if (CheckForRecPhi(PV1))
|
|
|
|
continue;
|
2018-07-30 19:52:08 +08:00
|
|
|
|
|
|
|
if (UniqueSrc.insert(PV1).second)
|
|
|
|
V1Srcs.push_back(PV1);
|
|
|
|
}
|
2021-03-05 05:03:54 +08:00
|
|
|
|
|
|
|
if (OnePhi && UniqueSrc.size() > 1)
|
|
|
|
// Out of an abundance of caution, allow only the trivial lcssa and
|
|
|
|
// recursive phi cases.
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2009-10-14 06:02:20 +08:00
|
|
|
}
|
|
|
|
|
2018-07-30 19:52:08 +08:00
|
|
|
// If V1Srcs is empty then that means that the phi has no underlying non-phi
|
|
|
|
// value. This should only be possible in blocks unreachable from the entry
|
|
|
|
// block, but return MayAlias just in case.
|
|
|
|
if (V1Srcs.empty())
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2018-07-30 19:52:08 +08:00
|
|
|
|
2020-11-22 00:29:13 +08:00
|
|
|
// If this PHI node is recursive, indicate that the pointer may be moved
|
|
|
|
// across iterations. We can only prove NoAlias if different underlying
|
|
|
|
// objects are involved.
|
2015-07-16 03:32:22 +08:00
|
|
|
if (isRecursive)
|
2020-11-22 00:29:13 +08:00
|
|
|
PNSize = LocationSize::beforeOrAfterPointer();
|
2015-07-16 03:32:22 +08:00
|
|
|
|
2020-10-23 03:44:09 +08:00
|
|
|
// In the recursive alias queries below, we may compare values from two
|
|
|
|
// different loop iterations. Keep track of visited phi blocks, which will
|
|
|
|
// be used when determining value equivalence.
|
2020-10-24 02:50:05 +08:00
|
|
|
bool BlockInserted = VisitedPhiBBs.insert(PN->getParent()).second;
|
2020-10-23 03:50:18 +08:00
|
|
|
auto _ = make_scope_exit([&]() {
|
2020-10-24 02:50:05 +08:00
|
|
|
if (BlockInserted)
|
2020-10-23 03:50:18 +08:00
|
|
|
VisitedPhiBBs.erase(PN->getParent());
|
|
|
|
});
|
2020-10-23 03:44:09 +08:00
|
|
|
|
2020-10-24 02:50:05 +08:00
|
|
|
// If we inserted a block into VisitedPhiBBs, alias analysis results that
|
|
|
|
// have been cached earlier may no longer be valid. Perform recursive queries
|
|
|
|
// with a new AAQueryInfo.
|
2021-02-12 22:41:22 +08:00
|
|
|
AAQueryInfo NewAAQI = AAQI.withEmptyCache();
|
2020-10-24 02:50:05 +08:00
|
|
|
AAQueryInfo *UseAAQI = BlockInserted ? &NewAAQI : &AAQI;
|
|
|
|
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult Alias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(V2, V2Size),
|
|
|
|
MemoryLocation(V1Srcs[0], PNSize), *UseAAQI);
|
2015-07-16 03:32:22 +08:00
|
|
|
|
2009-10-14 13:22:03 +08:00
|
|
|
// Early exit if the check of the first PHI source against V2 is MayAlias.
|
|
|
|
// Other results are not possible.
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Alias == AliasResult::MayAlias)
|
|
|
|
return AliasResult::MayAlias;
|
2020-07-02 17:56:58 +08:00
|
|
|
// With recursive phis we cannot guarantee that MustAlias/PartialAlias will
|
|
|
|
// remain valid to all elements and needs to conservatively return MayAlias.
|
2021-03-05 18:58:13 +08:00
|
|
|
if (isRecursive && Alias != AliasResult::NoAlias)
|
|
|
|
return AliasResult::MayAlias;
|
2009-10-14 13:22:03 +08:00
|
|
|
|
2009-10-14 06:02:20 +08:00
|
|
|
// If all sources of the PHI node NoAlias or MustAlias V2, then returns
|
|
|
|
// NoAlias / MustAlias. Otherwise, returns MayAlias.
|
|
|
|
for (unsigned i = 1, e = V1Srcs.size(); i != e; ++i) {
|
|
|
|
Value *V = V1Srcs[i];
|
2009-10-27 05:55:43 +08:00
|
|
|
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AliasResult ThisAlias = getBestAAResults().alias(
|
2020-10-24 16:39:24 +08:00
|
|
|
MemoryLocation(V2, V2Size), MemoryLocation(V, PNSize), *UseAAQI);
|
2011-06-04 04:17:36 +08:00
|
|
|
Alias = MergeAliasResults(ThisAlias, Alias);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Alias == AliasResult::MayAlias)
|
2011-06-04 04:17:36 +08:00
|
|
|
break;
|
2009-10-14 06:02:20 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
return Alias;
|
|
|
|
}
|
|
|
|
|
2015-11-17 16:15:08 +08:00
|
|
|
/// Provides a bunch of ad-hoc rules to disambiguate in common cases, such as
|
2015-08-06 16:17:06 +08:00
|
|
|
/// array references.
|
2018-05-26 05:16:58 +08:00
|
|
|
AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
|
2020-10-18 23:34:22 +08:00
|
|
|
const Value *V2, LocationSize V2Size,
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
AAQueryInfo &AAQI) {
|
2010-04-09 02:11:50 +08:00
|
|
|
// If either of the memory references is empty, it doesn't matter what the
|
|
|
|
// pointer values are.
|
2018-12-23 02:23:21 +08:00
|
|
|
if (V1Size.isZero() || V2Size.isZero())
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2010-04-09 02:11:50 +08:00
|
|
|
|
2009-10-14 02:42:04 +08:00
|
|
|
// Strip off any casts if they exist.
|
2021-02-15 04:24:36 +08:00
|
|
|
V1 = V1->stripPointerCastsForAliasAnalysis();
|
|
|
|
V2 = V2->stripPointerCastsForAliasAnalysis();
|
2009-10-14 02:42:04 +08:00
|
|
|
|
2015-05-06 02:10:49 +08:00
|
|
|
// If V1 or V2 is undef, the result is NoAlias because we can always pick a
|
|
|
|
// value for undef that aliases nothing in the program.
|
|
|
|
if (isa<UndefValue>(V1) || isa<UndefValue>(V2))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2015-05-06 02:10:49 +08:00
|
|
|
|
2009-10-14 02:42:04 +08:00
|
|
|
// Are we checking for alias of the same value?
|
2016-01-18 07:13:48 +08:00
|
|
|
// Because we look 'through' phi nodes, we could look at "Value" pointers from
|
2014-01-03 13:47:03 +08:00
|
|
|
// different iterations. We must therefore make sure that this is not the
|
|
|
|
// case. The function isValueEqualInPotentialCycles ensures that this cannot
|
|
|
|
// happen by looking at the visited phi nodes and making sure they cannot
|
|
|
|
// reach the value.
|
|
|
|
if (isValueEqualInPotentialCycles(V1, V2))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MustAlias;
|
2009-10-14 02:42:04 +08:00
|
|
|
|
2010-02-16 19:11:14 +08:00
|
|
|
if (!V1->getType()->isPointerTy() || !V2->getType()->isPointerTy())
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias; // Scalars cannot alias each other
|
2009-10-14 02:42:04 +08:00
|
|
|
|
|
|
|
// Figure out what objects these things are pointing to if we can.
|
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
* Perform a recursive query using BestAAResults.
* Look up location pair in cache (and thus do not recurse into BasicAA)
* Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
2020-11-11 03:43:46 +08:00
|
|
|
const Value *O1 = getUnderlyingObject(V1, MaxLookupSearchDepth);
|
|
|
|
const Value *O2 = getUnderlyingObject(V2, MaxLookupSearchDepth);
|
2009-10-14 02:42:04 +08:00
|
|
|
|
2009-11-10 03:29:11 +08:00
|
|
|
// Null values in the default address space don't point to any object, so they
|
|
|
|
// don't alias any other pointer.
|
|
|
|
if (const ConstantPointerNull *CPN = dyn_cast<ConstantPointerNull>(O1))
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
if (!NullPointerIsDefined(&F, CPN->getType()->getAddressSpace()))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2009-11-10 03:29:11 +08:00
|
|
|
if (const ConstantPointerNull *CPN = dyn_cast<ConstantPointerNull>(O2))
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
if (!NullPointerIsDefined(&F, CPN->getType()->getAddressSpace()))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2009-11-10 03:29:11 +08:00
|
|
|
|
2009-10-14 02:42:04 +08:00
|
|
|
if (O1 != O2) {
|
2016-01-18 07:13:48 +08:00
|
|
|
// If V1/V2 point to two different objects, we know that we have no alias.
|
2010-07-07 22:27:09 +08:00
|
|
|
if (isIdentifiedObject(O1) && isIdentifiedObject(O2))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2009-11-14 14:15:14 +08:00
|
|
|
|
|
|
|
// Constant pointers can't alias with non-const isIdentifiedObject objects.
|
2010-07-07 22:27:09 +08:00
|
|
|
if ((isa<Constant>(O1) && isIdentifiedObject(O2) && !isa<Constant>(O2)) ||
|
|
|
|
(isa<Constant>(O2) && isIdentifiedObject(O1) && !isa<Constant>(O1)))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2009-11-14 14:15:14 +08:00
|
|
|
|
2013-05-28 16:17:48 +08:00
|
|
|
// Function arguments can't alias with things that are known to be
|
|
|
|
// unambigously identified at the function level.
|
|
|
|
if ((isa<Argument>(O1) && isIdentifiedFunctionLocal(O2)) ||
|
|
|
|
(isa<Argument>(O2) && isIdentifiedFunctionLocal(O1)))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2009-10-14 02:42:04 +08:00
|
|
|
|
2010-07-07 22:30:04 +08:00
|
|
|
// If one pointer is the result of a call/invoke or load and the other is a
|
|
|
|
// non-escaping local object within the same function, then we know the
|
|
|
|
// object couldn't escape to a point where the call could return it.
|
|
|
|
//
|
|
|
|
// Note that if the pointers are in different functions, there are a
|
|
|
|
// variety of complications. A call with a nocapture argument may still
|
|
|
|
// temporary store the nocapture argument's value in a temporary memory
|
|
|
|
// location if that memory location doesn't escape. Or it may pass a
|
|
|
|
// nocapture value to other functions as long as they don't capture it.
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
if (isEscapeSource(O1) &&
|
|
|
|
isNonEscapingLocalObject(O2, &AAQI.IsCapturedCache))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.
AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.
MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.
Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59315
llvm-svn: 356783
2019-03-23 01:22:19 +08:00
|
|
|
if (isEscapeSource(O2) &&
|
|
|
|
isNonEscapingLocalObject(O1, &AAQI.IsCapturedCache))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2010-07-07 22:30:04 +08:00
|
|
|
}
|
|
|
|
|
2009-10-14 02:42:04 +08:00
|
|
|
// If the size of one access is larger than the entire object on the other
|
|
|
|
// side, then we know such behavior is undefined and can assume no alias.
|
llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
llvm-svn: 336613
2018-07-10 06:27:23 +08:00
|
|
|
bool NullIsValidLocation = NullPointerIsDefined(&F);
|
2019-08-24 01:56:10 +08:00
|
|
|
if ((isObjectSmallerThan(
|
|
|
|
O2, getMinimalExtentFrom(*V1, V1Size, DL, NullIsValidLocation), DL,
|
|
|
|
TLI, NullIsValidLocation)) ||
|
|
|
|
(isObjectSmallerThan(
|
|
|
|
O1, getMinimalExtentFrom(*V2, V2Size, DL, NullIsValidLocation), DL,
|
|
|
|
TLI, NullIsValidLocation)))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::NoAlias;
|
2013-08-24 22:16:00 +08:00
|
|
|
|
2020-11-18 03:11:09 +08:00
|
|
|
// If one the accesses may be before the accessed pointer, canonicalize this
|
|
|
|
// by using unknown after-pointer sizes for both accesses. This is
|
|
|
|
// equivalent, because regardless of which pointer is lower, one of them
|
|
|
|
// will always came after the other, as long as the underlying objects aren't
|
|
|
|
// disjoint. We do this so that the rest of BasicAA does not have to deal
|
|
|
|
// with accesses before the base pointer, and to improve cache utilization by
|
|
|
|
// merging equivalent states.
|
|
|
|
if (V1Size.mayBeBeforePointer() || V2Size.mayBeBeforePointer()) {
|
|
|
|
V1Size = LocationSize::afterPointer();
|
|
|
|
V2Size = LocationSize::afterPointer();
|
|
|
|
}
|
|
|
|
|
2021-02-19 06:13:33 +08:00
|
|
|
// FIXME: If this depth limit is hit, then we may cache sub-optimal results
|
|
|
|
// for recursive queries. For this reason, this limit is chosen to be large
|
|
|
|
// enough to be very rarely hit, while still being small enough to avoid
|
|
|
|
// stack overflows.
|
|
|
|
if (AAQI.Depth >= 512)
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2021-02-19 06:13:33 +08:00
|
|
|
|
2011-06-04 08:31:50 +08:00
|
|
|
// Check the cache before climbing up use-def chains. This also terminates
|
|
|
|
// otherwise infinitely recursive queries.
|
2021-04-03 16:53:56 +08:00
|
|
|
AAQueryInfo::LocPair Locs({V1, V1Size}, {V2, V2Size});
|
2021-03-16 21:36:17 +08:00
|
|
|
const bool Swapped = V1 > V2;
|
|
|
|
if (Swapped)
|
2011-06-04 08:31:50 +08:00
|
|
|
std::swap(Locs.first, Locs.second);
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
const auto &Pair = AAQI.AliasCache.try_emplace(
|
2021-03-05 18:58:13 +08:00
|
|
|
Locs, AAQueryInfo::CacheEntry{AliasResult::NoAlias, 0});
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
if (!Pair.second) {
|
|
|
|
auto &Entry = Pair.first->second;
|
|
|
|
if (!Entry.isDefinitive()) {
|
|
|
|
// Remember that we used an assumption.
|
|
|
|
++Entry.NumAssumptionUses;
|
2021-01-17 04:47:01 +08:00
|
|
|
++AAQI.NumAssumptionUses;
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
}
|
2021-03-16 21:36:17 +08:00
|
|
|
// Cache contains sorted {V1,V2} pairs but we should return original order.
|
|
|
|
auto Result = Entry.Result;
|
|
|
|
Result.swap(Swapped);
|
|
|
|
return Result;
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
}
|
2011-06-04 08:31:50 +08:00
|
|
|
|
2021-01-17 04:47:01 +08:00
|
|
|
int OrigNumAssumptionUses = AAQI.NumAssumptionUses;
|
|
|
|
unsigned OrigNumAssumptionBasedResults = AAQI.AssumptionBasedResults.size();
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result =
|
|
|
|
aliasCheckRecursive(V1, V1Size, V2, V2Size, AAQI, O1, O2);
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
|
|
|
|
auto It = AAQI.AliasCache.find(Locs);
|
|
|
|
assert(It != AAQI.AliasCache.end() && "Must be in cache");
|
|
|
|
auto &Entry = It->second;
|
|
|
|
|
|
|
|
// Check whether a NoAlias assumption has been used, but disproven.
|
2021-03-05 18:58:13 +08:00
|
|
|
bool AssumptionDisproven =
|
|
|
|
Entry.NumAssumptionUses > 0 && Result != AliasResult::NoAlias;
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
if (AssumptionDisproven)
|
2021-03-05 18:58:13 +08:00
|
|
|
Result = AliasResult::MayAlias;
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
|
|
|
|
// This is a definitive result now, when considered as a root query.
|
2021-01-17 04:47:01 +08:00
|
|
|
AAQI.NumAssumptionUses -= Entry.NumAssumptionUses;
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
Entry.Result = Result;
|
2021-03-16 21:36:17 +08:00
|
|
|
// Cache contains sorted {V1,V2} pairs.
|
|
|
|
Entry.Result.swap(Swapped);
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
Entry.NumAssumptionUses = -1;
|
|
|
|
|
|
|
|
// If the assumption has been disproven, remove any results that may have
|
|
|
|
// been based on this assumption. Do this after the Entry updates above to
|
|
|
|
// avoid iterator invalidation.
|
|
|
|
if (AssumptionDisproven)
|
2021-01-17 04:47:01 +08:00
|
|
|
while (AAQI.AssumptionBasedResults.size() > OrigNumAssumptionBasedResults)
|
|
|
|
AAQI.AliasCache.erase(AAQI.AssumptionBasedResults.pop_back_val());
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
|
|
|
|
// The result may still be based on assumptions higher up in the chain.
|
|
|
|
// Remember it, so it can be purged from the cache later.
|
2021-03-05 18:58:13 +08:00
|
|
|
if (OrigNumAssumptionUses != AAQI.NumAssumptionUses &&
|
|
|
|
Result != AliasResult::MayAlias)
|
2021-01-17 04:47:01 +08:00
|
|
|
AAQI.AssumptionBasedResults.push_back(Locs);
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
|
|
|
}
|
|
|
|
|
|
|
|
AliasResult BasicAAResult::aliasCheckRecursive(
|
2020-10-24 16:39:24 +08:00
|
|
|
const Value *V1, LocationSize V1Size,
|
|
|
|
const Value *V2, LocationSize V2Size,
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
AAQueryInfo &AAQI, const Value *O1, const Value *O2) {
|
2010-10-19 02:04:47 +08:00
|
|
|
if (const GEPOperator *GV1 = dyn_cast<GEPOperator>(V1)) {
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result = aliasGEP(GV1, V1Size, V2, V2Size, O1, O2, AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Result != AliasResult::MayAlias)
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
2020-10-18 23:34:22 +08:00
|
|
|
} else if (const GEPOperator *GV2 = dyn_cast<GEPOperator>(V2)) {
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result = aliasGEP(GV2, V2Size, V1, V1Size, O2, O1, AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Result != AliasResult::MayAlias)
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
2010-10-19 02:04:47 +08:00
|
|
|
}
|
2009-10-14 06:02:20 +08:00
|
|
|
|
2010-10-19 02:04:47 +08:00
|
|
|
if (const PHINode *PN = dyn_cast<PHINode>(V1)) {
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result = aliasPHI(PN, V1Size, V2, V2Size, AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Result != AliasResult::MayAlias)
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
2020-10-18 23:34:22 +08:00
|
|
|
} else if (const PHINode *PN = dyn_cast<PHINode>(V2)) {
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result = aliasPHI(PN, V2Size, V1, V1Size, AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Result != AliasResult::MayAlias)
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
2010-10-19 02:04:47 +08:00
|
|
|
}
|
2005-04-22 05:13:18 +08:00
|
|
|
|
2010-10-19 02:04:47 +08:00
|
|
|
if (const SelectInst *S1 = dyn_cast<SelectInst>(V1)) {
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result = aliasSelect(S1, V1Size, V2, V2Size, AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Result != AliasResult::MayAlias)
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
2020-10-18 23:34:22 +08:00
|
|
|
} else if (const SelectInst *S2 = dyn_cast<SelectInst>(V2)) {
|
2020-10-24 16:39:24 +08:00
|
|
|
AliasResult Result = aliasSelect(S2, V2Size, V1, V1Size, AAQI);
|
2021-03-05 18:58:13 +08:00
|
|
|
if (Result != AliasResult::MayAlias)
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
return Result;
|
2010-10-19 02:04:47 +08:00
|
|
|
}
|
2009-10-27 05:55:43 +08:00
|
|
|
|
2011-01-19 05:16:06 +08:00
|
|
|
// If both pointers are pointing into the same object and one of them
|
2016-01-18 07:13:48 +08:00
|
|
|
// accesses the entire object, then the accesses must overlap in some way.
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
if (O1 == O2) {
|
|
|
|
bool NullIsValidLocation = NullPointerIsDefined(&F);
|
2018-10-10 14:39:40 +08:00
|
|
|
if (V1Size.isPrecise() && V2Size.isPrecise() &&
|
2018-10-09 11:18:56 +08:00
|
|
|
(isObjectSize(O1, V1Size.getValue(), DL, TLI, NullIsValidLocation) ||
|
2020-10-18 22:41:44 +08:00
|
|
|
isObjectSize(O2, V2Size.getValue(), DL, TLI, NullIsValidLocation)))
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::PartialAlias;
|
[BasicAA] Fix BatchAA results for phi-phi assumptions
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
2020-11-23 01:23:53 +08:00
|
|
|
}
|
2011-01-19 05:16:06 +08:00
|
|
|
|
2021-03-05 18:58:13 +08:00
|
|
|
return AliasResult::MayAlias;
|
2003-02-27 03:41:54 +08:00
|
|
|
}
|
2014-01-02 11:31:36 +08:00
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Check whether two Values can be considered equivalent.
|
|
|
|
///
|
|
|
|
/// In addition to pointer equivalence of \p V1 and \p V2 this checks whether
|
|
|
|
/// they can not be part of a cycle in the value graph by looking at all
|
|
|
|
/// visited phi nodes an making sure that the phis cannot reach the value. We
|
|
|
|
/// have to do this because we are looking through phi nodes (That is we say
|
|
|
|
/// noalias(V, phi(VA, VB)) if noalias(V, VA) and noalias(V, VB).
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
bool BasicAAResult::isValueEqualInPotentialCycles(const Value *V,
|
|
|
|
const Value *V2) {
|
2014-01-02 11:31:36 +08:00
|
|
|
if (V != V2)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
const Instruction *Inst = dyn_cast<Instruction>(V);
|
|
|
|
if (!Inst)
|
|
|
|
return true;
|
|
|
|
|
2015-03-21 02:05:49 +08:00
|
|
|
if (VisitedPhiBBs.empty())
|
|
|
|
return true;
|
|
|
|
|
2014-01-03 13:47:03 +08:00
|
|
|
if (VisitedPhiBBs.size() > MaxNumPhiBBsValueReachabilityCheck)
|
|
|
|
return false;
|
2014-01-02 11:31:36 +08:00
|
|
|
|
2014-01-03 13:47:03 +08:00
|
|
|
// Make sure that the visited phis cannot reach the Value. This ensures that
|
|
|
|
// the Values cannot come from different iterations of a potential cycle the
|
|
|
|
// phi nodes could be involved in.
|
2014-08-25 07:23:06 +08:00
|
|
|
for (auto *P : VisitedPhiBBs)
|
2021-03-17 12:16:07 +08:00
|
|
|
if (isPotentiallyReachable(&P->front(), Inst, nullptr, DT))
|
2014-01-03 13:47:03 +08:00
|
|
|
return false;
|
2014-01-02 11:31:36 +08:00
|
|
|
|
2014-01-03 13:47:03 +08:00
|
|
|
return true;
|
2014-01-02 11:31:36 +08:00
|
|
|
}
|
|
|
|
|
2015-08-06 16:17:06 +08:00
|
|
|
/// Computes the symbolic difference between two de-composed GEPs.
|
|
|
|
///
|
|
|
|
/// Dest and Src are the variable indices from two decomposed GetElementPtr
|
|
|
|
/// instructions GEP1 and GEP2 which have common base pointers.
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
void BasicAAResult::GetIndexDifference(
|
2014-01-02 11:31:36 +08:00
|
|
|
SmallVectorImpl<VariableGEPIndex> &Dest,
|
|
|
|
const SmallVectorImpl<VariableGEPIndex> &Src) {
|
|
|
|
if (Src.empty())
|
|
|
|
return;
|
|
|
|
|
|
|
|
for (unsigned i = 0, e = Src.size(); i != e; ++i) {
|
|
|
|
const Value *V = Src[i].V;
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
unsigned ZExtBits = Src[i].ZExtBits, SExtBits = Src[i].SExtBits;
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
APInt Scale = Src[i].Scale;
|
2014-01-02 11:31:36 +08:00
|
|
|
|
|
|
|
// Find V in Dest. This is N^2, but pointer indices almost never have more
|
|
|
|
// than a few variable indexes.
|
|
|
|
for (unsigned j = 0, e = Dest.size(); j != e; ++j) {
|
2014-01-03 13:47:03 +08:00
|
|
|
if (!isValueEqualInPotentialCycles(Dest[j].V, V) ||
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
Dest[j].ZExtBits != ZExtBits || Dest[j].SExtBits != SExtBits)
|
2014-01-02 11:31:36 +08:00
|
|
|
continue;
|
|
|
|
|
|
|
|
// If we found it, subtract off Scale V's from the entry in Dest. If it
|
|
|
|
// goes to zero, remove the entry.
|
|
|
|
if (Dest[j].Scale != Scale)
|
|
|
|
Dest[j].Scale -= Scale;
|
|
|
|
else
|
|
|
|
Dest.erase(Dest.begin() + j);
|
|
|
|
Scale = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
// If we didn't consume this entry, add it to the end of the Dest list.
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
if (!!Scale) {
|
2020-12-14 03:27:16 +08:00
|
|
|
VariableGEPIndex Entry = {V, ZExtBits, SExtBits, -Scale, Src[i].CxtI};
|
2014-01-02 11:31:36 +08:00
|
|
|
Dest.push_back(Entry);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
bool BasicAAResult::constantOffsetHeuristic(
|
2018-10-09 10:14:33 +08:00
|
|
|
const SmallVectorImpl<VariableGEPIndex> &VarIndices,
|
2020-07-09 18:51:24 +08:00
|
|
|
LocationSize MaybeV1Size, LocationSize MaybeV2Size, const APInt &BaseOffset,
|
2018-10-09 10:14:33 +08:00
|
|
|
AssumptionCache *AC, DominatorTree *DT) {
|
2020-11-20 04:53:20 +08:00
|
|
|
if (VarIndices.size() != 2 || !MaybeV1Size.hasValue() ||
|
|
|
|
!MaybeV2Size.hasValue())
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
return false;
|
|
|
|
|
2018-10-09 11:18:56 +08:00
|
|
|
const uint64_t V1Size = MaybeV1Size.getValue();
|
|
|
|
const uint64_t V2Size = MaybeV2Size.getValue();
|
2018-10-09 10:14:33 +08:00
|
|
|
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
const VariableGEPIndex &Var0 = VarIndices[0], &Var1 = VarIndices[1];
|
|
|
|
|
|
|
|
if (Var0.ZExtBits != Var1.ZExtBits || Var0.SExtBits != Var1.SExtBits ||
|
2021-03-29 03:20:50 +08:00
|
|
|
Var0.Scale != -Var1.Scale || Var0.V->getType() != Var1.V->getType())
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
return false;
|
|
|
|
|
|
|
|
// We'll strip off the Extensions of Var0 and Var1 and do another round
|
|
|
|
// of GetLinearExpression decomposition. In the example above, if Var0
|
|
|
|
// is zext(%x + 1) we should get V1 == %x and V1Offset == 1.
|
|
|
|
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
LinearExpression E0 =
|
|
|
|
GetLinearExpression(ExtendedValue(Var0.V), DL, 0, AC, DT);
|
|
|
|
LinearExpression E1 =
|
|
|
|
GetLinearExpression(ExtendedValue(Var1.V), DL, 0, AC, DT);
|
|
|
|
if (E0.Scale != E1.Scale || E0.Val.ZExtBits != E1.Val.ZExtBits ||
|
|
|
|
E0.Val.SExtBits != E1.Val.SExtBits ||
|
|
|
|
!isValueEqualInPotentialCycles(E0.Val.V, E1.Val.V))
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
return false;
|
|
|
|
|
|
|
|
// We have a hit - Var0 and Var1 only differ by a constant offset!
|
|
|
|
|
|
|
|
// If we've been sext'ed then zext'd the maximum difference between Var0 and
|
|
|
|
// Var1 is possible to calculate, but we're just interested in the absolute
|
2015-10-24 19:38:01 +08:00
|
|
|
// minimum difference between the two. The minimum distance may occur due to
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
// wrapping; consider "add i3 %i, 5": if %i == 7 then 7 + 5 mod 8 == 4, and so
|
|
|
|
// the minimum distance between %i and %i + 5 is 3.
|
[BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:
First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)
Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.
Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.
This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:23:35 +08:00
|
|
|
APInt MinDiff = E0.Offset - E1.Offset, Wrapped = -MinDiff;
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
MinDiff = APIntOps::umin(MinDiff, Wrapped);
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
APInt MinDiffBytes =
|
|
|
|
MinDiff.zextOrTrunc(Var0.Scale.getBitWidth()) * Var0.Scale.abs();
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
|
|
|
|
// We can't definitely say whether GEP1 is before or after V2 due to wrapping
|
|
|
|
// arithmetic (i.e. for some values of GEP1 and V2 GEP1 < V2, and for other
|
|
|
|
// values GEP1 > V2). We'll therefore only declare NoAlias if both V1Size and
|
|
|
|
// V2Size can fit in the MinDiffBytes gap.
|
[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)
Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).
Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
through C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).
As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.
Patch by me with contributions from Michael Ferguson (mferguson@cray.com).
Differential Revision: https://reviews.llvm.org/D38662
llvm-svn: 350220
2019-01-03 00:28:09 +08:00
|
|
|
return MinDiffBytes.uge(V1Size + BaseOffset.abs()) &&
|
|
|
|
MinDiffBytes.uge(V2Size + BaseOffset.abs());
|
[BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!
This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.
This should fix PR24596.
Original commit message for r221876:
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).
Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!
llvm-svn: 246502
2015-09-01 06:32:47 +08:00
|
|
|
}
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
// BasicAliasAnalysis Pass
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
2016-11-24 01:53:26 +08:00
|
|
|
AnalysisKey BasicAA::Key;
|
2016-03-11 18:22:49 +08:00
|
|
|
|
2016-08-09 08:28:15 +08:00
|
|
|
BasicAAResult BasicAA::run(Function &F, FunctionAnalysisManager &AM) {
|
2021-01-21 08:53:03 +08:00
|
|
|
auto &TLI = AM.getResult<TargetLibraryAnalysis>(F);
|
|
|
|
auto &AC = AM.getResult<AssumptionAnalysis>(F);
|
|
|
|
auto *DT = &AM.getResult<DominatorTreeAnalysis>(F);
|
|
|
|
auto *PV = AM.getCachedResult<PhiValuesAnalysis>(F);
|
2021-03-17 12:16:07 +08:00
|
|
|
return BasicAAResult(F.getParent()->getDataLayout(), F, TLI, AC, DT, PV);
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
}
|
|
|
|
|
2015-10-27 05:22:58 +08:00
|
|
|
BasicAAWrapperPass::BasicAAWrapperPass() : FunctionPass(ID) {
|
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
2019-11-14 05:15:01 +08:00
|
|
|
initializeBasicAAWrapperPassPass(*PassRegistry::getPassRegistry());
|
2015-10-27 05:22:58 +08:00
|
|
|
}
|
|
|
|
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
char BasicAAWrapperPass::ID = 0;
|
2017-08-12 05:30:02 +08:00
|
|
|
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
void BasicAAWrapperPass::anchor() {}
|
|
|
|
|
2020-06-25 07:41:24 +08:00
|
|
|
INITIALIZE_PASS_BEGIN(BasicAAWrapperPass, "basic-aa",
|
2020-02-08 02:15:59 +08:00
|
|
|
"Basic Alias Analysis (stateless AA impl)", true, true)
|
2016-12-19 16:22:17 +08:00
|
|
|
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
|
2016-03-11 21:53:18 +08:00
|
|
|
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
|
2020-02-08 02:15:59 +08:00
|
|
|
INITIALIZE_PASS_DEPENDENCY(PhiValuesWrapperPass)
|
2020-06-25 07:41:24 +08:00
|
|
|
INITIALIZE_PASS_END(BasicAAWrapperPass, "basic-aa",
|
2020-02-08 02:15:59 +08:00
|
|
|
"Basic Alias Analysis (stateless AA impl)", true, true)
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
|
|
|
|
FunctionPass *llvm::createBasicAAWrapperPass() {
|
|
|
|
return new BasicAAWrapperPass();
|
|
|
|
}
|
|
|
|
|
|
|
|
bool BasicAAWrapperPass::runOnFunction(Function &F) {
|
2016-12-19 16:22:17 +08:00
|
|
|
auto &ACT = getAnalysis<AssumptionCacheTracker>();
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
auto &TLIWP = getAnalysis<TargetLibraryInfoWrapperPass>();
|
2016-03-11 21:53:18 +08:00
|
|
|
auto &DTWP = getAnalysis<DominatorTreeWrapperPass>();
|
2018-07-30 19:52:08 +08:00
|
|
|
auto *PVWP = getAnalysisIfAvailable<PhiValuesWrapperPass>();
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
|
Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.
This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.
Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.
Reviewers: chandlerc, hfinkel
Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66428
llvm-svn: 371284
2019-09-07 11:09:36 +08:00
|
|
|
Result.reset(new BasicAAResult(F.getParent()->getDataLayout(), F,
|
|
|
|
TLIWP.getTLI(F), ACT.getAssumptionCache(F),
|
|
|
|
&DTWP.getDomTree(),
|
2018-07-30 19:52:08 +08:00
|
|
|
PVWP ? &PVWP->getResult() : nullptr));
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
void BasicAAWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
|
|
|
|
AU.setPreservesAll();
|
2021-01-02 07:05:53 +08:00
|
|
|
AU.addRequiredTransitive<AssumptionCacheTracker>();
|
|
|
|
AU.addRequiredTransitive<DominatorTreeWrapperPass>();
|
|
|
|
AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
|
2018-07-30 19:52:08 +08:00
|
|
|
AU.addUsedIfAvailable<PhiValuesWrapperPass>();
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
BasicAAResult llvm::createLegacyPMBasicAAResult(Pass &P, Function &F) {
|
|
|
|
return BasicAAResult(
|
Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.
This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.
Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.
Reviewers: chandlerc, hfinkel
Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66428
llvm-svn: 371284
2019-09-07 11:09:36 +08:00
|
|
|
F.getParent()->getDataLayout(), F,
|
|
|
|
P.getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F),
|
2016-12-19 16:22:17 +08:00
|
|
|
P.getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F));
|
[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
2015-09-10 01:55:00 +08:00
|
|
|
}
|