llvm-project/llvm/lib/Analysis
Matt Arsenault 5e999cbe8d IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.

This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.

My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.

This is intended to avoid some optimization issues with the current
handling of aggregate arguments, as well as to fix inflexibility in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.

Background:

There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.

It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These arguments are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.

The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's no real advantage to an alignment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.
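
The layout trade-off described above can be sketched as follows. This is
only an illustration, not the backend's or any frontend's actual code:
KernelArg, alignTo and layoutArgs are hypothetical names, and the policy of
clamping every alignment to a [minAlign, maxAlign] range is an assumption
made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one kernel argument (not an LLVM type).
struct KernelArg {
  uint64_t size;  // size in bytes
  uint64_t align; // natural (ABI type) alignment in bytes
};

// Round offset up to the next multiple of align (align is a power of two).
uint64_t alignTo(uint64_t offset, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

// Assign each argument a byte offset in one flat buffer, clamping its
// alignment to [minAlign, maxAlign]. Stores the buffer size in totalSize.
std::vector<uint64_t> layoutArgs(const std::vector<KernelArg> &args,
                                 uint64_t minAlign, uint64_t maxAlign,
                                 uint64_t &totalSize) {
  std::vector<uint64_t> offsets;
  uint64_t offset = 0;
  for (const KernelArg &arg : args) {
    uint64_t align = std::max(minAlign, std::min(arg.align, maxAlign));
    offset = alignTo(offset, align);
    offsets.push_back(offset);
    offset += arg.size;
  }
  totalSize = offset;
  return offsets;
}
```

For an i8, a double and an i16 in that order, the natural alignments place
the arguments at offsets 0, 8 and 16 (18 bytes total), while clamping every
alignment to exactly 4 places them at 0, 4 and 12 (14 bytes total).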

Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
between arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all
aggregate arguments for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.

Additional context is in D79744.
2020-07-20 10:23:09 -04:00
..
models/inliner [llvm][NFC] ML Policies: changed the saved_model protobuf to text 2020-07-13 11:07:07 -07:00
AliasAnalysis.cpp [BasicAA] Rename -disable-basicaa to -disable-basic-aa to be consistent with the canonical name "basic-aa" 2020-06-26 20:55:44 -07:00
AliasAnalysisEvaluator.cpp [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). 2020-04-27 22:17:03 -07:00
AliasAnalysisSummary.cpp AliasAnalysisSummary.h - cleanup includes and forward declarations. NFC. 2020-04-21 11:32:58 +01:00
AliasAnalysisSummary.h AliasAnalysisSummary.h - cleanup includes and forward declarations. NFC. 2020-04-21 11:32:58 +01:00
AliasSetTracker.cpp [NFC] Remove trailing space 2020-02-18 10:49:13 +08:00
Analysis.cpp [MustExec] Add a generic "must-be-executed-context" explorer 2019-08-23 15:17:27 +00:00
AssumeBundleQueries.cpp Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" 2020-07-16 11:54:04 -07:00
AssumptionCache.cpp [NFC][DwarfDebug] Add test for variables with a single location which 2020-05-11 11:49:11 +02:00
BasicAliasAnalysis.cpp [BasicAA] Fix -basicaa-recphi for geps with negative offsets 2020-07-16 17:22:40 +01:00
BlockFrequencyInfo.cpp [BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare. 2020-05-07 11:58:00 -07:00
BlockFrequencyInfoImpl.cpp [BFI] Add a debug check for unknown block queries. 2020-02-04 10:05:28 -08:00
BranchProbabilityInfo.cpp [BPI] Compile time improvement when erasing blocks (NFC) 2020-07-10 16:55:54 -07:00
CFG.cpp CFG.h - reduce includes to forward declarations. NFC. 2020-06-06 15:06:42 +01:00
CFGPrinter.cpp [CFG] Turning on Heat Colors for CFG by default 2020-04-29 20:44:10 +00:00
CFLAndersAliasAnalysis.cpp [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers. 2020-04-14 14:11:02 +03:00
CFLGraph.h
CFLSteensAliasAnalysis.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
CGSCCPassManager.cpp [NewPM] Move debugging log printing after PassInstrumentation before-pass-callbacks 2020-06-25 10:03:25 -07:00
CMakeLists.txt Revert "[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks" 2020-07-19 08:49:04 -07:00
CallGraph.cpp [CallGraph] Ignore callback uses 2020-07-14 13:08:49 -07:00
CallGraphSCCPass.cpp [CallGraph] Update callback call sites in RefreshCallGraph 2020-07-14 22:33:57 -05:00
CallPrinter.cpp [CallPrinter] Adding heat coloring to CallPrinter 2020-06-16 21:15:29 +00:00
CaptureTracking.cpp [Analysis] Ensure we include CommandLine.h if we declare any cl::opt flags. NFC. 2020-06-23 12:29:51 +01:00
CmpInstAnalysis.cpp
CodeMetrics.cpp CodeMetrics.cpp - remove unused includes. NFC. 2020-05-10 16:59:55 +01:00
ConstantFolding.cpp [ConstantFolding] check applicability of AllOnes constant creation first 2020-07-19 13:13:57 -04:00
CostModel.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
DDG.cpp [DDG] Data Dependence Graph - Graph Simplification 2020-02-19 13:41:51 -05:00
Delinearization.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
DemandedBits.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
DependenceAnalysis.cpp DependenceAnalysis.h - reduce AliasAnalysis.h include to forward declaration. NFC. 2020-06-07 12:47:37 +01:00
DependenceGraphBuilder.cpp SmallPtrSet::find -> SmallPtrSet::count 2020-06-07 22:38:08 +02:00
DivergenceAnalysis.cpp [DA] conservatively mark the join of every divergent branch 2020-06-18 17:39:20 +05:30
DomPrinter.cpp [CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo 2020-04-06 17:42:54 +00:00
DomTreeUpdater.cpp [DomTreeUpdater] Use const auto * when iterating over pointers (NFC). 2020-07-10 16:39:15 +01:00
DominanceFrontier.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
EHPersonalities.cpp
GlobalsModRef.cpp GlobalsModRef.h - reduce CallGraph.h include to forward declarations. NFC. 2020-06-25 16:00:43 +01:00
GuardUtils.cpp [NFC] Remove trailing space 2020-02-18 10:49:13 +08:00
HeatUtils.cpp [CallPrinter] Remove static constructor. 2020-06-17 13:02:58 +02:00
IVDescriptors.cpp [IVDescriptors] Remove unnecessary DemandedBits.h include; NFC 2020-04-04 12:07:57 +02:00
IVUsers.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
IndirectCallPromotionAnalysis.cpp [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
InlineAdvisor.cpp Revert "[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks" 2020-07-19 08:49:04 -07:00
InlineCost.cpp [InlineCost] GetElementPtr with constant operands 2020-06-25 18:09:51 +00:00
InlineFeaturesAnalysis.cpp [llvm][NFC] Fix license on InlineFeaturesAnalysis.{h|cpp} 2020-06-15 19:34:33 -07:00
InlineSizeEstimatorAnalysis.cpp [llvm] Moved InlineSizeEstimatorAnalysis test to .ll 2020-07-16 12:25:16 -07:00
InstCount.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
InstructionPrecedenceTracking.cpp [IPT] Don't use OrderedInstructions (NFC) 2020-04-20 18:25:31 +02:00
InstructionSimplify.cpp [InstSimplify] fold fcmp with infinity constant using isKnownNeverInfinity 2020-07-19 09:24:52 -04:00
Interval.cpp
IntervalPartition.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
LLVMBuild.txt [llvm][NFC] Move content of ML subdirectory into Analysis 2020-06-15 14:35:33 -07:00
LazyBlockFrequencyInfo.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
LazyBranchProbabilityInfo.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
LazyCallGraph.cpp [llvm][NFC][CallSite] Remove Implementation uses of CallSite 2020-04-14 14:49:47 -07:00
LazyValueInfo.cpp [NFC] Add 'override' keyword where missing in include/ and lib/. 2020-07-14 09:47:29 -07:00
LegacyDivergenceAnalysis.cpp Resubmit: [DA][TTI][AMDGPU] Add option to select GPUDA with TTI 2020-01-24 10:39:40 -08:00
Lint.cpp [Alignment] TargetLowering::hasPairedLoad must use Align for RequiredAlignment 2020-07-01 14:32:30 +00:00
Loads.cpp [Analysis] isDereferenceableAndAlignedPointer(): don't crash on `bitcast <1 x ???*> to ???*` 2020-06-27 18:30:59 +03:00
LoopAccessAnalysis.cpp [LV] Vectorize without versioning-for-unit-stride under -Os/-Oz 2020-07-07 15:04:21 +03:00
LoopAnalysisManager.cpp Add PassManagerImpl.h to hide implementation details 2020-02-03 11:15:55 -08:00
LoopCacheAnalysis.cpp LoopAnalysisManager.h - reduce includes to forward declarations. NFC. 2020-06-06 14:06:46 +01:00
LoopInfo.cpp [NFC] Add missing 'const' notion to LCSSA-related functions 2020-04-17 17:49:34 +07:00
LoopNestAnalysis.cpp [LoopNest]: Analysis to discover properties of a loop nest. 2020-03-03 18:25:19 +00:00
LoopPass.cpp NFC. Remove obsolete SimpleAnalysis infrastructure 2020-01-23 13:58:30 +07:00
LoopUnrollAnalyzer.cpp ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC. 2020-06-17 15:48:23 +01:00
MLInlineAdvisor.cpp [llvm] Release-mode ML InlineAdvisor 2020-06-24 08:18:42 -07:00
MemDepPrinter.cpp GVN.h - reduce AliasAnalysis.h include to forward declaration. NFC. 2020-06-25 16:59:35 +01:00
MemDerefPrinter.cpp Loads.h - reduce AliasAnalysis.h include to forward declarations. NFC. 2020-06-24 13:49:04 +01:00
MemoryBuiltins.cpp IR: Define byref parameter attribute 2020-07-20 10:23:09 -04:00
MemoryDependenceAnalysis.cpp [MemDep] Also remove load instructions from NonLocalDesCache. 2020-06-17 09:36:53 +01:00
MemoryLocation.cpp Fix MemoryLocation.h use without Instructions.h 2020-05-26 17:19:14 +01:00
MemorySSA.cpp [MemorySSA] Pass DT to the upward iterator for proper PhiTranslation. 2020-04-29 14:28:31 -07:00
MemorySSAUpdater.cpp MemorySSAUpdater.h - reduce unnecessary includes to forward declarations. NFC. 2020-06-05 10:45:59 +01:00
ModuleDebugInfoPrinter.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
ModuleSummaryAnalysis.cpp [StackSafety] Pass summary into codegen 2020-06-10 21:02:54 -07:00
MustExecute.cpp MustBeExecutedContextPrinter::runOnModule: Use unique_ptr to simplify/clarify ownership 2020-04-28 11:30:53 -07:00
ObjCARCAliasAnalysis.cpp ObjCARCAnalysisUtils.h - remove unused includes. NFC. 2020-05-26 19:22:15 +01:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp [Analysis/Transforms/Sanitizers] As part of using inclusive language 2020-06-20 00:42:26 -07:00
OptimizationRemarkEmitter.cpp [BPI][NFC] Reuse post dominantor tree from analysis manager when available 2020-04-30 11:31:03 +07:00
PHITransAddr.cpp
PhiValues.cpp [PhiValues] Remove redundant map searches 2019-11-23 10:32:56 +02:00
PostDominators.cpp [CodeMoverUtils] Added an API to check if an instruction can be safely 2019-11-22 21:29:08 +00:00
ProfileSummaryInfo.cpp [NFC] Change getEntryForPercentile to be a static function in ProfileSummaryBuilder. 2020-07-09 16:38:19 -07:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
RegionPass.cpp
RegionPrinter.cpp [CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo 2020-04-06 17:42:54 +00:00
ReleaseModeModelRunner.cpp [llvm] Release-mode ML InlineAdvisor 2020-06-24 08:18:42 -07:00
ScalarEvolution.cpp [SCEV] Fix ScalarEvolution tests under NPM 2020-07-16 11:24:07 -07:00
ScalarEvolutionAliasAnalysis.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
ScalarEvolutionDivision.cpp [NFCI] SCEV: promote ScalarEvolutionDivision into an publicly usable class 2020-06-25 00:58:53 +03:00
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
StackLifetime.cpp [StackSafety,NFC] Don't rerun on LiveIn change 2020-06-19 21:29:31 -07:00
StackSafetyAnalysis.cpp StackSafetyAnalysis.cpp - pass ConstantRange arg as const reference. 2020-07-10 12:13:34 +01:00
StratifiedSets.h
SyncDependenceAnalysis.cpp [CFG/BasicBlock] Rename succ_const to const_succ. [NFC] 2020-03-25 12:40:55 -07:00
SyntheticCountsUtils.cpp [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
TFUtils.cpp [llvm][NFC] Hide the tensorflow dependency from headers. 2020-07-14 21:14:11 -07:00
TargetLibraryInfo.cpp [LLVM] Add libatomic load/store functions to TargetLibraryInfo 2020-07-18 03:18:48 +00:00
TargetTransformInfo.cpp [NFC] Separate Peeling Properties into its own struct (re-land after minor fix) 2020-07-10 18:39:30 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp [Metadata] Add TBAA struct metadata to `AAMDNode` 2020-01-06 11:05:15 +03:00
TypeMetadataUtils.cpp TypeMetadataUtils.h - reduce Instructions.h include to forward declaration. NFC. 2020-06-05 17:40:33 +01:00
VFABIDemangling.cpp [VFABI] Fix parsing of uniform parameters that shouldn't expect step or positional data. 2020-05-27 16:07:45 +00:00
ValueLattice.cpp [ValueLattice] Distinguish between constant ranges with/without undef. 2020-03-31 12:50:20 +01:00
ValueLatticeUtils.cpp [ValueLattice] Simplify canTrackGlobalVariableInterprocedurally (NFC). 2020-07-09 18:33:09 +01:00
ValueTracking.cpp [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison use canCreateUndefOrPoison 2020-07-20 09:21:39 +09:00
VectorUtils.cpp [FPEnv] Intrinsic llvm.roundeven 2020-05-26 19:24:58 +07:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
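
As a sanity check of the claim above, here is a small sketch that steps the
recurrence and compares it with both forms. It is not ScalarEvolution code.
The function names are made up for illustration, it assumes the exit value
is taken at iteration %n - 1 (inferred from the quoted expression), and it
only covers small n where the i65 intermediate fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Step the add recurrence {1,+,3,+,2}: the value starts at 1, its first
// difference starts at 3, and that difference grows by 2 on every trip.
uint64_t addRecAt(uint64_t iteration) {
  uint64_t value = 1, step = 3;
  for (uint64_t i = 0; i < iteration; ++i) {
    value += step;
    step += 2;
  }
  return value;
}

// The expanded form quoted above, restricted to small n >= 2 so that the
// i65 product can be modeled with plain 64-bit arithmetic.
uint64_t expandedExitValue(uint64_t n) {
  assert(n >= 2);
  uint64_t product = (n - 2) * (n - 1);
  return 2 * (product / 2) + 3 * n - 2;
}

// True when both forms agree with n * n for every n in [2, limit].
bool exitValueIsSquare(uint64_t limit) {
  for (uint64_t n = 2; n <= limit; ++n)
    if (addRecAt(n - 1) != n * n || expandedExitValue(n) != n * n)
      return false;
  return true;
}
```

The product (n - 2) * (n - 1) is always even, so the divide-by-2 followed by
the multiply-by-2 cancels exactly in this range.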

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
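
The fold relies on the first two terms cancelling in i32 arithmetic. A
minimal sketch of that identity, using hypothetical helper names and
unsigned wraparound to model i64 and i32 (the undef term is left out, since
it is simply carried through unchanged):

```cpp
#include <cassert>
#include <cstdint>

// Model of `trunc i64 %v to i32`: keep only the low 32 bits.
uint32_t truncToI32(uint64_t v) { return static_cast<uint32_t>(v); }

// (trunc i64 (-1 * x) to i32) + (trunc i64 x to i32), modulo 2^32.
uint32_t sumOfTruncs(uint64_t x) {
  uint64_t negated = 0 - x; // -1 * x modulo 2^64
  return truncToI32(negated) + truncToI32(x);
}

// Asserts that the two truncated terms cancel for the given input.
void checkFold(uint64_t x) { assert(sumOfTruncs(x) == 0); }
```

Truncation commutes with negation and addition modulo 2^32, so the sum is
zero for every %arg5 and only the undef term remains.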

//===---------------------------------------------------------------------===//