with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same ones used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal; this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
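To make the aggregation idea concrete, here is a minimal, self-contained
sketch of a type-erased results aggregate. It is not the code from this
patch: AliasKind, AggregatedAA, SameNameAA and DistinctGlobalsAA are
hypothetical names, and memory locations are modeled as plain strings. It
assumes the aggregate returns the first answer that is more precise than
"may alias".

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified result lattice (not LLVM's real type).
enum class AliasKind { NoAlias, MayAlias, MustAlias };

// Type-erasing aggregate: every registered analysis is stored behind a
// std::function, so the analyses need no common base class.
class AggregatedAA {
public:
  // Registers any object that has a compatible alias() member.
  // The caller must keep AA alive for as long as the aggregate is used.
  template <typename AnalysisT> void addResult(AnalysisT &AA) {
    Queries.push_back([&AA](const std::string &A, const std::string &B) {
      return AA.alias(A, B);
    });
  }

  // Walks a single query across all registered results, in registration
  // order, and returns the first answer other than MayAlias.
  AliasKind alias(const std::string &A, const std::string &B) const {
    for (const QueryFn &Q : Queries) {
      AliasKind R = Q(A, B);
      if (R != AliasKind::MayAlias)
        return R;
    }
    return AliasKind::MayAlias;
  }

private:
  using QueryFn =
      std::function<AliasKind(const std::string &, const std::string &)>;
  std::vector<QueryFn> Queries;
};

// Toy analysis: identical names must alias.
struct SameNameAA {
  AliasKind alias(const std::string &A, const std::string &B) const {
    return A == B ? AliasKind::MustAlias : AliasKind::MayAlias;
  }
};

// Toy analysis: two distinct names starting with '@' never alias.
struct DistinctGlobalsAA {
  AliasKind alias(const std::string &A, const std::string &B) const {
    bool BothGlobals = !A.empty() && !B.empty() && A[0] == '@' && B[0] == '@';
    return (BothGlobals && A != B) ? AliasKind::NoAlias : AliasKind::MayAlias;
  }
};
```

Adding another analysis to the aggregate is then a single addResult() call,
which mirrors how extra AA passes in the pipeline are meant to get folded
into the results object.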
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule is LoopPass instances, which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loopholes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
This introduces a check that the MBBIndexList is sorted as proposed in
http://reviews.llvm.org/D12443 but split up into a separate commit.
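A sortedness check of this kind can be sketched as follows. This is an
illustration only: IdxBlockPair is a hypothetical stand-in that uses plain
integers for the index and the block, and the real check in the patch may
be written differently.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical stand-in for an (index, basic block) entry; both halves are
// plain integers here to keep the sketch self-contained.
using IdxBlockPair = std::pair<unsigned, int>;

// Returns true if the list is ordered by its index component.
bool isIndexListSorted(const std::vector<IdxBlockPair> &List) {
  return std::is_sorted(List.begin(), List.end(),
                        [](const IdxBlockPair &L, const IdxBlockPair &R) {
                          return L.first < R.first;
                        });
}
```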
llvm-svn: 247166
Instead of extracting both 32-bit components from the 128-bit
register. This produces fewer copies and is easier for
the copy peephole optimizer to understand and see the actual uses
as extracts from a reg_sequence.
This avoids needing to handle subregister composing in the
PeepholeOptimizer's ValueTracker for this case.
llvm-svn: 247162
Summary:
This can be used for distinguishing between cmake and autoconf builds.
Users may need this in order to handle inconsistencies between the
outputs of the two build systems.
Reviewers: echristo, chandlerc, beanz
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11838
llvm-svn: 247159
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12256
llvm-svn: 247157
Summary:
We are not scalarizing the wide selects in codegen for i16 and i32 and
therefore we can remove the amortization factor. We still have issues
with i64 vectors in codegen though.
Reviewers: mcrosier
Subscribers: mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D12724
llvm-svn: 247156
With this patch we create a dynamic string table (it is allocated, unlike
the regular one) and the dynamic section has a DT_STRTAB pointing to it.
llvm-svn: 247155
Summary:
Cross-compilation uses recursive cmake invocations to build native host
tools. These recursive invocations only forward a fixed set of
variables/options, since the native environment is generally the default.
This change adds -DLLVM_TARGET_IS_CROSSCOMPILE_HOST=TRUE to the recursive
cmake invocations, so that cmake files can distinguish these recursive
invocations from top-level ones, which can explain why expected options
are unset.
LLILC will use this to avoid trying to generate its build rules in the
crosscompile native host target (where it is not needed), which would fail
if attempted because LLILC requires a cmake variable passed on the command
line, which is not forwarded in the recursive invocation.
Reviewers: rnk, beanz
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12679
llvm-svn: 247151
The support for pointer expressions is broken as it can only handle
some patterns in the IslExprBuilder. We should treat pointers in
expressions the same as integers at some point and revert this patch.
llvm-svn: 247147
Add a custom makefile rule to generate lib/External/isl/gitversion.h
from GIT_HEAD_ID and trigger it using BUILT_SOURCES to ensure the file
exists before compilation starts.
The latest ISL creates gitversion.h from Makefile.am only, instead of also
from configure.ac as in previous versions. While the Polly build invokes
configure, it does not invoke ISL's make, so the file was missing.
Invoking ISL's make would come with additional problems, such as
triggering automake because file time stamps are not preserved.
Re-running automake might not succeed on other system
configurations, for instance because it was preconfigured without the
--with-clang option.
llvm-svn: 247142
Predicating stores requires creating extra blocks. It's much cleaner if we
do this in one pass instead of mutating the CFG while writing vector
instructions. Besides, we can make use of helper functions to update the
domtree for us, reducing the work we need to do.
llvm-svn: 247139
The tests in test/CodeGen/arm-target-features.c are currently
passing but warning messages are suppressed. These tests are now
synchronized with the corresponding changes in Target Parser.
This patch will fix the regressions in clang caused by r247136
Differential Revision: http://reviews.llvm.org/D12722
llvm-svn: 247138
Removed "cortex-r5f" and "cortex-m4f" from Target Parser, since they are
unknown cpu names for llvm and clang. Also updated default FPUs for R5 and M4
accordingly.
Differential Revision: http://reviews.llvm.org/D12692
Change-Id: Ib81c7216521a361d8ee1296e4b6a2aa00bd479c5
llvm-svn: 247136
In some special cases (e.g. signal handlers, hand-written assembly) it is
valid to have 2 stack frames with the same CFA value. This CL changes the
looping stack detection code to report a loop only if at least 3
consecutive frames have the same CFA.
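The rule can be sketched as a small predicate over the CFA values of the
frames unwound so far. The function name and the vector-of-CFAs input are
assumptions made for illustration; the actual unwinder code is structured
differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CFAs holds the canonical frame address of each frame unwound so far,
// most recent last. Reports a loop only when the last three frames share
// the same CFA; two equal CFAs in a row are tolerated.
bool looksLikeUnwindLoop(const std::vector<uint64_t> &CFAs) {
  const std::size_t N = CFAs.size();
  if (N < 3)
    return false;
  return CFAs[N - 1] == CFAs[N - 2] && CFAs[N - 2] == CFAs[N - 3];
}
```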
Differential revision: http://reviews.llvm.org/D12699
llvm-svn: 247133
* Create new dwo symbol file class
* Add handling for .dwo sections
* Change indexes in SymbolFileDWARF to store compile unit offset next to
DIE offset
* Propagate queries from dwarf compile unit to the dwo compile unit
where applicable
Differential revision: http://reviews.llvm.org/D12291
llvm-svn: 247132
* Remove some unused code
* Remove usage of DWARFDebugInfoEntry::Attributes where usage isn't
reasonable
* Clean up DWARFMappedHash by separating it into header and implementation
files and fixing the visibility of the functions
Differential revision: http://reviews.llvm.org/D12374
llvm-svn: 247131
Summary:
One of the vector splitting paths for extract_vector_elt tries to lower:
  define i1 @via_stack_bug(i8 signext %idx) {
    %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx
    ret i1 %1
  }
to:
  define i1 @via_stack_bug(i8 signext %idx) {
    %base = alloca <2 x i1>
    store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base
    %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx
    %3 = load i1, i1* %2
    ret i1 %3
  }
However, the elements of <2 x i1> are not byte-addressable. The result of this
is that the getelementptr expands to '%base + %idx * (1 / 8)', which simplifies
to '%base + %idx * 0' and then simply '%base', causing all values of %idx to
extract element zero.
This commit fixes this by promoting vector elements narrower than 8 bits to i8
before splitting the vector.
This fixes a number of test failures in pocl.
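The offset arithmetic behind the bug can be checked with a few lines of
integer math. This is a worked illustration, not code from the patch:
naiveByteOffset and promotedElementBits are hypothetical helpers that model
the '%idx * (bits / 8)' computation and the promotion to i8.

```cpp
#include <cstdint>

// Byte offset of element Idx when the element size in bytes is computed
// with integer division, as in '%idx * (bits / 8)'.
uint64_t naiveByteOffset(uint64_t Idx, unsigned ElementBits) {
  return Idx * (ElementBits / 8); // For i1: 1 / 8 == 0, so always 0.
}

// Element width after promoting sub-byte elements to 8 bits.
unsigned promotedElementBits(unsigned ElementBits) {
  return ElementBits < 8 ? 8 : ElementBits;
}
```

With i1 elements, naiveByteOffset(1, 1) is 0, so index 1 reads element
zero; after promotion, naiveByteOffset(1, promotedElementBits(1)) is 1.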
Reviewers: pekka.jaaskelainen
Subscribers: pekka.jaaskelainen, llvm-commits
Differential Revision: http://reviews.llvm.org/D12591
llvm-svn: 247128
Not all users of our IslNodeBuilder will attach scheduling information to the
AST in the same way IslAstInfo is doing it today. By going through a virtual
function when extracting the schedule of an AST node other users can provide
their own functions for extracting scheduling information in case they attach
scheduling information in a different way to the AST nodes.
No functional change for Polly itself intended.
llvm-svn: 247126
Summary:
When lldb is processing a location containing DW_OP_piece, the result is being
stored in the 'pieces' variable. The location is popped from the 'stack' variable.
So this check to see that 'stack' is not empty was invalid and caused the pieces
after the first to not get processed.
I am working on an architecture which has 16-bit and 8-bit registers, so this
problem was quite easy to see. But I was able to reproduce this issue on x86
too, with a long long variable and compiling with -m32. It resulted in the
following location list:
00000014 08048496 080484b5 (DW_OP_reg6 (esi); DW_OP_piece: 4; DW_OP_reg7 (edi); DW_OP_piece: 4)
and lldb was only showing the contents of the first register when I evaluated
the variable, as it does not process the 2nd piece due to this check.
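The following toy evaluator sketches why a "stack is not empty" check is
the wrong test once pieces are involved. It is a simplified model and not
lldb's evaluator: Op, evaluatePieces and the 32-bit little-endian register
values are assumptions for illustration. Each piece pops its value off the
stack and appends bytes to the pieces buffer, so the stack is empty again
after every piece.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-opcode model: Reg pushes a register value, Piece pops
// the top of the stack and appends Arg bytes of it to the result.
struct Op {
  enum Kind { Reg, Piece } K;
  uint32_t Arg; // Register number for Reg, byte size for Piece.
};

std::vector<uint8_t> evaluatePieces(const std::vector<Op> &Ops,
                                    const std::vector<uint32_t> &Regs) {
  std::vector<uint32_t> Stack;
  std::vector<uint8_t> Pieces;
  for (const Op &O : Ops) {
    if (O.K == Op::Reg) {
      Stack.push_back(Regs.at(O.Arg));
      continue;
    }
    // Piece: the value lives on the stack only until it is consumed here.
    if (Stack.empty())
      continue; // Nothing describes this piece; skipped in this sketch.
    uint32_t V = Stack.back();
    Stack.pop_back();
    for (std::size_t I = 0; I < O.Arg && I < sizeof(V); ++I)
      Pieces.push_back(static_cast<uint8_t>((V >> (8 * I)) & 0xff));
  }
  // The stack is empty here even though the evaluation succeeded; the
  // result must be taken from Pieces.
  return Pieces;
}
```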
Reviewers: clayborg, aprantl
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D12674
llvm-svn: 247124
qXfer:features:read:target.xml packet, or via the
plugin.process.gdb-remote.target-definition-file setting, if the
register definition doesn't give us eh_frame or DWARF register
numbers for that register, try to get that information from the ABI
plugin.
The DWARF/eh_frame register numbers are defined in the ABI
standardization documents - so getting this from the ABI plugin is
reasonable. There's little value in having the remote stub inform
us of this generic information, as long as we can all agree on the
names of the registers.
There's some additional information we could get from the ABI. For
instance, on ABIs where function arguments are passed in registers,
lldb defines alternate names like "arg1", "arg2", "arg3" for these
registers so they can be referred to that way by the user. We could
get this from the ABI if the remote stub doesn't provide that. That
may be something worth doing in the future - but for now, I'm keeping
this a little more minimal.
Thinking about this, what we want/need from the remote stub at a
minimum are:
1. The name of the register
2. The number that the stub will use to refer to the register with
the p/P packets and in the ? response packets (T/S) where
expedited register values are provided
3. The size of the register in bytes
(nice to have, to remove any doubt)
4. The offset of the register in the g/G packet if we're going to
use that for reading/writing registers.
debugserver traditionally provides a lot more information in
addition to this via the qRegisterInfo packet, and debugserver
augments its response to the qXfer:features:read:target.xml
query to include this information. Including:
DWARF regnum, eh_frame regnum, stabs regnum, encoding (ieee754,
Uint, Vector, Sint), format (hex, unsigned, pointer, vectorof*,
float), registers that should be marked as invalid if this
register is modified, and registers that contain this register.
We might want to get all of this from the ABI - I'm not convinced
that it makes sense for the remote stub to provide any of these
details, as long as the ABI and remote stub can agree on register
names.
Anyway, start with eh_frame and DWARF coming from the ABI if
they're not provided by the remote stub. We can look at doing
more in the future.
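The fallback can be sketched as below. This is a rough model, not the lldb
implementation: RegInfo, ABINumbers, kInvalidRegNum and augmentFromABI are
hypothetical names, and the ABI plugin is modeled as a simple table keyed
by register name.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Marker for "the remote stub did not provide this number".
const uint32_t kInvalidRegNum = UINT32_MAX;

// Register description as received from the remote stub (simplified).
struct RegInfo {
  std::string Name;
  uint32_t EhFrame;
  uint32_t Dwarf;
};

// Register numbers as defined by the ABI (simplified).
struct ABINumbers {
  uint32_t EhFrame;
  uint32_t Dwarf;
};

// Fills in only the numbers the stub left out, matching by register name.
// Values supplied by the stub are never overwritten.
void augmentFromABI(RegInfo &Reg,
                    const std::map<std::string, ABINumbers> &ABITable) {
  auto It = ABITable.find(Reg.Name);
  if (It == ABITable.end())
    return;
  if (Reg.EhFrame == kInvalidRegNum)
    Reg.EhFrame = It->second.EhFrame;
  if (Reg.Dwarf == kInvalidRegNum)
    Reg.Dwarf = It->second.Dwarf;
}
```

Matching by name is why the stub and the ABI only need to agree on
register names for this to work.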
<rdar://problem/22566585>
llvm-svn: 247121
When lldb receives a gdb-remote protocol packet that has
nonprintable characters, it will print the packet in
gdb-remote logging with binary-hex encoding so we don't
dump random 8-bit characters into the packet log.
I'm changing this check to allow whitespace characters
(newlines, linefeeds, tabs) to be printed if those are
the only non-printable characters in the packet.
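A check along these lines can be sketched with the standard <cctype>
classifiers. The function name is hypothetical, and note that
std::isspace accepts a slightly wider set than the characters listed above
(it also accepts vertical tab and form feed in the default "C" locale).

```cpp
#include <cctype>
#include <string>

// Returns true if the packet contains a byte that is neither printable nor
// whitespace, in which case it should be logged in binary-hex form.
bool needsBinaryHexEncoding(const std::string &Packet) {
  for (unsigned char C : Packet) {
    if (!std::isprint(C) && !std::isspace(C))
      return true;
  }
  return false;
}
```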
This is primarily to get the response to the
qXfer:features:read:target.xml packet to show up in the
packet logs in human readable form. Right now we just
get a dozen kilobytes of hex-ascii and it's hard to
figure out what register number scheme is being used.
llvm-svn: 247120