Reduces the ARCMT migrator plist writer down to a single function,
arcmt::writeARCDiagsToPlist() which shares supporting functions with the
analyzer plist writer.
llvm-svn: 200075
a FunctionPass. With this change the loop vectorizer no longer is a loop
pass and can readily depend on function analyses. In particular, with
this change we no longer have to form a loop pass manager to run the
loop vectorizer which simplifies the entire pass management of LLVM.
The next step here is to teach the loop vectorizer to leverage profile
information through the profile information providing analysis passes.
llvm-svn: 200074
There are a couple of pieces:
* some lazy-evaluation members that store info listed in a qSupported response
* new method SendPacketsAndConcatenateResponses which is used for
fetching fixed-size objects from the remote gdbserver by using multiple
packets if necessary (first use will be to fetch shared-library XML files).
llvm-svn: 200072
GetU32 and GetU64, to use memcpy to copy bytes into a local buffer instead
of having a (uint64_t *) etc local variable, pointing to the address, and
dereferencing it. If compiled on a CPU where data alignment is required
(e.g. the LDM instruction on armv7) and we try to GetU64 out of a mmap'ed
DWARF file, that 8 byte quantity may not be world aligned and the program
can get an unaligned memory access fault.
<rdar://problem/15849231>
llvm-svn: 200069
the loops in a function, and teach LICM to work in the presance of
LCSSA.
Previously, LCSSA was a loop pass. That made passes requiring it also be
loop passes and unable to depend on function analysis passes easily. It
also caused outer loops to have a different "canonical" form from inner
loops during analysis. Instead, we go into LCSSA form and preserve it
through the loop pass manager run.
Note that this has the same problem as LoopSimplify that prevents
enabling its verification -- loop passes which run at the end of the loop
pass manager and don't preserve these are valid, but the subsequent loop
pass runs of outer loops that do preserve this pass trigger too much
verification and fail because the inner loop no longer verifies.
The other problem this exposed is that LICM was completely unable to
handle LCSSA form. It didn't preserve it and it actually would give up
on moving instructions in many cases when they were used by an LCSSA phi
node. I've taught LICM to support detecting LCSSA-form PHI nodes and to
hoist and sink around them. This may actually let LICM fire
significantly more because we put everything into LCSSA form to rotate
the loop before running LICM. =/ Now LICM should handle that fine and
preserve it correctly. The down side is that LICM has to require LCSSA
in order to preserve it. This is just a fact of life for LCSSA. It's
entirely possible we should completely remove LCSSA from the optimizer.
The test updates are essentially accomodating LCSSA phi nodes in the
output of LICM, and the fact that we now completely sink every
instruction in ashr-crash below the loop bodies prior to unrolling.
With this change, LCSSA is computed only three times in the pass
pipeline. One of them could be removed (and potentially a SCEV run and
a separate LoopPassManager entirely!) if we had a LoopPass variant of
InstCombine that ran InstCombine on the loop body but refused to combine
away LCSSA PHI nodes. Currently, this also prevents loop unrolling from
being in the same loop pass manager is rotate, LICM, and unswitch.
There is one thing that I *really* don't like -- preserving LCSSA in
LICM is quite expensive. We end up having to re-run LCSSA twice for some
loops after LICM runs because LICM can undo LCSSA both in the current
loop and the parent loop. I don't really see good solutions to this
other than to completely move away from LCSSA and using tools like
SSAUpdater instead.
llvm-svn: 200067
right after the space for it is allocated on the stack, instead of trying
to initialize it in all the different places in this method. It's too easy
for another uninitialized code path to sneak in as it is written right now.
llvm-svn: 200066
This commit caused -Woverloaded-virtual warnings. The two new
TargetTransformInfo::getIntImmCost functions were only added to the superclass,
and to the X86 subclass. The other targets were not updated, and the
warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was
hiding the two new getIntImmCost variants.
We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost"
to the various subclasses, or turning it off, but I suspect that it's wrong to
leave the functions unimplemnted in those targets. The default implementations
return TCC_Free, which I don't think is right e.g. for ARM.
llvm-svn: 200058
Previously, string literals were ignored in all logical expressions. This
reduces it to only ignore in logical and expressions.
assert(0 && "error"); // No warning
assert(0 || "error"); // Warn
Fixes PR17565
llvm-svn: 200056
This patch uses a common MipsTargetSteamer interface for both
MipsAsmPrinter and MipsAsmParser for recording default and commandline
driven directives that affect ELF header flags.
It has been noted that the .ll tests affected by this patch belong in
test/Codegen/Mips. I will move them in a separate patch.
Also, a number of directives do not get expressed by AsmPrinter in the
resultant .s assembly such as setting the correct ASI. I have noted this
in the tests and they will be addressed in later patches.
llvm-svn: 200051
The i8 type is not registered with any register class.
This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost.
The code selects the first type associated with register class FPR8,
which happens to be i8.
It uses this type (i8) to get the representative class pointer, which is 0.
It then uses this pointer to access a field, resulting in segmentation fault.
Since i8 type is not being used for printing any neon instruction
we can safely remove it.
llvm-svn: 200046
allow this, and we should warn on it, but it turns out that people were already
relying on this.
We should introduce a -Wgcc-compat warning for this if the attributes are known
to GCC, but we don't currently track enough information about attributes to do
so reliably.
llvm-svn: 200045
It was redundant (since ARM mode is the default) and misleading since
(e.g.) Cortex-A15 would not satisfy the #ifdef but would be in ARM
mode regardless.
llvm-svn: 200043
This change does not affect anything because everybody seems to be using
Object/COFF.h instead. But the definition is not for PE32 but for PE32+,
so fix it anyway.
llvm-svn: 200038
remains ARM mode only, supporting thumb requires explicit it prefixes
for the predicted adds/subs and adjusting the offset computation for the
different block sizes.
llvm-svn: 200035
Retry commit r200022 with a fix for the build bot errors. Constant expressions
have (unlike instructions) module scope use lists and therefore may have users
in different functions. The fix is to simply ignore these out-of-function uses.
llvm-svn: 200034
DAGCombiner::GatherAllAliases, which is only used when AA used is enabled
during DAGCombine, had a fundamentally incorrect assumption for which this
change compensates. GatherAllAliases, which is used to find aliasing
predecessor chain nodes (so that a better chain can be selected for a load or
store to enable subsequent optimizations) assumed that walking up the chain
would always catch all possibly-aliasing loads and stores. This is not true: To
really find all aliases, we also need to search for aliases through the value
operand of a store, etc. Consider the following situation:
Token1 = ...
L1 = load Token1, %52
S1 = store Token1, L1, %51
L2 = load Token1, %52+8
S2 = store Token1, L2, %51+8
Token2 = Token(S1, S2)
L3 = load Token2, %53
S3 = store Token2, L3, %52
L4 = load Token2, %53+8
S4 = store Token2, L4, %52+8
If we search for aliases of S3 (which loads address %52), and we look only
through the chain, then we'll miss the trivial dependence on L1 (which loads
from %52). We then might change all loads and stores to use Token1 as their
chain operand, which could result in copying %53 into %52 before copying
%52 into %51 (which should happen first).
The problem is, however, that searching for such data dependencies can become
expensive, and the cost is not directly related to the chain depth. Instead,
we'll rule out such configurations by insisting that we've visited all chain
users (except for users of the original chain, which is not necessary). When
doing this, we need to look through nodes we don't care about (otherwise,
things like register copies will interfere with trivial use cases).
Unfortunately, I don't have a small test case for this problem. Creating the
underlying situation is not hard (a pair of memcpys will do it), but arranging
for the default instruction schedule to be incorrect is very fragile.
This unbreaks self hosting on PPC64 when using
-mllvm -combiner-global-alias-analysis -mllvm -combiner-alias-analysis.
llvm-svn: 200033
might have a smaller size as compared to the stand-alone type of the base class.
This is possible when the derived class is packed and hence might have smaller
alignment requirement than the base class.
Differential Revision: http://llvm-reviews.chandlerc.com/D2599
llvm-svn: 200031
These transformations obviously won't work for indexed (pre/post-inc) loads and
stores. In practice, I'm not sure there is any benefit to enabling them for
indexed nodes because other transformations that these might enable likely also
won't handle indexed nodes.
I don't have an in-tree test case that hits this problem, but an upcoming bug
fix will make it much more likely.
llvm-svn: 200023
This pass identifies expensive constants to hoist and coalesces them to
better prepare it for SelectionDAG-based code generation. This works around the
limitations of the basic-block-at-a-time approach.
First it scans all instructions for integer constants and calculates its
cost. If the constant can be folded into the instruction (the cost is
TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't
consider it expensive and leave it alone. This is the default behavior and
the default implementation of getIntImmCost will always return TCC_Free.
If the cost is more than TCC_BASIC, then the integer constant can't be folded
into the instruction and it might be beneficial to hoist the constant.
Similar constants are coalesced to reduce register pressure and
materialization code.
When a constant is hoisted, it is also hidden behind a bitcast to force it to
be live-out of the basic block. Otherwise the constant would be just
duplicated and each basic block would have its own copy in the SelectionDAG.
The SelectionDAG recognizes such constants as opaque and doesn't perform
certain transformations on them, which would create a new expensive constant.
This optimization is only applied to integer constants in instructions and
simple (this means not nested) constant cast experessions. For example:
%0 = load i64* inttoptr (i64 big_constant to i64*)
Reviewed by Eric
llvm-svn: 200022