a trivial dense map. Use this in RewriteSingleStoreAlloca to
avoid aggressively rescanning blocks over and over again. This
fixes PR2925, speeding up mem2reg on the testcase in that bug
from 4.56s to 0.02s in a debug build on my machine.
llvm-svn: 58227
target-independent code to target-specific code. This prevents it
from running on targets that aren't using fast-isel.
In addition to saving compile time, this addresses the problem
that not all targets are prepared for it. In order to use this
pass, all instructions must declare all their fixed uses and
defs of physical registers.
llvm-svn: 58144
variable is moved to the execution engine. The JIT calls the TargetJITInfo
to allocate thread local storage. Currently, only linux/x86 knows how to
allocate thread local global variables.
llvm-svn: 58142
LHS is a foldable load, then LHS and RHS are swapped
and SetCCOpcode is changed to SETUGT. But the later
code is expecting operands to be the wrong way round
for SETUGT, but they are not in this case, resulting
in an inverted compare. The solution is to move the
load normalization before the correction for SETUGT.
This bug was tickled by LegalizeTypes which happened
to legalize the testcase slightly differently to
LegalizeDAG.
llvm-svn: 58092
LoopPass*.
- Although less precise, this means they can be used in clients
without RTTI (who would otherwise need to include LoopPass.h, which
eventually includes things using dynamic_cast). This was the
simplest solution that presented itself, but I am happy to use a
better one if available.
llvm-svn: 58010
assume that i64 has been turned into a BUILD_PAIR
node (when called from LegalizeTypes this hasn't
happened yet) and don't use a vector shuffle mask
with an illegal element type.
llvm-svn: 57972
may return i8, which can result in SELECT nodes for
which the type of the condition is i8, but there are
no patterns for select with i8 condition. Tweak the
LegalizeTypes logic to avoid this as much as possible.
This isn't a real fix because it is still perfectly
possible to end up with such select nodes - CellSPU
needs to be fixed IMHO.
llvm-svn: 57968
that is not of type MVT::i1 in SELECT and SETCC nodes.
Relax the LegalizeTypes SELECT condition promotion
sanity checks to allow other condition types than i1.
llvm-svn: 57966
to have a different type to the vector element
type. This should be fairly harmless because in
the past guys like this were being built all over
the place (and were cleaned up when I added this
check). The reason for relaxing this check is
that it helps LegalizeTypes legalize vector
shuffles: the mask is a BUILD_VECTOR that it is
*not always possible* to legalize while keeping it
a BUILD_VECTOR (vector_shuffle requires the mask
to be a BUILD_VECTOR, as opposed to a vector with
the right vector type). With this check it is even
harder to legalize the mask - turning the check off
means that LegalizeTypes manages to legalize almost
all vector shuffles encountered in practice. The
correct solution is to change vector_shuffle to be a
variadic node with the mask built into it as operands.
While waiting for that change, this hack stops the
problem with vector_shuffle from blocking the turning
on of LegalizeTypes.
llvm-svn: 57965
The same one Apple gcc uses, faster. Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.
llvm-svn: 57926
in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.
Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.
llvm-svn: 57885
Where previously LLVM might emit code like this:
ucomisd %xmm1, %xmm0
setne %al
setp %cl
orb %al, %cl
jne .LBB4_2
it now emits this:
ucomisd %xmm1, %xmm0
jne .LBB4_2
jp .LBB4_2
It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.
To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replace them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.
Also, X86InstrInfo.cpp's branch analysis and transform code now
knows now to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.
llvm-svn: 57873
the copy instruction from the instruction list before asking the
target to create the new instruction. This gets the old instruction
out of the way so that it doesn't interfere with the target's
rematerialization code. In the case of x86, this helps it find
more cases where EFLAGS is not live.
Also, in the X86InstrInfo.cpp, teach isSafeToClobberEFLAGS to check
to see if it reached the end of the block after scanning each
instruction, instead of just before. This lets it notice when the
end of the block is only two instructions away, without doing any
additional scanning.
These changes allow rematerialization to clobber EFLAGS in more
cases, for example using xor instead of mov to set the return value
to zero in the included testcase.
llvm-svn: 57872
for strange asm conditions earlier. In this case, we have a
double being passed in an integer reg class. Convert to like
sized integer register so that we allocate the right number
for the class (two i32's for the f64 in this case).
llvm-svn: 57862
is re-written by the callback to branch directly to the compiled code
in future invocations.
Added back in range-based memory permission functions for the updating of
the stub on Darwin.
llvm-svn: 57846
result type when the result type is legal but
not the operand type. Add additional support
for EXTRACT_SUBVECTOR and CONCAT_VECTORS,
needed to handle such cases.
llvm-svn: 57840
LowerOperation if it doesn't know what else to do.
This methods should probably be factorized some,
but this is good enough for the moment. Have
LowerATOMIC_BINARY_64 use EXTRACT_ELEMENT rather
than assuming the operand is a BUILD_PAIR (if it
is then getNode will automagically simplify the
EXTRACT_ELEMENT). This way LowerATOMIC_BINARY_64
usable from LegalizeTypes.
llvm-svn: 57831
elements. Otherwise LegalizeTypes will, reasonably
enough, legalize the mask, which may result in it
no longer being a BUILD_VECTOR node (LegalizeDAG
simply ignores the legality or not of vector masks).
llvm-svn: 57782
the previous patch this one actually passes make check.
"Fix PR2356 on PowerPC: if we have an input and output that are tied together
that have different sizes (e.g. i32 and i64) make sure to reserve registers for
the bigger operand."
llvm-svn: 57771
and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)
This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.
This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.
Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.
The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.
llvm-svn: 57748
have an unreachable block in a function. This was triggering the assert. This is
a horrid hack to cover this up.
Oh! for a good debug info architecture!
llvm-svn: 57714
in 32-bit mode instead of assigning a register pair. This has nothing to
do with PR2356, but I happened to notice it while working on it.
llvm-svn: 57704