have their own custom memcpy lowering code. This code needs to be factored out
into a target-independent lowering method with hooks to the backend. In the
meantime, just call memcpy if we're trying to copy onto a stack.
llvm-svn: 43262
- Avoid attempting stride-reuse in the case that there are users that
aren't addresses. In that case, there will be places where the
multiplications won't be folded away, so it's better to try to
strength-reduce them.
- Several SSE intrinsics have operands that strength-reduction can
treat as addresses. The previous item makes this more visible, as
any non-address use of an IV can inhibit stride-reuse.
- Make ValidStride aware of whether there's likely to be a base
register in the address computation. This prevents it from thinking
that things like stride 9 are valid on x86 when the base register is
already occupied.
Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid
stride-reuse elimintes the LEA in the loop, so the test is no longer
testing what it was intended to test.
llvm-svn: 43231
operations so they work right for integers with funky
bit-widths. For example, consider extending i48 to i64
on a 32 bit machine. The i64 result is expanded to 2 x i32.
We know that the i48 operand will be promoted to i64, then
also expanded to 2 x i32. If we had the expanded promoted
operand to hand, then expanding the result would be trivial.
Unfortunately at this stage we can only get hold of the
promoted operand. So instead we kind of hand-expand, doing
explicit shifting and truncating to get the top and bottom
halves of the i64 operand into 2 x i32, which are then used
to expand the result. This is harmless, because when the
promoted operand is finally expanded all this bit fiddling
turns into trivial operations which are eliminated either
by the expansion code itself or the DAG combiner.
llvm-svn: 43223
Turn a store folding instruction into a load folding instruction. e.g.
xorl %edi, %eax
movl %eax, -32(%ebp)
movl -36(%ebp), %eax
orl %eax, -32(%ebp)
=>
xorl %edi, %eax
orl -36(%ebp), %eax
mov %eax, -32(%ebp)
This enables the unfolding optimization for a subsequent instruction which will
also eliminate the newly introduced store instruction.
llvm-svn: 43192
asserts in later checks rather than producing
the ordinary load it is supposed to. Avoid all
such hassles by directly returning an ordinary
load in this case.
llvm-svn: 43174
To do this it is necessary to add a "always inline" argument to the
memcpy node. For completeness I have also added this node to memmove
and memset. I have also added getMem* functions, because the extra
argument makes it cumbersome to use getNode and because I get confused
by it :-)
llvm-svn: 43172
types. This is needed for SIGN_EXTEND_INREG at least.
It is not clear if this is correct for other operations.
On the other hand, for the various load/store actions
it seems to correct to return the type action, as is
currently done.
Also, it seems that SelectionDAG::getValueType can be
called for extended value types; introduce a map for
holding these, since we don't really want to extend
the vector to be 2^32 pointers long!
Generalize DAGTypeLegalizer::PromoteResult_TRUNCATE
and DAGTypeLegalizer::PromoteResult_INT_EXTEND to handle
the various funky possibilities that apints introduce,
for example that you can promote to a type that needs
to be expanded.
llvm-svn: 43071
codegen support. This should have no effect on codegen
for other types. Debatable bits: (1) the use (abuse?)
of a set in SDNode::getValueTypeList; (2) the length of
getTypeToTransformTo, which maybe should be refactored
with a non-inline part for extended value types.
llvm-svn: 43030
was stored to the acutal stack slot before the parameters were
lowered to their stack slot. This could cause arguments to be
overwritten by the return address if the called function had less
parameters than the caller function. The update should remove the
last failing test case of llc-beta: SPASS.
llvm-svn: 43027
unconditionally creating an i64 bitcast. With the future legalizer
design, operation legalization can't introduce new nodes with illegal
types.
This fixes the rest of olden on ppc32.
llvm-svn: 43005
integer conversion. In some such cases this makes us one or two orders
of magnitude faster than NetBSD's libc. Glibc seems to have a similar
fast path.
Also, tighten up some upper bounds to save a bit of memory.
llvm-svn: 42984
getTypeToExpandTo. The difference is that
getTypeToExpandTo gives the final result of expansion
(eg: i128 -> i32 on a 32 bit machine) while
getTypeToTransformTo does just one step (i128 -> i64).
llvm-svn: 42982
take a deleted nodes vector, instead of requiring it.
One more significant change: Implement the start of a legalizer that
just works on types. This legalizer is designed to run before the
operation legalizer and ensure just that the input dag is transformed
into an output dag whose operand and result types are all legal, even
if the operations on those types are not.
This design/impl has the following advantages:
1. When finished, this will *significantly* reduce the amount of code in
LegalizeDAG.cpp. It will remove all the code related to promotion and
expansion as well as splitting and scalarizing vectors.
2. The new code is very simple, idiomatic, and modular: unlike
LegalizeDAG.cpp, it has no 3000 line long functions. :)
3. The implementation is completely iterative instead of recursive, good
for hacking on large dags without blowing out your stack.
4. The implementation updates nodes in place when possible instead of
deallocating and reallocating the entire graph that points to some
mutated node.
5. The code nicely separates out handling of operations with invalid
results from operations with invalid operands, making some cases
simpler and easier to understand.
6. The new -debug-only=legalize-types option is very very handy :),
allowing you to easily understand what legalize types is doing.
This is not yet done. Until the ifdef added to SelectionDAGISel.cpp is
enabled, this does nothing. However, this code is sufficient to legalize
all of the code in 186.crafty, olden and freebench on an x86 machine. The
biggest issues are:
1. Vectors aren't implemented at all yet
2. SoftFP is a mess, I need to talk to Evan about it.
3. No lowering to libcalls is implemented yet.
4. Various operations are missing etc.
5. There are FIXME's for stuff I hax0r'd out, like softfp.
Hey, at least it is a step in the right direction :). If you'd like to help,
just enable the #ifdef in SelectionDAGISel.cpp and compile code with it. If
this explodes it will tell you what needs to be implemented. Help is
certainly appreciated.
Once this goes in, we can do three things:
1. Add a new pass of dag combine between the "type legalizer" and "operation
legalizer" passes. This will let us catch some long-standing isel issues
that we miss because operation legalization often obfuscates the dag with
target-specific nodes.
2. We can rip out all of the type legalization code from LegalizeDAG.cpp,
making it much smaller and simpler. When that happens we can then
reimplement the core functionality left in it in a much more efficient and
non-recursive way.
3. Once the whole legalizer is non-recursive, we can implement whole-function
selectiondags maybe...
llvm-svn: 42981
Make two changes:
1) only xform "store of f32" if i32 is a legal type for the target.
2) only xform "store of f64" if either i64 or i32 are legal for the target.
3) if i64 isn't legal, manually lower to 2 stores of i32 instead of letting a
later pass of legalize do it. This is ugly, but helps future changes I'm
about to commit.
llvm-svn: 42980
the source register will be coalesced to the super register of the LHS. Properly
merge in the live ranges of the resulting coalesced interval that were part of
the original source interval to the live interval of the super-register.
llvm-svn: 42961
Turn this:
movswl %ax, %eax
movl %eax, -36(%ebp)
xorl %edi, -36(%ebp)
into
movswl %ax, %eax
xorl %edi, %eax
movl %eax, -36(%ebp)
by unfolding the load / store xorl into an xorl and a store when we know the
value in the spill slot is available in a register. This doesn't change the
number of instructions but reduce the number of times memory is accessed.
Also unfold some load folding instructions and reuse the value when similar
situation presents itself.
llvm-svn: 42947
from user input strings.
Such conversions are more intricate and subtle than they may appear;
it is unlikely I have got it completely right first time. I would
appreciate being informed of any bugs and incorrect roundings you
might discover.
llvm-svn: 42912