that push immediate operands of 1, 2, and 4 bytes (extended to the native
register size in each case). The assembly mnemonics are "pushl" and "pushq."
One such instruction appears at the beginning of the "start" function , so this
is essential for accurate disassembly when unwinding."
Patch by Sean Callanan!
llvm-svn: 73407
- Change register allocation hint to a pair of unsigned integers. The hint type is zero (which means prefer the register specified as second part of the pair) or entirely target dependent.
- Allow targets to specify alternative register allocation orders based on allocation hint.
Part 2.
- Use the register allocation hint system to implement more aggressive load / store multiple formation.
- Aggressively form LDRD / STRD. These are formed *before* register allocation. It has to be done this way to shorten live interval of base and offset registers. e.g.
v1025 = LDR v1024, 0
v1026 = LDR v1024, 0
=>
v1025,v1026 = LDRD v1024, 0
If this transformation isn't done before allocation, v1024 will overlap v1025 which means it more difficult to allocate a register pair.
- Even with the register allocation hint, it may not be possible to get the desired allocation. In that case, the post-allocation load / store multiple pass must fix the ldrd / strd instructions. They can either become ldm / stm instructions or back to a pair of ldr / str instructions.
This is work in progress, not yet enabled.
llvm-svn: 73381
consecutive addresses togther. This makes it easier for the post-allocation pass
to form ldm / stm.
This is step 1. We are still missing a lot of ldm / stm opportunities because
of register allocation are not done in the desired order. More enhancements
coming.
llvm-svn: 73291
out of sync with regular cc.
The only difference between the tail call cc and the normal
cc was that one parameter register - R9 - was reserved for
calling functions through a function pointer. After time the
tail call cc has gotten out of sync with the regular cc.
We can use R11 which is also caller saved but not used as
parameter register for potential function pointers and
remove the special tail call cc on x86-64.
llvm-svn: 73233
Emission for globals, using the correct data sections
Function alignment can be computed for each target using TargetELFWriterInfo
Some small fixes
llvm-svn: 73201
on x86 to handle more cases. Fix a bug in said code that would cause it
to read past the end of an object. Rewrite the code in
SelectionDAGLegalize::ExpandBUILD_VECTOR to be a bit more general.
Remove PerformBuildVectorCombine, which is no longer necessary with
these changes. In addition to simplifying the code, with this change,
we can now catch a few more cases of consecutive loads.
llvm-svn: 73012
nodes for vectors with an i16 element type. Add an optimization for
building a vector which is all zeros/undef except for the bottom
element, where the bottom element is an i8 or i16.
llvm-svn: 72988
Update code generator to use this attribute and remove NoImplicitFloat target option.
Update llc to set this attribute when -no-implicit-float command line option is used.
llvm-svn: 72959
build vectors with i64 elements will only appear on 32b x86 before legalize.
Since vector widening occurs during legalize, and produces i64 build_vector
elements, the dag combiner is never run on these before legalize splits them
into 32b elements.
Teach the build_vector dag combine in x86 back end to recognize consecutive
loads producing the low part of the vector.
Convert the two uses of TLI's consecutive load recognizer to pass LoadSDNodes
since that was required implicitly.
Add a testcase for the transform.
Old:
subl $28, %esp
movl 32(%esp), %eax
movl 4(%eax), %ecx
movl %ecx, 4(%esp)
movl (%eax), %eax
movl %eax, (%esp)
movaps (%esp), %xmm0
pmovzxwd %xmm0, %xmm0
movl 36(%esp), %eax
movaps %xmm0, (%eax)
addl $28, %esp
ret
New:
movl 4(%esp), %eax
pmovzxwd (%eax), %xmm0
movl 8(%esp), %eax
movaps %xmm0, (%eax)
ret
llvm-svn: 72957