successors) and use inverse depth-first search to traverse the BBs. However,
that doesn't work when the CFG has infinite loops. Simply doing a linear
traversal of all BBs works just fine.
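As a hedged illustration (hypothetical code, not from the patch), a block
inside an infinite loop has no path to any exit, so a traversal rooted at
the exits and walking predecessors can never reach it:

  // Sketch: the loop body below never reaches the return, so an
  // inverse DFS starting from the exit block cannot visit it.
  int spin(int n) {
    if (n < 0)
      return 0;      // exit block: found by inverse DFS
    for (;;)
      ++n;           // infinite loop: invisible from the exit
  }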
rdar://9344645
llvm-svn: 130324
more callee-saved registers and introduce copies. Only allow it if scheduling
a node above calls would end up lessening register pressure.
Call operands also have additional ABI restrictions for register allocation,
so be extra careful with hoisting them above calls.
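A minimal sketch (hypothetical code) of the case the heuristic still allows:
hoisting the multiply above the call replaces two values live across the
call (a and b) with one (t), lessening pressure:

  int g(void);
  int f(int a, int b) {
    // Scheduling 'a * b' above the call leaves only 't' live across
    // g() instead of both 'a' and 'b', so register pressure drops.
    int t = a * b;
    return g() + t;
  }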
rdar://9329627
llvm-svn: 130245
Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the
assembly printer correctly prints the 's' suffix.
Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.
Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
Fixes ARM SBC lowering to check for live carry (potential bug).
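For context, a hedged example of where a live carry matters: 64-bit addition
on a 32-bit Thumb2 target splits into an adds/adc pair, and the second
instruction must see the carry produced by the first:

  #include <stdint.h>
  // The low words are added with ADDS (sets carry) and the high words
  // with ADC (consumes carry); nothing scheduled between the two may
  // clobber the flags.
  uint64_t add64(uint64_t a, uint64_t b) { return a + b; }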
llvm-svn: 130048
Fix bugs exposed by the GCC DejaGnu testsuite:
1. The load may actually be used by a dead instruction, which
would cause an assert.
2. The load may not be used by the current chain of instructions,
and we could move it past a side-effecting instruction. Change
how we process uses to define the problem away.
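A hedged sketch of the second problem (hypothetical code): if the load of *p
were moved below the store through q, and q aliased p, the folded compare
would read the clobbered value:

  // The load of *p must not be moved past the store to *q, since the
  // two pointers may refer to the same byte.
  int f(char *p, char *q) {
    char c = *p;   // the load being folded
    *q = 0;        // side-effecting instruction in between
    return c == 47;
  }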
llvm-svn: 130018
On x86 this allows folding a load into the cmp, greatly reducing register pressure.
movzbl (%rdi), %eax
cmpl $47, %eax
->
cmpb $47, (%rdi)
This shaves 8k off gcc.o on i386. I'll leave applying the patch in README.txt to Chris :)
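For illustration, a hedged guess at C source that produces this pattern
(hypothetical function, not taken from gcc):

  // Compiles to a movzbl/cmpl pair without the patch; with load
  // folding the compare reads the byte directly from memory.
  int is_slash(const unsigned char *p) { return *p == 47; }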
llvm-svn: 130005
This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
uint64_t foo(uint64_t x) { return (x&1) << 42; }
which used to compile into bloated code:
shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a]
movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
andq %rdi, %rax ## encoding: [0x48,0x21,0xf8]
ret ## encoding: [0xc3]
with this patch we can fold the immediate into the and:
andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a]
ret ## encoding: [0xc3]
It's possible to save another byte by using 'andl' instead of 'andq', but I currently see no way of doing
that without making this code even more complicated. See the TODOs in the code.
llvm-svn: 129990
add <rd>, sp, #<imm8>
ldr <rd>, [sp, #<imm8>]
When the offset from sp is a multiple of 4 and in the range of 0-1020.
This saves code size by utilizing 16-bit instructions.
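A minimal sketch (hypothetical code) of where these sp-relative forms show
up: locals live at small offsets from sp, and word-aligned offsets up to
1020 fit the 16-bit encodings:

  // 'use(buf)' materializes the address of a stack slot (add <rd>, sp,
  // #<imm8>) and 'buf[1]' reloads from one (ldr <rd>, [sp, #<imm8>]).
  void use(int *);
  int f(void) {
    int buf[8];
    use(buf);
    return buf[1];
  }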
rdar://9321541
llvm-svn: 129971
This patch depends on the prior fix r129908, which switched to std::find,
rather than std::binary_search, on an unordered array.
Patch by Dan Bailey
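For reference, a minimal sketch of why that switch matters:
std::binary_search requires a sorted range, so its answer on unordered data
is unreliable, while std::find scans linearly and is always correct:

  #include <algorithm>
  // On an unsorted array std::binary_search's precondition is violated
  // and it may miss the element; std::find always locates it.
  bool contains(const int *first, const int *last, int v) {
    return std::find(first, last, v) != last;
  }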
llvm-svn: 129909
used by Clang. To help Clang integration, the PTX target has been split
into two targets: ptx32 and ptx64, depending on the desired pointer size.
- Add GCCBuiltin class to all intrinsics
- Split PTX target into ptx32 and ptx64
llvm-svn: 129851
manually and pass all (now) 4 arguments to the mul libcall. Add a new
ExpandLibCall for just this (copied gratuitously from type legalization).
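A hedged illustration of why four arguments are needed (illustrative math
only, using the GCC/Clang __int128 extension; not the actual libcall's name
or ABI): each wide operand arrives split into lo/hi halves, so both halves
of both operands are passed:

  #include <stdint.h>
  // Illustrative stand-in for a wide-multiply libcall: two operands,
  // each split into lo/hi halves, make four value arguments.
  void mul128(uint64_t alo, uint64_t ahi, uint64_t blo, uint64_t bhi,
              uint64_t *rlo, uint64_t *rhi) {
    unsigned __int128 a = ((unsigned __int128)ahi << 64) | alo;
    unsigned __int128 b = ((unsigned __int128)bhi << 64) | blo;
    unsigned __int128 r = a * b;   // truncated 128-bit product
    *rlo = (uint64_t)r;
    *rhi = (uint64_t)(r >> 64);
  }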
Fixes rdar://9292577
llvm-svn: 129842
- As before, there is a minor semantic change here (evidenced by the test
change) for Darwin triples that have no version component. I debated changing
the default behavior of isOSVersionLT, but decided it made more sense for
triples to be explicit.
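For reference, a hedged sketch of the API in question (llvm::Triple as of
this era; the header path is an assumption):

  #include "llvm/ADT/Triple.h"
  // "x86_64-apple-darwin" carries no version component, while
  // "x86_64-apple-darwin10" is explicit; the comparison is only
  // well-defined for the explicit form.
  bool olderThanDarwin10(const llvm::Triple &T) {
    return T.isOSVersionLT(10);
  }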
llvm-svn: 129805