variable declaration as an argument because we want that address
anyhow for our debug information.
This seems to fix rdar://9965111, at least we have more debug
information than before and from reading the assembly it appears
to be the correct location.
llvm-svn: 151335
I'll let the buildbots determine the compile time improvements from this
change, but 464.h264ref has 5% faster codegen at -O2.
This patch does cause some assembly changes. Branch folding can make
different decisions about calls with dead return values.
CriticalAntiDepBreaker may choose different registers because its
liveness tracking is affected. MachineCopyPropagation may sometimes
leave a dead copy behind.
llvm-svn: 151331
The tied source operand of tMUL is the second source operand, not the
first like every other two-address thumb instruction. Special case it
in the size reduction pass to make sure we create the tMUL instruction
properly.
llvm-svn: 151315
In historical reason, Interpreter's external entries had prefix "lle_X_" as C linkage, even for well-known entries in EE/Interpreter.
Now, at least on ToT, they are resolved via FuncNames[] mapper.
We will not need their symbols are expected to be exported any more.
Clang r150128 has introduced the warning <"%0 has C-linkage specified, but returns user-defined type %1 which is incompatible with C">.
llvm-svn: 151312
Assuming that a single std::set node adds 3 control words, a bitvector
can store (3*8+4)*8=224 registers in the allocated memory of a single
element in the std::set (x86_64). Also we don't have to call malloc
for every register added.
llvm-svn: 151269
rdar://10873652
As part of this I updated the llvm-mc disassembler C API to always call the
SymbolLookUp call back even if there is no getOpInfo call back. If there is a
getOpInfo call back that is tried first and then if that gets no information
then the SymbolLookUp is called. I also made the code more robust by
memset(3)'ing to zero the LLVMOpInfo1 struct before then setting
SymbolicOp.Value before for the call to getOpInfo. And also don't use any
values from the LLVMOpInfo1 struct if getOpInfo returns 0. And also don't
use any of the ReferenceType or ReferenceName values from SymbolLookUp if it
returns NULL. rdar://10873563 and rdar://10873683
For the X86 target also fixed bugs so the annotations get printed.
Also fixed a few places in the ARM target that was not producing symbolic
operands for some instructions. rdar://10878166
llvm-svn: 151267
Before register allocation, instructions can be moved across calls in
order to reduce register pressure. After register allocation, we don't
gain a lot by moving callee-saved defs across calls. In fact, since the
scheduler doesn't have a good idea how registers are used in the callee,
it can't really make good scheduling decisions.
This changes the schedule in two ways: 1. Latencies to call uses and
defs are no longer accounted for, causing some random shuffling around
calls. This isn't really a problem since those uses and defs are
inaccurate proxies for what happens inside the callee. They don't
represent registers used by the call instruction itself.
2. Instructions are no longer moved across calls. This didn't happen
very often, and the scheduling decision was made on dubious information
anyway.
As with any scheduling change, benchmark numbers shift around a bit,
but there is no positive or negative trend from this change.
This makes the post-ra scheduler 5% faster for ARM targets.
The secret motivation for this patch is the introduction of register
mask operands representing call clobbers. The most efficient way of
handling regmasks in ScheduleDAGInstrs is to model them as barriers for
physreg live ranges, but not for virtreg live ranges. That's fine
pre-ra, but post-ra it would have the same effect as this patch.
llvm-svn: 151265
it with memcpy. This also fixes a problem on big-endian hosts, where
addUnaligned would return different results depending on the alignment
of the data.
llvm-svn: 151247
value is zero. Instead of a cmov + op, issue an conditional op instead. e.g.
cmp r9, r4
mov r4, #0
moveq r4, #1
orr lr, lr, r4
should be:
cmp r9, r4
orreq lr, lr, #1
That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend
this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y).
It's possible to extend this to ADD and SUB but I don't think they are common.
rdar://8659097
llvm-svn: 151224