had gotten out of sync: isCastable didn't think it was possible to
cast the x86_mmx type to anything, while it did think it possible
to cast an i64 to x86_mmx.
llvm-svn: 128705
all LDR/STR changes and left them to a future patch. Passing all
checks now.
- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
fix the encoding wherever is possible.
- Add a new encoding bit to describe the index mode used and teach
printAddrMode2Operand to check by the addressing mode which index
mode to print.
- Testcases
llvm-svn: 128689
This way, shrinkToUses() will ignore the instruction that is about to be
deleted, and we avoid leaving invalid live ranges that SplitKit doesn't like.
Fix a misunderstanding in MachineVerifier about <def,undef> operands. The
<undef> flag is valid on def operands where it has the same meaning as <undef>
on a use operand. It only applies to sub-register defines which also read the
full register.
llvm-svn: 128642
- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
{STR,LDC}{2}_{PRE,POST} fixing the encoding wherever is possible.
- Move all instructions which use am2offset without a pattern to use
addrmode2.
- Add a new encoding bit to describe the index mode used and teach
printAddrMode2Operand to check by the addressing mode which index
mode to print.
- Testcases
llvm-svn: 128632
We don't expect the real "powf()" on some hosts (and powf() would be available on other hosts).
For consistency, std::pow(double,double) may be called instead.
Or, precision issue might attack us, to see unstable regalloc and stack coloring.
llvm-svn: 128629
The rematerialized instruction may require a more constrained register class
than the register being spilled. In the test case, the spilled register has been
inflated to the DPR register class, but we are rematerializing a load of the
ssub_0 sub-register which only exists for DPR_VFP2 registers.
The register class is reinflated after spilling, so the conservative choice is
only temporary.
llvm-svn: 128610
{STR,LDC}{2}_PRE.
- Fixed the encoding in some places.
- Some of those instructions were using am2offset and now use addrmode2.
Codegen isn't affected, instructions which use SelectAddrMode2Offset were not
touched.
- Teach printAddrMode2Operand to check by the addressing mode which index
mode to print.
- This is a work in progress, more work to come. The idea is to change places
which use am2offset to use addrmode2 instead, as to unify assembly parser.
- Add testcases for assembly parser
llvm-svn: 128585
that one of the numbers is signed while the other is unsigned. This could lead
to a wrong result when the signed was promoted to an unsigned int.
* Add the data layout line to the testcase so that it will test the appropriate
thing.
Patch by David Terei!
llvm-svn: 128577
StringMap was not properly updating NumTombstones after a clear or rehash.
This was not fatal until now because the table was growing faster than
NumTombstones could, but with the previous change of preventing infinite
growth of the table the invariant (NumItems + NumTombstones <= NumBuckets)
stopped being observed, causing infinite loops in certain situations.
Patch by José Fonseca!
llvm-svn: 128567
When the hash function uses object pointers all free entries eventually
become tombstones as they are used at least once, regardless of the size.
DenseMap cannot function with zero empty keys, so it double size to get
get ridof the tombstones.
However DenseMap never shrinks automatically unless it is cleared, so
the net result is that certain tables grow infinitely.
The solution is to make a fresh copy of the table without tombstones
instead of doubling size, by simply calling grow with the current size.
Patch by José Fonseca!
llvm-svn: 128564
The rewriter can keep track of multiple stack slots in the same register if they
happen to have the same value. When an instruction modifies a stack slot by
defining a register that is mapped to a stack slot, other stack slots in that
register are no longer valid.
This is a very rare problem, and I don't have a simple test case. I get the
impression that VirtRegRewriter knows it is about to be deleted, inventing a
last opaque problem.
<rdar://problem/9204040>
llvm-svn: 128562
The idea is, that if an ieee 754 float is divided by a power of two, we can
turn the division into a cheaper multiplication. This function sees if we can
get an exact multiplicative inverse for a divisor and returns it if possible.
This is the hard part of PR9587.
I tested many inputs against llvm-gcc's frotend implementation of this
optimization and didn't find any difference. However, floating point is the
land of weird edge cases, so any review would be appreciated.
llvm-svn: 128545
When DCE clones a live range because it separates into connected components,
make sure that the clones enter the same register allocator stage as the
register they were cloned from.
For instance, clones may be split even when they where created during spilling.
Other registers created during spilling are not candidates for splitting or even
(re-)spilling.
llvm-svn: 128524
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.
Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.
First part of rdar://8832507, rdar://9203134
llvm-svn: 128502
- Also emit a list of packages and groups sorted by name
- Avoid iterating over DenseSet so that the output of the arrays is deterministic.
llvm-svn: 128489
unnecesary conditionals and introduced a new convenience function.
The problem was that the list of libraries for Clang's unit tests was
<clang libraries> <system libraries> <llvm libraries>. As the llvm
libraries references symbols defined on the system libraries, those
were reported as undefined.
llvm-svn: 128484
The instruction to be rematerialized may not be the one defining the register
that is being spilled. The traceSiblingValue() function sees through sibling
copies to find the remat candidate.
llvm-svn: 128449
isel lowering to fold the zero-extend's and take advantage of no-stall
back to back vmul + vmla:
vmull q0, d4, d6
vmlal q0, d5, d6
is faster than
vaddl q0, d4, d5
vmovl q1, d6
vmul q0, q0, q1
This allows us to vmull + vmlal for:
f = vmull_u8( vget_high_u8(s), c);
f = vmlal_u8(f, vget_low_u8(s), c);
rdar://9197392
llvm-svn: 128444
otool(1), this time with the needed fix for case sensitive file systems :) .
This is a work in progress as the interface for producing symbolic operands is
not done. But a hacked prototype using information from the object file's
relocation entiries and replacing immediate operands with MCExpr's has been
shown to work with no changes to the instrucion printer. These APIs will be
moved into a dynamic library at some point.
llvm-svn: 128415
The reassignment phase was able to move interference with a higher spill weight,
but it didn't happen very often and it was fairly expensive.
The existing interference eviction picks up the slack.
llvm-svn: 128397
removes one use of X which helps it pass the many hasOneUse() checks.
In my analysis, this turns up very often where X = A >>exact B and that can't be
simplified unless X has one use (except by increasing the lifetime of A which is
generally a performance loss).
llvm-svn: 128373
The main register class may have been inflated by live range splitting, so that
register class is not necessarily valid for the snippet instructions.
Use the original register class for the stack slot interval.
llvm-svn: 128351
It couldn't be used outside of the file because SDISelAsmOperandInfo
is local to SelectionDAGBuilder.cpp. Making it a static function avoids
a weird linkage dance.
llvm-svn: 128342
There are two ways that a later store can comletely overlap a previous store:
1. They both start at the same offset, but the earlier store's size is <= the
later's size, or
2. The earlier store's offset is > the later's offset, but it's offset + size
doesn't extend past the later's offset + size.
llvm-svn: 128332
Correctly terminate the range of register DBG_VALUEs when the register is
clobbered or when the basic block ends.
The code is now ready to deal with variables that are sometimes in a register
and sometimes on the stack. We just need to teach emitDebugLoc to say 'stack
slot'.
llvm-svn: 128327
masks to match inversely for the code as is to work. For the example given
we actually want:
bfi r0, r2, #1, #1
not #0, however, given the way the pattern is written it's not possible
at the moment.
Fixes rdar://9177502
llvm-svn: 128320
This is a work in progress as the interface for producing symbolic operands is
not done. But a hacked prototype using information from the object file's
relocation entiries and replacing immediate operands with MCExpr's has been
shown to work with no changes to the instrucion printer. These APIs will be
moved into a dynamic library at some point.
llvm-svn: 128308