is equivalent to any other relevant value; it isn't true in general.
If it is equivalent, the LoopPromoter will tell the AST the equivalence.
Also, delete the PreheaderLoad if it is unused.
Chris, since you were the last one to make major changes here, can you check
that this is sane?
llvm-svn: 129049
Since these "Advanced SIMD and VFP" instructions have more specfic encoding bits
specified, if coproc == 10 or 11, we should reject the insn as invalid.
rdar://problem/9239922
rdar://problem/9239596
llvm-svn: 129027
About 90% of the relevant blocks are live-through without uses, and the only
information required about them is their number. This saves memory and enables
later optimizations that need to look at only the use-blocks.
llvm-svn: 128985
Add more complete sanity check for LdStFrm instructions where if IBit (Inst{25})
is 1, Inst{4} should be 0. Otherwise, we should reject the insn as invalid.
rdar://problem/9239347
rdar://problem/9239467
llvm-svn: 128977
Start teaching the runtime Dyld interface to use the memory manager API
for allocating space. Rather than mapping directly into the MachO object,
we extract the payload for each object and copy it into a dedicated buffer
allocated via the memory manager. For now, just do Segment64, so this works
on x86_64, but not yet on ARM.
llvm-svn: 128973
Qd -> bit[12] == 0
Qn -> bit[16] == 0
Qm -> bit[0] == 0
If one of these bits is 1, the instruction is UNDEFINED.
rdar://problem/9238399
rdar://problem/9238445
llvm-svn: 128949
For register-controlled shifts, we should check that the encoding constraint
Inst{7} = 0 and Inst{4} = 1 is satisfied.
rdar://problem/9237693
llvm-svn: 128941
developers can see if their driver changed any cl::Option's. The
current implementation isn't perfect but handles most kinds of
options. This is nice to have when decomposing the stages of
compilation and moving between different drivers. It's also a good
sanity check when comparing results produced by different command line
invocations that are expected to produce the comparable results.
Note: This is not an attempt to prolong the life of cl::Option. On the
contrary, it's a placeholder for a feature that must exist when
cl::Option is replaced by a more appropriate framework. A new
framework needs: a central option registry, dynamic name lookup,
non-global containers of option values (e.g. per-module,
per-function), *and* the ability to print options values and their defaults at
any point during compilation.
llvm-svn: 128910
getEDInfo(), in which case this code would dereference
NULL. EDInst can already handle NULL info, so avoid
the dereference and pass NULL through.
Reviewed by Sean Callanan
llvm-svn: 128904
An alternative syntax is available for a modified immediate constant that permits the programmer to specify
the encoding directly. In this syntax, #<const> is instead written as #<byte>,#<rot>, where:
<byte> is the numeric value of abcdefgh, in the range 0-255
<rot> is twice the numeric value of rotation, an even number in the range 0-30.
llvm-svn: 128897
There can be multiple defs for a single virtual register when they are defining
sub-registers.
The missing <dead> flag was stopping the inline spiller from eliminating dead
code after rematerialization.
llvm-svn: 128888
This allows us to always keep the smaller slot for an instruction which is what
we want when a register has early clobber defines.
Drop the UsingInstrs set and the UsingBlocks map. They are no longer needed.
llvm-svn: 128886
space info. We crash with an assert in this case. This change checks that the
address space of the bitcasted pointer is the same as the gep ptr.
llvm-svn: 128884
inlined path for the common case.
Most basic blocks don't contain a call that may throw, so the last split point
os simply the first terminator.
llvm-svn: 128874
It needed to be moved closer to the setjmp statement, because the code directly
after the setjmp needs to know about values that are on the stack. Also, the
'bitcast' of the function context was causing a dead load. This wouldn't be too
horrible, except that at -O0 it wasn't optimized out, and because it wasn't
using the correct base pointer (if there is a VLA), it would try to access a
value from a garbage address.
<rdar://problem/9130540>
llvm-svn: 128873
rdar://problem/9229922 ARM disassembler discrepancy: erroneously accepting RFE
Also LDC/STC instructions are predicated while LDC2/STC2 instructions are not, fixed while
doing regression testings.
llvm-svn: 128859
The JITMemory manager references LLVM IR constructs directly, while the
runtime Dyld works at a lower level and can handle objects which may not
originate from LLVM IR. Introduce a new layer for the memory manager to
handle the interface between them. For the MCJIT, this layer will be almost
entirely simply a call-through w/ translation between the IR objects and
symbol names.
llvm-svn: 128851
When a virtual register has a single value that is defined as a copy of a
reserved register, permit that copy to be joined. These virtual register are
usually copies of the stack pointer:
%vreg75<def> = COPY %ESP; GR32:%vreg75
MOV32mr %vreg75, 1, %noreg, 0, %noreg, %vreg74<kill>
MOV32mi %vreg75, 1, %noreg, 8, %noreg, 0
MOV32mi %vreg75<kill>, 1, %noreg, 4, %noreg, 0
CALLpcrel32 ...
Coalescing these virtual registers early decreases register pressure.
Previously, they were coalesced by RALinScan::attemptTrivialCoalescing after
register allocation was completed.
The lower register pressure causes the mcinst-lowering-cmp0.ll test case to fail
because it depends on linear scan spilling a particular register.
I am deleting 2008-08-05-SpillerBug.ll because it is counting the number of
instructions emitted, and its revision history shows the 'correct' count being
edited many times.
llvm-svn: 128845
The code inserted by PPCTargetLowering::EmitInstrWithCustomInserter for ppc64 is
wrong, and I don't know how to fix it. It seems to be using the correct register
classes for pointers, but it inserts all 32-bit instructions.
llvm-svn: 128835
also fix the encoding of the later.
- Add a new encoding bit to describe the index mode used in AM3.
- Teach printAddrMode3Operand to check by the addressing mode which
index mode to print.
- Testcases.
llvm-svn: 128832
Define most shift masks incrementally to reduce the redundant
hard-coding. Introduce new shift for the VEX flags to replace the
magic constant 32 in various places.
llvm-svn: 128822
- Adds support for sniffing PE/COFF files on win32 (.exe and .dll)
which are COFF files that have an MS-DOS compatibility stub on
the front of them.
- Fixes a bug in the COFFObjectFile's support for the Microsoft COFF
extension for long symbol names, wherein it was attempting to parse
the leading '/' in an extended symbol name reference as part of the
integer offset.
- Fixes bugs in COFFObjectFile and ELFObjectFile wherein section
and symbol iterators were being returned with uninitialized bytes;
the type DataRefImpl is a union between 2 32-bit words (d.a and d.b)
and a single intptr_t word (p). Only p was being initialized, so in
32-bit builds the result would be iterators with random upper 32-bit
words in their DataRefImpls. This caused random failures when
seeking around in object files.
Patch by Graydon Hoare!
llvm-svn: 128799
after the given instruction; make sure to handle that case correctly.
(It's difficult to trigger; the included testcase involves a dead
block, but I don't think that's a requirement.)
While I'm here, get rid of the unnecessary warning about
SimplifyInstructionsInBlock, since it should work correctly as far as I know.
llvm-svn: 128782
It's possible to craft an input that hits the recursion limits in a way
that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits
can infer which bits are zero.
No test case as it depends on too many other things. Fixes PR9609.
llvm-svn: 128777
If someone first configure build with LLVM_ENABLE_FFI=1 and then turn it
off, the build will fail in lib/ExecutionEngine/Interpreter because
Interpreter will try still to #include <ffi/ffi.h>, but there are no
include_directories(${FFI_INCLUDE_DIR}) now.
This patch unset()'s HAVE_FFI_H and HAVE_FFI_FFI_H from cache file if
LLVM_ENABLE_FFI=0. This forces CMake to update config.h.
Patch by arrowdodger!
llvm-svn: 128769
When the greedy register allocator is splitting multiple global live ranges, it
tends to look at the same interference data many times. The InterferenceCache
class caches queries for unaltered LiveIntervalUnions.
llvm-svn: 128764
registers that arise from argument shuffling with the soft float ABI. These
instructions are particularly slow on Cortex A8. This fixes one half of
<rdar://problem/8674845>.
llvm-svn: 128759
transformations in target-specific DAG combines without causing DAGCombiner to
delete the same node twice. If you know of a better way to avoid this (see my
next patch for an example), please let me know.
llvm-svn: 128758
- Localize the check if an icmp has one use to a place where we know we're
introducing something that's likely more expensive than a sext from i1.
- Add an assert to make sure a case that would lead to a miscompilation is
folded away earlier.
- Fix a typo.
llvm-svn: 128744
with the contents of CMAKE_C(XX)_FLAGS too, else `llvm-config
--c(xx)flags' doesn't tell the absolute truth.
This comes from PR9603 and is based on a patch by Ryuta Suzuki!
llvm-svn: 128727
had gotten out of sync: isCastable didn't think it was possible to
cast the x86_mmx type to anything, while it did think it possible
to cast an i64 to x86_mmx.
llvm-svn: 128705
all LDR/STR changes and left them to a future patch. Passing all
checks now.
- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
fix the encoding wherever is possible.
- Add a new encoding bit to describe the index mode used and teach
printAddrMode2Operand to check by the addressing mode which index
mode to print.
- Testcases
llvm-svn: 128689
This way, shrinkToUses() will ignore the instruction that is about to be
deleted, and we avoid leaving invalid live ranges that SplitKit doesn't like.
Fix a misunderstanding in MachineVerifier about <def,undef> operands. The
<undef> flag is valid on def operands where it has the same meaning as <undef>
on a use operand. It only applies to sub-register defines which also read the
full register.
llvm-svn: 128642
- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
{STR,LDC}{2}_{PRE,POST} fixing the encoding wherever is possible.
- Move all instructions which use am2offset without a pattern to use
addrmode2.
- Add a new encoding bit to describe the index mode used and teach
printAddrMode2Operand to check by the addressing mode which index
mode to print.
- Testcases
llvm-svn: 128632
We don't expect the real "powf()" on some hosts (and powf() would be available on other hosts).
For consistency, std::pow(double,double) may be called instead.
Or, precision issue might attack us, to see unstable regalloc and stack coloring.
llvm-svn: 128629
The rematerialized instruction may require a more constrained register class
than the register being spilled. In the test case, the spilled register has been
inflated to the DPR register class, but we are rematerializing a load of the
ssub_0 sub-register which only exists for DPR_VFP2 registers.
The register class is reinflated after spilling, so the conservative choice is
only temporary.
llvm-svn: 128610
{STR,LDC}{2}_PRE.
- Fixed the encoding in some places.
- Some of those instructions were using am2offset and now use addrmode2.
Codegen isn't affected, instructions which use SelectAddrMode2Offset were not
touched.
- Teach printAddrMode2Operand to check by the addressing mode which index
mode to print.
- This is a work in progress, more work to come. The idea is to change places
which use am2offset to use addrmode2 instead, as to unify assembly parser.
- Add testcases for assembly parser
llvm-svn: 128585
that one of the numbers is signed while the other is unsigned. This could lead
to a wrong result when the signed was promoted to an unsigned int.
* Add the data layout line to the testcase so that it will test the appropriate
thing.
Patch by David Terei!
llvm-svn: 128577
StringMap was not properly updating NumTombstones after a clear or rehash.
This was not fatal until now because the table was growing faster than
NumTombstones could, but with the previous change of preventing infinite
growth of the table the invariant (NumItems + NumTombstones <= NumBuckets)
stopped being observed, causing infinite loops in certain situations.
Patch by José Fonseca!
llvm-svn: 128567
When the hash function uses object pointers all free entries eventually
become tombstones as they are used at least once, regardless of the size.
DenseMap cannot function with zero empty keys, so it double size to get
get ridof the tombstones.
However DenseMap never shrinks automatically unless it is cleared, so
the net result is that certain tables grow infinitely.
The solution is to make a fresh copy of the table without tombstones
instead of doubling size, by simply calling grow with the current size.
Patch by José Fonseca!
llvm-svn: 128564
The rewriter can keep track of multiple stack slots in the same register if they
happen to have the same value. When an instruction modifies a stack slot by
defining a register that is mapped to a stack slot, other stack slots in that
register are no longer valid.
This is a very rare problem, and I don't have a simple test case. I get the
impression that VirtRegRewriter knows it is about to be deleted, inventing a
last opaque problem.
<rdar://problem/9204040>
llvm-svn: 128562
The idea is, that if an ieee 754 float is divided by a power of two, we can
turn the division into a cheaper multiplication. This function sees if we can
get an exact multiplicative inverse for a divisor and returns it if possible.
This is the hard part of PR9587.
I tested many inputs against llvm-gcc's frotend implementation of this
optimization and didn't find any difference. However, floating point is the
land of weird edge cases, so any review would be appreciated.
llvm-svn: 128545
When DCE clones a live range because it separates into connected components,
make sure that the clones enter the same register allocator stage as the
register they were cloned from.
For instance, clones may be split even when they where created during spilling.
Other registers created during spilling are not candidates for splitting or even
(re-)spilling.
llvm-svn: 128524
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.
Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.
First part of rdar://8832507, rdar://9203134
llvm-svn: 128502
- Also emit a list of packages and groups sorted by name
- Avoid iterating over DenseSet so that the output of the arrays is deterministic.
llvm-svn: 128489
unnecesary conditionals and introduced a new convenience function.
The problem was that the list of libraries for Clang's unit tests was
<clang libraries> <system libraries> <llvm libraries>. As the llvm
libraries references symbols defined on the system libraries, those
were reported as undefined.
llvm-svn: 128484
The instruction to be rematerialized may not be the one defining the register
that is being spilled. The traceSiblingValue() function sees through sibling
copies to find the remat candidate.
llvm-svn: 128449
isel lowering to fold the zero-extend's and take advantage of no-stall
back to back vmul + vmla:
vmull q0, d4, d6
vmlal q0, d5, d6
is faster than
vaddl q0, d4, d5
vmovl q1, d6
vmul q0, q0, q1
This allows us to vmull + vmlal for:
f = vmull_u8( vget_high_u8(s), c);
f = vmlal_u8(f, vget_low_u8(s), c);
rdar://9197392
llvm-svn: 128444
otool(1), this time with the needed fix for case sensitive file systems :) .
This is a work in progress as the interface for producing symbolic operands is
not done. But a hacked prototype using information from the object file's
relocation entiries and replacing immediate operands with MCExpr's has been
shown to work with no changes to the instrucion printer. These APIs will be
moved into a dynamic library at some point.
llvm-svn: 128415
The reassignment phase was able to move interference with a higher spill weight,
but it didn't happen very often and it was fairly expensive.
The existing interference eviction picks up the slack.
llvm-svn: 128397
removes one use of X which helps it pass the many hasOneUse() checks.
In my analysis, this turns up very often where X = A >>exact B and that can't be
simplified unless X has one use (except by increasing the lifetime of A which is
generally a performance loss).
llvm-svn: 128373
The main register class may have been inflated by live range splitting, so that
register class is not necessarily valid for the snippet instructions.
Use the original register class for the stack slot interval.
llvm-svn: 128351
It couldn't be used outside of the file because SDISelAsmOperandInfo
is local to SelectionDAGBuilder.cpp. Making it a static function avoids
a weird linkage dance.
llvm-svn: 128342
There are two ways that a later store can comletely overlap a previous store:
1. They both start at the same offset, but the earlier store's size is <= the
later's size, or
2. The earlier store's offset is > the later's offset, but it's offset + size
doesn't extend past the later's offset + size.
llvm-svn: 128332
Correctly terminate the range of register DBG_VALUEs when the register is
clobbered or when the basic block ends.
The code is now ready to deal with variables that are sometimes in a register
and sometimes on the stack. We just need to teach emitDebugLoc to say 'stack
slot'.
llvm-svn: 128327
masks to match inversely for the code as is to work. For the example given
we actually want:
bfi r0, r2, #1, #1
not #0, however, given the way the pattern is written it's not possible
at the moment.
Fixes rdar://9177502
llvm-svn: 128320
This is a work in progress as the interface for producing symbolic operands is
not done. But a hacked prototype using information from the object file's
relocation entiries and replacing immediate operands with MCExpr's has been
shown to work with no changes to the instrucion printer. These APIs will be
moved into a dynamic library at some point.
llvm-svn: 128308
The .dot directives don't need labels, that is a leftover from when we created
line number info manually.
Instructions following a DBG_VALUE can share its label since the DBG_VALUE
doesn't produce any code.
llvm-svn: 128284